WPF, or how trees can be hard to climb

Introduction

This article is meant to deeply scare you, wannabe WPF developers. Seriously, consider these facts before you enter the WPF door. For those who dare read on, here’s the full story…

Since .Net 3.0, Windows Presentation Foundation (WPF) has been proposed by Microsoft as the framework of choice for developing user interfaces. The introduction of Windows 8.0 caused a bit of panic as developers felt that WPF and .Net were going to join the fate of Silverlight, but so far there have been no signs of this. If anything, it’s clear that some form of WPF will be the UI technology of the future, either in the shape of WPF itself or of its new XAML offspring. So my perception is that a developer working on Microsoft systems should be well aware of these technologies. But is it easy to jump on board? After using WPF for quite some time in different projects, I can say that unfortunately the answer is no: you are definitely not going to get a free ride this time. Why?

The language of trees: XAML

WPF uses XML files to describe the user interface. The main conceptual difference between coding a UI in C# and in XAML is that C# is an imperative language, whereas XAML is a declarative one, and thus it describes more of the “what” than the “how”. C# is a list of steps in chronological order; with XAML you describe the logical containment of user interface items and let the runtime build the objects in the right order and connect them properly. C# is a sequence of steps, XAML is a tree (as XML is).
From a conceptual perspective, this is all well and good; but there are dark sides to it.
The main trouble is that you have to learn this new XAML language. You were thinking you would be drawing shapes and panels to define your UI, and instead you find yourself browsing XML files. It would be great to define the visual appearance of an application by means of visual tools, but with WPF you will soon be looking at XML/XAML code.
What’s worse is that it’s not just another language with some different syntax, but a language with a different paradigm. Functional languages are considered by some as the ultimate weapon for producing bug free code in a breeze, but most developers find imperative approaches quite a bit easier to read and extend. I believe XAML is a little like a functional language.
For example, variables are a simple and well understood concept in imperative languages. With XAML, it’s not always obvious how to reference an element from another element. The situation gets surprisingly complex when you add name scopes and cross-assembly references to the mix.
You’ll often be tempted to switch back to C#, as in WPF you can always choose. But even simple extensions to XAML require a fairly large amount of boilerplate code. For example, converting values for display requires writing a small class (a converter); then you need to declare an instance of it in XAML, and then reference that instance in the appropriate places. All of this is a lot of burden when you might just be applying a simple formula to the input.
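To make the boilerplate concrete, here is a minimal sketch of such a converter (ScaleConverter and its scaling formula are invented for illustration); after writing this class you still have to declare an instance as a XAML resource and reference it in every binding that needs it:

using System;
using System.Globalization;
using System.Windows.Data;

// A minimal converter: scales a bound value for display.
public class ScaleConverter : IValueConverter {
    public object Convert(object value, Type targetType, object parameter, CultureInfo culture) {
        return (double)value * 100.0; // e.g. a 0..1 fraction shown as a percentage
    }

    public object ConvertBack(object value, Type targetType, object parameter, CultureInfo culture) {
        return (double)value / 100.0;
    }
}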
Another situation where the code-behind temptation will be strong is when dealing with animations. Animations seem like an imperative concept more than a declarative one (animations are sequences of steps after all), so it feels natural to subscribe to events and play or pause animations from handlers. Due to how animations affect properties, this is often not so straightforward: just check the number of questions on Stack Overflow about properties being stuck on values left behind by animations.
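To see why, here is a code-behind sketch (myControl stands in for any UIElement): after the animation completes it keeps holding the property, so plain assignments appear dead until the animation is removed:

using System;
using System.Windows;
using System.Windows.Media.Animation;

// Fade a control in from code-behind; the animated value "holds" afterwards.
var fade = new DoubleAnimation(0.0, 1.0, TimeSpan.FromMilliseconds(300));
myControl.BeginAnimation(UIElement.OpacityProperty, fade);

// Later: plain assignments seem to be ignored, because the animation still
// holds the property. Passing null removes it and frees the property again:
myControl.BeginAnimation(UIElement.OpacityProperty, null);
myControl.Opacity = 0.5;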
Conclusion #1: XAML is a new language with a different syntax and paradigm compared to C#, and moving code between them is not simple, so be prepared to learn, and to evaluate on a case by case basis what should end up in XAML and what in C#. To put it in other words, WPF with XAML is like climbing a tree with many branches, whereas C# developers are mostly used to linear code.

Concept overriding

As I suggest in another article, an object oriented library is like a little language extension in itself. When using an OOP library, no programmer is surprised to see new domain related classes and interfaces. New classes describe the new concepts, and interfaces state how the concepts link to each other and are used to build systems. What is surprising, and not in a good sense, is when a library introduces new classes for basic language concepts.
WPF does this pretty regularly, and the first example is properties. Fields and properties are a well known, basic OOP tool. WPF turns this all around with specific classes (dependency properties) to represent values in user interface objects. There are of course good reasons for all of this, but the fact that a basic language concept becomes much more complex to use is baffling. It’s especially baffling as the minds behind this architecture could have chosen a different approach, based either on language extensions or on relying more heavily on programming tools, as it’s all Microsoft products and technologies.
Notifications from non user interface classes to the UI are moreover handled in yet another way, by means of the INotifyPropertyChanged interface. Simple properties in WPF thus turn either into complex dependency properties or into INotifyPropertyChanged properties.
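To give a feeling for the gap, here is a minimal sketch of the two flavors (ZoomViewer and ZoomModel are invented names): the same conceptual value as a dependency property on a control, and as an INotifyPropertyChanged property on a data model:

using System.ComponentModel;
using System.Windows;
using System.Windows.Controls;

// Dependency property: the WPF way for UI-specific configuration on a control.
public class ZoomViewer : Control {
    public static readonly DependencyProperty ZoomProperty =
        DependencyProperty.Register("Zoom", typeof(double), typeof(ZoomViewer),
            new PropertyMetadata(1.0));

    public double Zoom {
        get { return (double)GetValue(ZoomProperty); }
        set { SetValue(ZoomProperty, value); }
    }
}

// INotifyPropertyChanged: the way for data model classes to signal changes to the UI.
public class ZoomModel : INotifyPropertyChanged {
    public event PropertyChangedEventHandler PropertyChanged;

    private double zoom = 1.0;
    public double Zoom {
        get { return zoom; }
        set {
            zoom = value;
            var handler = PropertyChanged;
            if (handler != null) handler(this, new PropertyChangedEventArgs("Zoom"));
        }
    }
}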
Conclusion #2: WPF as a library has a steep learning curve as it redefines even basic programming concepts. As far as properties are concerned, use dependency properties only for UI specific configuration properties, and use INotifyPropertyChanged properties for signaling data model changes to the UI.

Infrastructure rich vs. feature poor

WPF clearly provides a very rich infrastructure: the XAML declarative language for UI and many new concepts and classes to learn. It supports themes, animations, transparencies, data binding. More than a simple UI library, WPF is a complex, rich application development framework. On the other hand, considering what you get with WPF out of the box, the number of actual UI widgets and tools is rather limited. You might be surprised to see there’s no file open dialog or any other standard dialogs. There’s no chart control of any kind, no graph, no bar plot or the like. If you are looking for panels that automatically support touch based interaction with basic animations, you’ll find none. This lack of features is quite disarming at the beginning. You would expect to use additional UI libraries for advanced cases, but it feels like the standard WPF controls don’t even cover the basic ones, or they do so in such generic ways as to require a lot of work.

The ItemsControl and its derivatives are examples of this uber-generality: any list of things fits the ItemsControl, be it an image gallery, a data grid, etc. This is great in a way, but it also means that the generic control does not exactly provide parameters to tune interactions for each specific case; when you want to customize the ItemsControl, you have to learn about the templating side of WPF. It’s a new way of extending a control, based neither on the familiar inheritance from a basic control type nor on the composability of user controls. It’s not bad, it’s just yet another way of extending WPF.
Given the limited amount of controls, the other extension you might be interested in is tweaking the appearance or interaction of the standard controls. For example, I needed to specify a different thumb for a slider control that was controlling the brightness of an image. Personally, I would have liked and expected a property on the control with a name like “ThumbIcon” or similar. Unfortunately, there’s nothing like that. What you need to do to make such a simple tweak is to understand that WPF uses control templates to specify the visual appearance of any control (please note this is a different type of template, not the data template used in the ItemsControl). This template can be extended, modified, or even completely replaced. In order to change the slider thumb as in my example, I had to update the control template. First of all, such a control template is not immediately available in Visual Studio. After digging up this template, I was confronted with a rather huge piece of XAML code that accounts for both vertical and horizontal sliders. Moreover, templates of built in controls are based on another WPF concept, the “parts and states” model. At the end of this experience, I was left with the feeling that writing my own slider from scratch would have been a lot easier. If you are creating simple controls that are used in well defined scenarios, consider whether you should really extend a built in control or rather write a more specific one from scratch.
Conclusion #3: you will not get a whole lot of controls with WPF. Be prepared to evaluate WPF control libraries, and be prepared to write at least some of the controls yourself.

Non-linear-complexity decision paths

A last tricky point of WPF is that certain decisions might lead you into surprising dead ends. For example, you can create new controls either by composing controls (a user control), or by deriving from a basic control class (a custom control). It seems like either path should be possible, with the second one giving you more freedom to tweak the behavior of the control, whereas the first one is the easier route with more support from the visual editor. This might be true for a simple widget; if the control you are writing is a container control, the situation is different. Due to how WPF handles names, the difficult route is the one to take for containers, or otherwise you won’t be able to use the container with named content. The fact is that it’s not straightforward to turn a composed control into a custom control: in the first case you have XAML that specifies the concrete appearance of the control, in the second case you need to write a style, which is still XAML but with slightly different constraints. Turning a user control into a custom control is thus not just a matter of changing the base class name.
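For comparison, the custom control route starts from a skeleton like this minimal sketch (BadgeControl is an invented name); the appearance comes from a style looked up by type, not from XAML compiled into the class, which is exactly why converting a user control is more than renaming the base class:

using System.Windows;
using System.Windows.Controls;

// A bare custom control: its appearance comes from a ControlTemplate found in
// a theme/resource dictionary, not from a XAML file baked into the class.
public class BadgeControl : Control {
    static BadgeControl() {
        // Tell WPF to look up the default style by this type instead of Control.
        DefaultStyleKeyProperty.OverrideMetadata(typeof(BadgeControl),
            new FrameworkPropertyMetadata(typeof(BadgeControl)));
    }
}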

Conclusion #4: choosing the right starting point/class for using WPF might not be easy. Read the manuals before you start extending the wrong control.

Objects from the magic box

Objects are data, functions, behaviors, contracts, everything. If you came from the plain-old-C age, you would be familiar with a much simpler way of structuring your code: structures as records of data fields, and functions as collections of transformation steps that affect these data structures.
The procedural approach to programming is more strictly structured than OOP.

OOP was born out of procedural programming, as an extension. That extension, called classes, did not narrow down the possibilities or put additional constraints. It opened up a rather complex world of possibilities, by allowing a free, unrestricted mix of data and functions called an “object”. One common rant against OOP is that “OOP forces everything to be an object”. Joe Armstrong, designer of the functional language Erlang, expressed this rant very strongly. I think the truth is quite the opposite. It’s not that OOP forces everything into an object, it’s that an object can be everything in OOP, and as such it’s hard to say what the object construct is meant for. I would rather second objections along the lines of Jeff Atwood’s entry, in that OOP is not a free ride.
A class can be a data structure, and in this case the encapsulation traits are probably not extremely interesting and the class itself could well be a structure. A class can be a collection of methods only, without shared state. In this case it’s like a C module. A class can be a contract, when it contains virtual functions. A class can be the implementation of a contract with hidden and encapsulated state. A class can be many more things.
I think that one of the productivity issues with OOP, at least the C++ way (and all other derivatives) is that all these different use cases are syntactically represented in the same way, as a class. The class construct is totally devoid of any specialization, and as such it’s both extremely powerful and hard to handle. The software architect needs to specialize the class into a meaningful tool for the problem at hand. OOP in this sense is a meta-programming paradigm, which does require some thoughtful selection of language features and how these should be bent to the goals of product creation. This becomes even more evident if you look into all the “companion” language features of C++, like templates, multiple inheritance or friend classes. If you choose OOP, you have to define rules of how to use the language, much more so than in the procedural language case. Java and C# made some moderate attempts at specializing the class construct by adding the interface keyword. It might be interesting to see what an even more constrained OOP language could look like. A language with special syntax for data classes, behavior classes, user interface classes, and so on. A language that naturally leads the developer to choose a nearly optimal tool for the job. Any language designer out there? For the time being, architects are called to do this step in frameworks instead of languages.

So, if OOP requires so much planning and choosing of tools, why has it become so popular? In my mind, it’s because of two reasons. First, because flexible structuring allows software designers to create libraries and frameworks with the reuse patterns they have in mind and need. As Spiderman said, with great power comes great responsibility, and that’s what OOP gives and demands.

The second, maybe the most important reason, is that the object way of decomposing a problem is one of the most natural ways of handling complexity. When you plan your daily work activities, are you concerned about the innards of the car you are driving to reach the office? Do you need to know how combustion in the engine works? Do you need to check out the little transistors in your CPU to see that they are all switching correctly? Not me. I rely on those things working as expected. I don’t need to know the details of their internal state. I appreciate that somebody encapsulated and hid their variables and workings in convenient packages for me to consume. It’s like this with objects, and it’s like this with human organizations. We all regularly delegate important work to others and trust them, maybe after signing some contract, to provide us with the results we need. Delegation and work-by-contract is what defines human structures as well as OOP, which is why OOP is popular for large software architectures.

There’s maybe one last perspective. Object orientation might favour static structures over processes made of steps, or over state machines where state keeps changing. By hiding the changing state, OOP can give the impression of a perfect world of static relationships. The word “perfect” in fact comes from the Latin composition per-factum, that is: complete, finished, done. If it’s done it does not change anymore, and it’s thus static. Clearly a static structure is easier to observe than something which keeps changing, so the perfection of static structures is partly in the eye of the beholder, who can take the time to appreciate all the details. Science, for instance, is about capturing what changes in formulas that do not change and can thus be used for predictions. It’s not just an observer’s perspective: static and long lasting structures are more worthy of investigation than brief temporary situations.
To sum it up, the bias of OOP towards static structures is natural and useful in describing large architectures.

Atomic software patterns

Software engineering, really?

It’s typical of software engineers to feel a little apart in the company of engineers of other specialties. Why? Because engineering and software don’t really get along that nicely. Engineers who build a house, a bridge or even an electronic circuit have little margin of error. That makes the process a little more constrained than, say, the typical write-build-run cycle we are used to. Being more constrained requires self discipline and a lot of book reading, which usually makes you grow big spectacles and lose hair. Software engineers, on the other hand, have compilers and unit tests. That frees software engineers from discipline and introduces one single constraint, that of a comfortable chair. That does not have a proven influence on your hair, but surely a comfortable massage chair is more forgiving of a few extra calories in your diet. So if the standard picture of an engineer is a square, that of a software engineer will be more like a circle. If there’s a lot of discipline in bridge engineering, there’s a lot of make-break in software engineering. Finally, a bridge engineer could use their books as construction material. Honestly, how many of you developers could name a book you must keep by your side while you develop?

We are going to fix this empty spot by your side, now and forever. All the knowledge you need, the light you’ve been waiting for, is coming in the following lines. To be fair, there have been attempts in the past. One of the most notable is the arch-famous “Design Patterns” from the Gang of Four, the book of all software pattern books. I argue that the Gang of Four is not the theory, but rather a well organized set of common practices. Does the Gang of Four contain equations? Nope. It’s full of pictures, so it can’t be an engineering book. Is there one common formula, one law that rules them all? Nope, just some common solutions to common problems.
In order to extract the one law (well, one or two) we need to go back to the basic action of software design. In a previous post I stated that software engineering is the art of copying data. I will rectify: software programs are artistic ways of copying data. How do we make these software programs?

The GOF pictures contain a lot of UML boxes with many connections, but the simple beginning of it all is one problem we need to solve. Software engineering is the process of splitting that problem into parts to manage its complexity. The GOF pictures are not the basic rules, because they represent systems where the problem has already been split into many, many parts. This splitting has to start from somewhere, and that’s where we will find the grounding rule of software engineering.

Rule number 1: If a problem is too big, split it into two problems

By splitting you get two little problems instead of one. The immediate advantage is that the two little problems might be more tractable. If each of the two little problems by itself fits in your brain, you might be able to cook up a solution for each and combine them.

If not, you could buy coffee for a colleague who knows how to solve part A, and pizza for another colleague who’s an expert in dealing with part B.
The split gives you more tractable problems, and the ability to replace one part of the system (the dependent part) without touching the rest of the components.

Rule number 2: the way of the dependency

It seems that by applying rule number 1 we could design all possible software architectures. We split A into B and C, then C into D and E. But if all dependencies go the same way, what we have achieved is that the first component depends on the entire chain. Only the last one is a self standing entity. Rule number 1 is thus enough only as a beginning.

For a split to be successful, the dependency between the two parts needs to be one directional. That is, if you split A into B and C, it should be that B depends on C and not vice versa.

If you have a two-way dependency, you are specifying a chat between two parties rather than a software design. If B depends on C and vice versa, you cannot use either of the two entities independently; thus, looking from far away, those two entities could very well be one. By the same transitive logic, the dependency chain B->C->D from above is no better than two components.

Rule number 2 generates this super useful corollary: if you split A into B, C and D, make sure that there is no chain. In practice this means B and D both depend on C, which acts as the contract or interface between the two.

And that was it. The book of software engineering is composed of two atomic rules from which all patterns derive.

Side note: I have taken the admittedly very biased approach of making software design coincide with object oriented design. I’ll explain why in another post.

It all started with an Apple

Some of the biggest revolutions started with an apple. While men and women were dwelling peacefully, not needing much apart from each other, it took an apple to shake up the balance and start the evolution of mankind. A while later, it was again an apple that led Newton to conclude that stars and planets follow the same universal rules as we do on earth, thus opening up the possibility of understanding and exploring the universe.

After a few more years, it took another apple to rewrite the evolution of mankind once more. We were all living somewhat peacefully in the land of Java, if you were on the server islands, or of .Net/Windows, if you were a desktop application developer. We were all amazed that we could run the same applications on desktops as on laptops, and we even had friends who would read emails on their cathode-ray tube TVs connected to something they called a media center. From a developer perspective, there were a few uber-frameworks to deal with: Java on servers, .Net on clients, HTML with SQL backends on the web. Living in a world of few languages had its benefits. Microsoft was pouring a good deal of its resources into the .Net platform, which kept improving and extending its reach. Java was also maturing, and so were the development environments for it.
This was quite convenient for software designers, as the technology mattered a lot less than the concepts and the implementation. If you consider that C#/.Net was initially basically a clone of Java/JVM, you should concur that the picture looked so flat it sounded almost boring for developers.

All of a sudden, in 2007, the apple changed it all for good (in many ways), again. The iPhone showed that there were other options for running applications apart from desktop PCs or laptops. It showed that PCs can be thin, light, pocketable, look good, and work as phones as well as PCs. That was the revolution. In a matter of a few months, people got acquainted with the idea of using applications on tiny touch screens. Then the iPad came, and the poor laptop went to join the fate of the desktop as the tool for nerds. People who use the PC (I don’t mean software engineers) stopped caring about what operating system the device was running. They stopped caring whether the word-processing app could create multi-column layouts, because all they started writing were emails. Who takes care of formatting emails with pretty fonts? The idea of a general purpose PC, with very generic purpose applications like spreadsheets and super feature-loaded word processors, was being questioned very heavily. The phone and tablet became the preferred content consumption devices. We started using a myriad of very small, very focused applications in place of the big office suite guns. The “write once, run anywhere” line, which was already rusting to be honest, was forgotten and erased from all books. Now we write iOS applications with iOS tools, Android applications with a mixture of Eclipse and other tools, WinPhone applications with, uh, well, a myriad of tools, and Windows desktop applications with yet another set of tools. If you’re in the web space, you cannot even count the number of languages, scripts, and specific technologies for specific needs.

The balance has somewhat shifted from architecture and uber-frameworks to technology and small pragmatic solutions. If you like making software, it’s fun to choose the right tool for the job these days; there are quite a few options. To be honest, this mindset change also has to do with the economic crisis, which imposed faster turnaround times for projects. It’s often less structure, less organization, more action. Time for hacking!
So what happened to uber-frameworks? There’s surely less and less investment in those. Microsoft was supporting the largest ecosystem of languages, frameworks and tools with .Net. The efforts of the company are now clearly directed towards the mobile space, where it’s playing the catch-up game. .Net has not evolved much; the loose ends are now mostly handled by the community. Bits and pieces of the framework (XAML, some libraries) were carried over to native (or almost native) C++, which could be defined as a “lower level language” compared to .Net/Java. The new DirectX 12 seems to go for a lower level approach as well. In general, lower level is more trendy nowadays than it was just a few years ago. Is this turning us into coding machines without architectures? I don’t think so; possibly it’s the opposite. It’s the mini frameworks, home brewed, that can save the day by pulling together all the tools and giving a structure to all those different-minded pieces of software.

The mini framework topic is basically the focus of most posts here, when I am not digressing about software philosophy or history, as done in this case.

Everything is asynchronous

Asynchronous APIs

In one of the previous posts I stated that all processing that occurs as a result of a user interaction should be delegated to background processing, so that the user interface is always responsive and smooth. In order to keep it (even) simpler, one might say that all classes in an application which deal with data or business logic should only expose asynchronous methods. To be more specific, we can start by categorizing the classes and components in an application. Some classes are user controls, like buttons, comboboxes or other more complex collections of user-facing components. All the other classes are somehow related to the actual domain of the application, from data model classes to business logic components that deal with business processes. So, if we want to be strict, we can say that all user-interface classes which call into business classes should do so via an asynchronous call that delegates the work to another thread. In the other direction, from background workers to the user interface, UI frameworks typically require all calls to be directed to the thread owning the user interface components (there might be multiple), so our rule is already enforced. One of the issues with this approach is that it leads to too much/unwanted parallelism: when business objects start calling other business objects, every call turns into a new thread. The asynchronous calls should be enforced only when coming from a user interface component.

With thread-based APIs, this is difficult to achieve. Whenever you design a business object with a method that can potentially take a very long time, you delegate the work to a background thread. This is appropriate if the caller is the UI, but what if the caller is another business object? It might be a better choice to run the lengthy method in the same thread as the caller. The solution to this problem, as usual in software engineering, comes via a layer of abstraction: the thread is the low level way of doing parallel computation, and the task hides the details of the thread. You can start thousands of tasks, but the “runtime” (language library or virtual machine) will only execute a reasonable number of tasks in parallel, where reasonable depends on several factors including the number of real available CPU cores. Many languages provide some task based abstraction: C++11, C#, Javascript and Java as well (JDK8).

While tasks were becoming the trend in parallel programming, I was designing an API which would be used in a mostly asynchronous way. So I asked myself whether I should simply shape the API to return Tasks instead of plain simple results. Back then I chose to offer both task-based and non-task based (synchronous) APIs. That meant an API like this:

public class A {
    // Synchronous version: blocks the caller until the value is found.
    public int FindSomeValue(int someKey) {...}

    // Task-based version: returns immediately, the result arrives later.
    public Task<int> BeginFindSomeValue(int someKey) {...}
}

Normally you would not clutter an API with utility functions. If the user of the API can easily achieve the desired behavior with the existing API, don’t add anything. The smaller the API, the more understandable, the more usable, the more productive. So why would we want to expose both synchronous and asynchronous APIs? After all, it’s easy to turn a call into an asynchronous call in .Net:

int someValue = await Task.Run(() => FindSomeValue(someKey));
int someOtherValue = someValue + 1;

The previous lines do a lot of things: they start the FindSomeValue function in another thread (to simplify a little), return control to the caller, and set up an event so that when the result of the asynchronous call is available (the someValue result), the computation can continue and finally perform someValue+1. So, although not entirely trivial, it’s at least possible with little code to turn synchronous into asynchronous. Why did I put two versions in the API then? The reason is that I wanted to handle the scheduling myself. The BeginFindSomeValue would use a combination of resources that performed suboptimally when loaded with too many parallel workloads. .Net allows you to specify a custom scheduler, but asking a user of an API to go through the whole custom-scheduling way of spawning work would put too much burden on the user, and would ultimately mean exposing implementation details of the API. This is the most practical reason to expose both an asynchronous and a synchronous API: custom scheduling. Doing the scheduling internally allows the API implementor to choose how much parallelism to allow for optimal performance. For example, a database might have different scaling characteristics than a simple file storage on disk.

.Net schedulers essentially schedule work for the CPU to perform, but in modern computation architectures there’s much more than CPUs: GPUs, remote computing servers, remote data servers. The logic used to schedule tasks on CPUs does not necessarily work well for network bound or GPU bound operations. For example, loading the GPU with many more operations than available cores is rather normal, whereas the ratio of tasks to cores is much lower on CPUs due to the different architectures. The ratio on a network is different again: a gigabit link can “perform” many network calls per second, and in most circumstances will be limited by latency more than bandwidth. Combining CPU, GPU and network workloads thus requires some custom scheduling to achieve the best performance. In these scenarios, explicitly async APIs give the implementors the freedom to keep this advanced scheduling internal.
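As a rough sketch of what keeping the scheduling internal can look like (ValueService and the limit of four parallel workloads are invented for illustration), the asynchronous version below throttles the parallelism without the caller ever seeing how:

using System.Threading;
using System.Threading.Tasks;

public class ValueService {
    // Internal throttle: never run more than 4 of these workloads in parallel,
    // because the underlying resources degrade beyond that (hypothetical limit).
    private static readonly SemaphoreSlim slots = new SemaphoreSlim(4);

    public int FindSomeValue(int someKey) {
        return someKey * 2; // stand-in for the real, expensive lookup
    }

    public async Task<int> BeginFindSomeValue(int someKey) {
        await slots.WaitAsync();
        try {
            return await Task.Run(() => FindSomeValue(someKey));
        } finally {
            slots.Release();
        }
    }
}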

In all other cases, which version should we expose, synchronous or asynchronous? Unless you or some teammates find the Task API difficult to understand, the synchronous version should be used, as the asynchronous one can easily be realized by using the Task factory methods in combination with the synchronous API. Synchronous APIs are easier to read, and in case the parallelism is already achieved by other means (e.g. explicit thread creation), the asynchronous versions would be useless.
What about the ideal solution? If we have some knowledge about the types of tasks, maybe with a little help from the developer, such as an attribute, we could do better than simple CPU scheduling:

// Hypothetical attributes hinting at the dominant resource used by each method
[HeavyNetworkTraffic]
int FindSomeValue(int a, int b) {...}

[LengthyComputation]
int ComputeSomeValue(int a, int b) {...}

Now, let’s say the typical use case involves calling FindSomeValue, then calling ComputeSomeValue locally. This is in fact quite a realistic scenario, where data fetched remotely is processed locally before display. Let’s say the application submits many such operations: a FindSomeValue, followed by a ComputeSomeValue. If two ComputeSomeValue instances are scheduled simultaneously, the available CPU per instance is halved. If two FindSomeValue instances are scheduled in parallel, it might easily be a fine situation for a gigabit ethernet. So, ideally, a scheduler which knows what types of resources are used by each task would schedule one ComputeSomeValue task in parallel with a number of FindSomeValue tasks. This level of custom scheduling can be achieved via the .Net Task Parallel Library extension points (custom schedulers). Who knows, maybe in the future the compiler will even be able to turn synchronous calls into asynchronous ones automatically, perhaps by analyzing runtime behavior.

Until then, go for synchronous APIs unless you must control the scheduling yourself.

To reiterate: expose synchronous APIs unless you have advanced scheduling scenarios.

No more piles on our desks, hurray!

Computers and mobile devices are made to support busy people who do a lot of activities at the same time. To be honest, I have tried many times to chat while developing, but rarely managed to keep up both a polite conversation and an inspired coding attitude at the same time. In reality, both women (who are known to be more multitask-oriented) and men alike, when in front of a modern computer or mobile device, are most often attending true symphonies of concurrent software behaviors.

Panels and menus slide with buttery smooth animations in and out of view, glowing texts highlight stock exchange data that is grabbed from a service miles away, all the while pictures of your friends are being downloaded to the phone and hundreds of other big and small things are happening.
Even when you are not that busy, you expect at least those buttery smooth animations from your mega-core phone. The first reason for the need of concurrency is thus the fact that, even with static data, the user interface is expected to be very dynamic. If you move a window around and the other content stays fixed and gets hidden, users get annoyed. Honestly, who thinks that the metaphor of a busy desk is a good starting point for organizing a user interface? I know that I am writing this post on a PC with such an overlapping-windows interface, but I never have overlapping windows in practice. I would much prefer to have my two or three open documents resize when I drag this window around. Modern, phone-like user interfaces are like that. They do the work for you, resize this, lay out that, and your screen is always nice and tidy. No overlapping paperwork, no partially hidden stuff. I am amazed that the messy multi-layer desktop has turned into the big success it has been until now. I am very happy that the trend has shifted, but it’s not all roses for software engineers.
All this keeping-tidy work is expensive in terms of computing resources, because it needs to be fast and super smooth in order not to be distracting. Animations help the user when they are smooth, as animated UIs can more easily drive the attention to the right place, and they can show content without being too “abrupt”. All this real-time work means that even applications that are static, in the sense that their data content does not change (much) over time, show a lot of dynamic behaviors and thus require concurrent programming to be developed.

How to implement this?

Super smooth UIs require that all lengthy operations are delegated to background workers. The issue is that “lengthy” in this context is actually quite short and fast. An interface that hangs or hesitates for a few milliseconds is not perceived as smooth anymore. It might still be perfectly usable, but the feeling changes completely. So whatever the application does that is not showing a blistering fast UI must be delegated to the background. Reading a file from a fast drive? Probably fast enough for data retrieval, but surely not fast enough if this loading step interrupts a UI animation. Network communication? It’s unpredictable, never do it from the main application thread. Lengthy CPU processing meets the same fate of being delegated.
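As a minimal sketch of this delegation (the window, the control names and the fake loading method are all invented), an event handler can hand the slow part to a background thread and touch the controls only from the UI thread:

using System.Threading;
using System.Threading.Tasks;
using System.Windows;

public partial class MainWindow : Window {
    // Wired to a Button's Click in XAML; statusText is a TextBlock in the same XAML.
    private async void LoadButton_Click(object sender, RoutedEventArgs e) {
        statusText.Text = "Loading...";                          // UI thread: instant
        string data = await Task.Run(() => LoadDataFromDisk());  // background thread
        statusText.Text = data;                                  // back on the UI thread
    }

    private string LoadDataFromDisk() {
        Thread.Sleep(500); // stand-in for a read slow enough to stutter animations
        return "done";
    }
}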
In old single threaded applications, even graphical user interface ones, the main thread which dealt with the UI was also running the business logic. In modern applications, the main (UI) thread does nothing but gather input from the user, dispatch that input, and request user interface controls to repaint. The message is clear: modern application development requires good parallel programming attitudes. This is especially true because of another reason, apart from animations and smooth UIs. The average user does not have three documents overlapping one another on screen, and probably focuses on one or two applications at most, but still expects those applications to show all the content that is required without flipping through pages and pages of interface. Those two or three panels in the app should show all the content that is relevant. Maybe it’s not MDI (multiple document interfaces), but often large screens are expected to be filled with MDVI (multiple data views interfaces). Handling updates to the different views of data again requires a good dose of parallelism. I really like this trend of lean and mean for the user, pushing the organizational work to the side of the application. As to the actual technical choices of parallelization techniques, there are many articles, books and posts around. It’s a topic I really like, so I’ll probably post about it soon.

It all started with serialization

I am starting this series of posts about software design topics with one of the oldest problems in computer history: serialization.
My intention is primarily to discuss rather general purpose topics and technologies in software design. I want to compare the typical business requirements, the available technologies, and how close or far these technologies are to meeting those requirements and being good tools for the programmer’s job. The goal is one: through discussion and sharing, find out how we can use the tools and technologies at their best, or whether we need to evolve them. So, as a first exercise, I am practicing this pattern on this relatively simple topic.
The number one purpose of computers and servers has been, for a very long time, primarily the storage and retrieval of data. In order to build databases, you need at least a couple of things: searching and sorting algorithms, and storage of data. The glue that keeps this all together is the transformation between the storage format and the runtime format, each one optimized to suit different needs.

In general all large applications deal with large amounts of data. In the past, the database was a separate, unique application with the specific goal of storage and search. Modern large software structures include layers that often nest databases inside them. Layers are groupings of related components, separated from other layers by some form of data contract. Large applications can thus be seen, from a certain distance, as boxes (layers) that transform data between different formats: database records, web requests, business objects, UI form fields. Computer programs are usually not “creative” by themselves; computers are not required to be original, funny or anything like that. Keeping this bird’s eye view on large software structures makes a common pattern emerge: what goes in comes out. No randomness and no creativity means that the sum of the inputs is always equal to the sum of the outputs, unless your software is leaky. In my mind, this actually tells us that the number one concern of software design is exactly this copying of data between different formats, which is also known as serialization. So software is all about serialization; then on top of that you add some domain specialists and voila, you have a software product. Software is the art of copying data. Given that nothing is added or removed while copying, business rules would say that software is mostly a burden. It’s a bit like bureaucracy: a lot of forms to fill, a lot of travelling between different offices. The larger the application gets, the longer the corridors and the more time spent on paperwork. But as with bureaucracy, there are certainly reasons for this.

Why do we serialize?

The first, more easily justifiable reason for all this copying around lies in technology limitations. Large software ecosystems are developed with a plethora of different technologies.
Often data has to travel across different processes running on the same machine, or maybe jump around machines. When developers are lucky, they are tasked with the development of a product using one set of tools and one language, but most often software is composed of different parts that don’t like to talk to each other in plain simple English. Say, for example, making Javascript talk to C++ is not really like calling a function, right? Data coming out of databases is also famous for not being object-oriented friendly: much has been discussed about the “impedance” or “friction” between databases and OOP.
All these boundaries between machines, processes, different languages, databases and object orientation lead to the challenge of finding a way to ease the long journey of the bit from disk to screen. There’s always a man in the middle at each of these boundaries, acting as the interpreter. Between databases and object orientation you get the ORM guy (object relational mapper). Between processes and machines there’s RPC, and between C++ and Javascript there are web services. All these men in the middle use a specific description of the data, in a format which is easy to translate for each end of the communication.
Another instance of the cross machine boundary lies inside most modern devices, in the form of the boundary between CPU and GPU. Finally, for those of you who know about the environment in a large software company, there are boundaries between departments and between rooms. There are walls, firewalls and communication issues. All these are usually reflected in software architectures via the over-layering syndrome and the need for massive serialization infrastructures. Large bureaucratic companies tend to produce serialization heavy code, I would say.

The second reason for serialization stems from the complexity of large scale software. I will not discuss the paradoxical aspect of this. Software gets large either because of feature bloat, and that’s understandable as the customer gets more features, or for endogenous reasons. That is, as a living being, software develops structures to sustain more complexity, and those structures in turn need more scaffolding, and in the end you’ve built bureaucracy into the software. But going back to the second valid reason for serialization, or transformation of data: different layers in the application focus on different concerns and thus require different data formats to be built and/or to operate efficiently.
The business logic in an application requires domain specific models; the storage layer mostly requires fields and storage keys to retrieve data; the UI needs views (viewmodels in recent jargon) that collect data in a sexy way to be shown on screen. All these transformations are serializations, and the deeper you travel down the layer structures, the more the serialized models look opaque, obscure and inscrutable, that is, closer to the old binary way of doing serialization.

Other reasons include versioning and immutability. Even when monolithic applications store data, even without crossing thick layered architectures, it’s most often required that data saved with version X will be usable also on version Y. So even sitting in the quiet and safe pond of monolithic desktop applications, your data will have to travel time and space, and thus will not escape the fate of all data: serialization and deserialization. Versioning is definitely a reason to design specific data models. And finally there is immutability. Data often changes during the application lifetime. Taking a snapshot of data allows processing that crystallized part of data without consistency concerns. Immutable data is very good for parallel processing, as no access control is required across threads.

There are thus good reasons to serialize, or we could say good reasons to write software, which is the same. But as with any engineering feat, the devil lies in the details. What are the devilish details and concerns in this case?

Serialization concerns

Whose job is serialization?

We realized we need to serialize, because we have different data formats. But who’s going to take the pain? If A speaks a different language than B, should A or B go to school and learn a new language/format, or etiquette of communication? If you put the question in human terms, the answer becomes immediately apparent. If I need to ask something of my boss, should I expect him/her to understand my nerdish technical blabbering, or should I rather explain in human, boss-readable terms? If one of the parties is clearly leading, the language and format of choice is not a choice at all, and the other party adapts and does the translation/serialization. Between busy and creative peers as we are, it is instead convenient to find a common language. Since we are peers, we’ll both take the time to learn and do our own part of the serialization. Finally, if I need to communicate with a client on the other side of the language spectrum, I’ll probably ask for help either from Google or from a human intermediary. The same stories hold for humans and software components.

Roundtripping

This is a sneaky issue in serialization. To keep sailing the human relationships metaphor, it’s like me communicating with a native of some paradise island who has never seen bad weather. My message might be foggy, rainy and contain a lot of slippery questions. I might still be able to get the core of the message through, but many details will be lost. Let’s assume I have chosen this native as an intermediary to talk to an English speaker in the same time zone as the native. My foggy and rainy message will get translated to a shiny sunny story. When this story reaches the intended end of the communication, that is the native English speaker, the message will be very different from my original English text. Now, the English speaker will answer my message and attach it for reference to the communication thread. When the communication thread gets back to me, I will not even be able to understand what the source was. This is the roundtrip issue: when serialization formats are not perfectly compatible, some data might be lost in translation. As a more software specific example, consider this: I am sending more data to the database than it expects, for example there are more fields in the records I am trying to store. Since I took precautions server-side, the program will not choke and will instead happily accept the data but discard the extra fields. Ignorance of the extra fields is excusable, as the database is an older version, so the client should be tolerant enough and accept that it will not be able to query the new fields. But what is not acceptable is the loss of data. Even if the server does not know how to handle the extra unknown data, it should be able to return that extra data untouched.
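.Net’s data contracts have support aimed at exactly this requirement; here is a minimal sketch (CustomerRecord is an invented class). Fields the deserializer does not recognize are parked in ExtensionData and written back out on reserialization, so the round trip does not lose them:

using System.Runtime.Serialization;

// Unknown fields encountered during deserialization are stored in ExtensionData
// and emitted again when the object is serialized, keeping the round trip lossless.
[DataContract]
public class CustomerRecord : IExtensibleDataObject {
    [DataMember]
    public string Name { get; set; }

    // Catch-all for round-tripping fields this version knows nothing about.
    public ExtensionDataObject ExtensionData { get; set; }
}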

Polymorphism/extensibility

The roundtrip concern is a special case of extensibility. More generally, serialization meets the challenge of extensibility when the data formats change but one of the two parties is slow to adapt, or cannot adapt, to the new format. As with the roundtrip case, the main requirement is not to lose any piece of data. In case the extension comes in the form of object oriented inheritance, a nice-to-have feature is also automated support for adding derived classes to the serialization process. Say for example two components are communicating via a serialized version of the Fruit class. The Banana and Apple classes derive from Fruit, and since their support was envisioned right from the start, full details about these fruits can be communicated around. If I add Papaya, I would like the system to handle this fruit as well, since it’s just another fruit, maybe with an extra property or two, but hey, nothing revolutionary in business terms. At the very minimum, if I store and retrieve a Papaya, I should be able to get back all its data. Maybe I won’t be able to search all Papayas in the system, but at least I know that Papayas are safe in the storage room. A good system might also give me the chance of at least treating the Papaya as a fruit. That is, at least I should be able to use generic fruit processing methods after serializing/deserializing. Ideally, object oriented languages should handle this case in very elegant fashion. Object orientation was born as a better way of handling data, so it should also offer nice and flexible object serialization support. This is not always the case in OOP languages and frameworks though.
Now that we have laid out the requirements, the motivations and the potential pitfalls of serialization, let’s see how languages tackle the problem and whether we need to do anything on top of the native support. From a technical standpoint I will be referring mostly to C#, but conceptually the story is the same in different languages.

Solutions

Fully opaque serialization

Serialization is fully opaque when it’s like obscure bureaucracy, that is, when it’s impossible to understand what is serialized. It’s like me scribbling some sticky notes to keep track of software bugs, as opposed to writing down human readable stories in Bugzilla. For me, the advantage of my obscure stickies is that they are very efficient. Of course, I am the only one who can understand them. Going back to software terms:
PRO: each class can write/read its internals without exposing them to anybody else.
CON: it ties serialization into the data model implementation. If there is one single serialization format in the application, it might be fine. Often, though, serialization formats are multiple: database records vs. JSON for web services vs. some internal file format for disk storage. Since the data models are the contracts between application layers, they should not carry too many technology dependencies, and thus opaque serialization is not practical in cases of growing complexity. On the other hand, if the serialized format does not cross application layers, it might be perfectly fine to serialize using e.g. binary formats. So, here’s an advantage of monolithic applications: the use of opaque serialization simplifies structuring code.
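A tiny sketch of the opaque style (CacheEntry is an invented class): the class owns its wire format, and nothing outside it can make sense of the bytes:

using System.IO;

// Fully opaque: the class reads/writes its own internals to a binary stream.
public class CacheEntry {
    private int id;
    private string payload;

    public void Write(BinaryWriter writer) {
        writer.Write(id);
        writer.Write(payload);
    }

    public void Read(BinaryReader reader) {
        id = reader.ReadInt32();
        payload = reader.ReadString();
    }
}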

Aspect based programming

It is argued that serialization is a cross-cutting concern, and thus is better tackled by aspect oriented programming. Serialization done the aspect oriented way uses attributes. The .Net framework exposes the DataContractSerializer API to do serialization the aspect way. In order to enable a class for serialization using DataContractSerializer, decorations are required on top of the plain data models.
A class with a single property will change from this:

class A {
    public int SomeProperty { get; set; }
}

to this, when decorated for serialization:

[DataContract]
class A {
    [DataMember]
    public int SomeProperty { get; set; }
}

PRO: serialization by attributes does not require explicit serialization and deserialization/parsing code to be written. The framework can do most of the work for us. Given that attributes can be inspected via reflection, the framework or our code can extract the “contract” of the data model we are serializing. Standard data contracts, e.g. SOAP ones, can be used to reconstruct the shape of the data in a format native to each language that supports that type of protocol.
CON: it pollutes the data model with serialization information. The real problem is that complex data models might not be very well suited for serialization. E.g. recursive references, or references to large context objects from within child models, might be convenient for runtime usage; when serializing, though, you don’t want to re-serialize large objects referenced from multiple places. In case of complex models/large data, you might end up creating a set of alternative data models just for serialization purposes, so that your runtime models can still provide the level of usability you expect. That’s where “smart” support for serialization from the framework side really ends up serving little purpose. If I write a model just to be serialized, it had better be entirely serialized without any attribute (opt-out none).

Serialization data model

Instead of using the opaque or aspect-oriented serialization support provided by the language, we could opt for creating our own model for serialization. We would need to design the following characteristics into it:
– it must be agnostic of source and destination formats. This model will always be the man in the middle.
– it must be possible to store and retrieve the serialized format using different storage services, like databases, xml files or web services.
PRO: this model covers only the serialization/persistence use cases. No eventing is required, no tracking of changes, as the model is always just a snapshot. Most languages refer to such simple objects as “plain old language objects”: POJOs in Java, POCOs in the C family, etc. The basic implementation is in fact a property bag, or lookup/dictionary of key-attribute pairs. By carrying explicit information, the serialization data model allows for easy further transformation to other serialization formats. The first example is the transition to the relational database world. A field name/attribute value collection is basically the description of what a SQL record is, and the role of an ORM (object-relational mapper) such as (N)Hibernate is just to ease this transition between the relational and the object oriented world. Data access layers such as Microsoft’s ADO employ generic property bag objects (records). These serialization models live in the object-oriented world, but in fact look a lot more like old C structures than C++ classes, as they do not encapsulate logic. Having one single well defined scope is usually a good quality of an object oriented class, though it essentially means that encapsulation of logic and data needs to be violated here. In general I think that object oriented languages require patterns, practices and rules that go well beyond their syntax to be used effectively. It almost feels like object orientation is a way of defining meta-languages which need to be further specified for practical usage. This is one such area: a class can be data, logic, or a combination of both, without any lexical or syntactical difference but with vastly different usage patterns.
CON: the dreaded property bag pattern might lead to a set of problems of its own. Keys need to be maintained, and the mapping of fields to databases becomes an area of concern. If persistence, transfer of data, querying and sorting data are a primary concern of your application, you are probably already aware that a data layer is where all such generic data-centric behavior will or does reside. A reasonable way of splitting away the data layer of such a structured application is to have data-specific objects on the boundaries of the layers. If data management is a secondary concern in your domain, choosing the “data object approach” will not be easy to justify from a business use case perspective. I have already argued that most interesting applications will have to deal with a lot of data, but before your users are confronted with a lot of data, your app needs to become a successful product generating revenues. On the other hand, adding a data layer and structures after an application (or maybe even a family of applications) has become a success is more difficult than baking a data mindset into the architecture right from the start.

Choices, choices

Choice 1, trying to be opaque, or trying to write as little code as possible

If your only need is true temporary serialization/deserialization, use the simplest serialization tools provided by the language. In the C# case, I believe this boils down to the choice between XmlSerializer and DataContractSerializer, which will easily serialize your objects to an XML string. The first one does not mandate additional attributes such as [DataMember], and will just serialize everything. The second one requires quite a bit of attribute decoration. Both these tools choke on extensibility though. If you need to serialize derived classes you will need to use special extensions to the serialization mechanism; in essence, you need to specify all the types that the serializer can expect. If you go along this route, watch out for the extensibility bumps.
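Borrowing the Fruit example from the extensibility section, this is roughly what those special extensions look like with DataContractSerializer (class bodies elided): every derived type has to be announced up front, which is exactly where a late-coming Papaya falls off the wagon:

[DataContract]
[KnownType(typeof(Banana))]
[KnownType(typeof(Apple))]
// Papaya is unknown here: serializing a Papaya through a Fruit reference will
// fail until a [KnownType(typeof(Papaya))] line is added to the base class.
public class Fruit {...}

[DataContract]
public class Banana : Fruit {...}

[DataContract]
public class Apple : Fruit {...}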

Choice 2, create your own serialization model

With this option you create two classes: one for your application data model and one for your serialization data model. The serialization-specific data model is typically not used outside data persistence code; the application uses a more meaningful, application-specific representation of the data. This approach is definitely not a DRY one. A data-centric application will simply take the pain and organize itself around the needs of structured data access. If you can’t afford not to be DRY, there is a way to avoid repeating yourself over and over again: derive the application data models from the serialization models and use some language tricks to ease access to the serialized properties. Purely as an example, I am showing here an implementation of this approach in C#. What follows is a simplified version of a small configuration framework I wrote. The framework starts from the requirement of generic access to the configuration properties of application modules, easy browsing and sorting, and centralized handling of copies. These common data-centric requirements support the creation of a generic configuration object, while individual application modules define their own specific properties in their configuration data.

The implementation

Everything here is based on a generic, easily serializable property bag. What gets serialized in the end is just a dictionary mapping string keys to object values. Since property-bag style programming is not nice, the base property bag class takes care of converting standard class properties to and from the property bag entries. The two methods that do the trick are OnSerializing, which reads the properties of the class via reflection, and InitFrom, a sort of copy constructor that fills in the actual property values from the property bag, again via reflection.


using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization;

[DataContract]
public class PropertyBag {
    /// <summary>The actual property bag, filled at serialization time</summary>
    [DataMember(Name = "Properties")]
    private Dictionary<string, object> serializedProperties =
        new Dictionary<string, object>();

    public Dictionary<string, object> SerializedProperties {
        get { return serializedProperties; }
    }

    /// <summary>Copy values from the serialized property bag of another
    /// instance into the CLR properties of the current object</summary>
    public void InitFrom(PropertyBag other) {
        var publicProperties = GetType().GetProperties(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.FlattenHierarchy
        ).ToDictionary(propInfo => propInfo.Name);
        foreach (var nameValuePair in other.serializedProperties) {
            PropertyInfo clrProp = null;
            if (publicProperties.TryGetValue(nameValuePair.Key, out clrProp)) {
                clrProp.SetValue(this, nameValuePair.Value);
            }
        }
    }

    /// <summary>Just before serialization, snapshot all public CLR
    /// properties into the serializable dictionary</summary>
    [OnSerializing]
    internal void OnSerializing(StreamingContext context) {
        var publicProperties = GetType().GetProperties(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.FlattenHierarchy
        );
        foreach (var property in publicProperties) {
            if (property.Name == "SerializedProperties") continue; // skip the bag itself
            serializedProperties[property.Name] = property.GetValue(this);
        }
    }
}

The property bag can be serialized to XML, JSON or any other format very easily: it is not opaque, yet it has a very simple shape. Because of this quality, I have kept the actual serialization code outside the class and implemented a PropertyBagSerializer. This way one can easily swap the XML DataContract serialization for a more compact format such as JSON. The serializer also tackles another tricky aspect of serialization: extensibility across inheritance. All classes derived from PropertyBag are serialized as PropertyBag, while still preserving all the data stored in their properties. In the .NET APIs, this is achieved by means of a DataContractResolver that maps all derived types to the base PropertyBag type. Please note that it would also be possible to reconstruct the proper type in this custom serializer, by means of an additional property in the serialized property bag and by using InitFrom. Left as an exercise… (one possible sketch follows the code below).

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

public class PropertyBagSerializer {
    private DataContractSerializer serializer;

    public string Serialize(PropertyBag bag) {
        using (var memStream = new MemoryStream()) {
            // WriteObject emits UTF-8 XML, so decode the buffer accordingly
            Serializer.WriteObject(memStream, bag);
            return Encoding.UTF8.GetString(memStream.GetBuffer(), 0, (int)memStream.Length);
        }
    }

    public PropertyBag Deserialize(string text) {
        using (var memStream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (XmlReader reader = XmlReader.Create(memStream)) {
            return Serializer.ReadObject(reader) as PropertyBag;
        }
    }

    /// <summary>Maps every PropertyBag-derived type to the base PropertyBag
    /// contract, so derived classes serialize and deserialize as plain bags</summary>
    private class DeserializeAsBaseResolver : DataContractResolver {
        public override bool TryResolveType(Type type, Type declaredType, DataContractResolver knownTypeResolver, out XmlDictionaryString typeName, out XmlDictionaryString typeNamespace) {
            bool result = true;
            if (typeof(PropertyBag).IsAssignableFrom(type)) {
                XmlDictionary dictionary = new XmlDictionary();
                typeName = dictionary.Add(typeof(PropertyBag).Name);
                typeNamespace = dictionary.Add(typeof(PropertyBag).Namespace);
            } else {
                result = knownTypeResolver.TryResolveType(type, declaredType, null, out typeName, out typeNamespace);
            }
            return result;
        }

        public override Type ResolveName(string typeName, string typeNamespace, Type declaredType, DataContractResolver knownTypeResolver) {
            // Fall back to the declared type (PropertyBag) when the name is unknown
            return knownTypeResolver.ResolveName(typeName, typeNamespace, declaredType, null) ?? declaredType;
        }
    }

    private DataContractSerializer Serializer {
        get {
            if (serializer == null) {
                serializer = new DataContractSerializer(
                    typeof(PropertyBag), null, Int32.MaxValue, true, false, null,
                    new DeserializeAsBaseResolver()
                );
            }
            return serializer;
        }
    }
}
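
For completeness, here is one possible shape of that exercise, purely as a sketch: it assumes the serializing side stores the concrete CLR type name in the bag under a reserved "__Type" key (which the code above does not do), for instance from OnSerializing.

using System;

// Sketch only: reconstruct the concrete type from a reserved "__Type" entry.
public static class PropertyBagTypeHelper {
    public static PropertyBag Reconstruct(PropertyBag bag) {
        object typeName;
        if (bag.SerializedProperties.TryGetValue("__Type", out typeName)) {
            var concrete = Type.GetType((string)typeName);
            if (concrete != null && typeof(PropertyBag).IsAssignableFrom(concrete)) {
                var typed = (PropertyBag)Activator.CreateInstance(concrete);
                typed.InitFrom(bag);   // copy values back into CLR properties
                return typed;
            }
        }
        return bag;  // unknown type: keep the generic bag
    }
}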

Finally, here’s a sample data model that derives from PropertyBag, and code snippets that show how to use the serializer/deserializer.

public class SimpleData : PropertyBag {
    public string Name { get; set; }
    public int Age { get; set; }
}

/// <summary>
/// Interaction logic for MainWindow.xaml
/// </summary>
public partial class MainWindow : Window {
    public MainWindow() {
        InitializeComponent();
    }

    private void OnSerializeClick(object sender, RoutedEventArgs e) {
        SimpleData data = new SimpleData { Name = SourceName.Text, Age = int.Parse(SourceAge.Text) };
        PropertyBagSerializer serializer = new PropertyBagSerializer();
        Serialized.Text = serializer.Serialize(data);
    }

    private void OnDeserializeClick(object sender, RoutedEventArgs e) {
        PropertyBagSerializer serializer = new PropertyBagSerializer();
        var data = serializer.Deserialize(Serialized.Text);

        // generic property access
        TargetName.Text = data.SerializedProperties["Name"].ToString();
        TargetAge.Text = data.SerializedProperties["Age"].ToString();

        // recreate actual type
        SimpleData simpleData = new SimpleData();
        simpleData.InitFrom(data);
        TargetName.Text = simpleData.Name;
        TargetAge.Text = simpleData.Age.ToString();
    }
}


Conclusion

Even though serialization is one of the oldest problems in computer science, the solutions provided by modern languages are still evolving. There is no single, ideal, catch-all solution, precisely because software is all about different data formats and transformations. This article explained some of the reasons for this complexity and some of the challenges to watch out for, and proposed one possible solution as a concrete implementation of one set of serialization requirements. The solution addresses flexibility, extensibility and genericity: by paying the performance penalty of reflection, we get much simpler code that does not need to worry about serialization at all. To really conclude, although framework support (and this holds true for C++ even more than .NET) does not completely remove the friction of serialization, by adding some APIs and some structure ourselves we can get close to an ideal solution in many circumstances. In the next posts, we’ll keep tackling problems only partially solved by frameworks, building our own APIs and tools to close the circle.