Everything is asynchronous

Asynchronous APIs

In one of the previous posts I stated that all processing that occurs as a result of a user interaction should be delegated to background processing so that the user interface is always responsive and smooth. In ordfuturismoer to keep it (even) simpler, one might say that all classes in an application which deal with data or business logic should only expose asynchronous methods. To be more specific, we can start by categorizing classes and components in an application. Some classes are user controls, like buttons, comboboxes or other more complex collections of user-facing components. All the other classes are somehow related with the actual domain of the application, from data model classes to business logic components that deal with business processes. So, if we want to be strict, we can say that all user-interface classes which call into business classes should do so via an asynchronous call that delegates the work to another thread. In the other direction, from background workers to user interface, the UI frameworks typically require all calls to be directed to the thread owning the user interface components (there might be multiple), so our rule is already enforced. One of the issues with this approach is that it leads to too much/unwanted parallelism. When business objects start calling other business objects, every call turns into a new thread. The asynchronous calls should be enforced only when coming from a user interface component.

With thread-based APIs, this is difficult to achieve. Whenever you design a business object with a method A that can potentially take a very long time, you delegate the work to a background thread. This is appropriate if the caller is the UI, but what if the caller is another business oject? It might be a better choice to run the lengthy method in the same thread as the caller. The solution to this problem, as usual in software engineering, comes via a layer of abstraction. The thread is the low level way of doign parallel computation, the task hides the details of the thread. You can start thousands of tasks, but the “runtime” (language library or virtual machine) will only execute a reasonable number of tasks in parallel, where reasonable depends on several factors including the number of real available cpu cores. Many languages provide some task based abstraction: C++ 11, C#, Javascript and Java as well (JDK8).

While tasks were becoming the trend in parallel programming, I was designing an API which would be used in a mostly asynchronous way. So I asked myself whether I should simply shape the API to return Tasks insted of plain simple results. Back then I chose to offer both task-based as well as non-task based (synchronous) APIs. That meant an API like this:

public class A {
int FindSomeValue(int someKey);
Task BeginFindSomeValue(int someKey);

Normally you would not clutter an API with utility functions. If the user of the API can easily achieve the desired behavior with the existing API, don’t add anything. The smaller the API, the more understandable, the more usable, the more productive. So why would we want to expose both synchronous and asynchronous APIs? After all, it’s easy to turn a call into an asynchronous call in .net:

int someValue = await Task<int>.Run(()=>FindSomeValue(someKey));
int someOtherValue = someValue+1;

The previous lines do a lot of things: start the FindSomeValue function in another thread (to simplify a little), return control to the caller and set up an event so that when the result of the asynchronous call is available (the someValue result), it can continue the computation and finally perform someValue+1. So, although not entirely trivial, it’s at least possible with little code to turn synchronous into asynchronous. Why did I put two versions in the API then? The reason is that I wanted to handle the scheduling myself. The BeginFindSomeValue would use a combination of resources that performed suboptimally when loaded with too many parallel workloads. .Net would allow to specify a custom¬†scheduler, but asking a user of an API to chew all the custom scheduling way of calling threads would be too much work put on the user, and ultimately would mean exposing implementation details of the API. This is the most practical reason to expose both an asynchronous and a synchronous API: custom scheduling. Doing the scheduling internally allows the API implementor to choose how much parallelism to allow for optimal performance. For example, a database might have different scaling characteristics than a simple file storage on disk. .Net schedulers essentially schedule work for the CPU to perform, but in modern computation architectures there’s much more than CPUs: GPUs, remote computing servers, remote data servers. The logic used to schedule tasks on CPUs does not necessarily work well on network bound operations or GPU bound operations. For example, loading the GPU with many more operations than available cores is rather normal. The ratio tasks/cores is much lower on CPUs due to the different architectures. The ratio on a network is again different. A gigabit link can “perform” many network calls per second, and in most circumstances will be limited by latency more than bandwidth. Combining CPU, GPU and network workloads thus require some custom scheduling to achieve the best peformance. In these scenarios, explicitly async APIs give the implementors the freedom to keep this advanced scheduling internal.

In all other cases, which version should we expose, synchronous or async? Unless you or some team mates find the Task api difficult to understand, the synchronous version should be used, as the other one can easily be realized by using the Task factory methods in combination with synchronous APIs. Synchronous APIs are easier to read, and in case the parallelism is already achieved by other means (e.g. explicit thread creation), the asynchronous versions would be useless.
What about the ideal solution? If we have some knowledge about the types of tasks, maybe with a little help from the developer, such as an attribute, we could do better than simple CPU scheduling:

int FindSomeValue(int a, int b) {...}

int ComputeSomeValue(int a, int b) {...}

Now, let’s say the typical use case involves calling the FindSomeValue, then call the ComputeSomeValue locally. This is in fact quite a realistic scenario where data fetched remotely is processed locally before display. Let’s say the application submits many such operations of the first kind, FindSomeValue, followed by ComputeSomeValue. If two ComputeSomeValue instances are scheduled simultaneously, the available CPU per instance is halved. If two FindSomeValue instances are scheduled in parallel, it might easily be a fine situation for a gigabit ethernet. So, ideally, a scheduler which knows what types of resources are used by each task would schedule one ComputeSomeValue task in parallel with a number of FindSomeValue tasks. This level of custom scheduling can be achieved via the .Net Task Parallel Library extension points (custom schedulers). Who knows, maybe in the future the compiler will even be able to turn synchronous calls into asynchrnous automatically. This could be possible by analyzing runtime behavior.

Until then, go for synchronous APIs unless you must control the scheduling yourself.

To reiterate: expose synchronous APIs unless you have advanced scheduling scenarios.


No more piles on our desks, hurray!

Computers and mobile devices are made to support busy people who do a lot of activities at the same time. To be honest, I tried so many times to chat while developing but rarely succeeded to have both a polite conversation and an inspired coding attitude all at the same time. In reality both women (who areImage known to be more multitask-oriented) and men alike, when in front of a modern computer or mobile device, are most often attending true symphonies of concurrent software behaviors.

Panels and menus slide with buttery smooth animations in and out of view, glowing texts highlight stock exchange data that is grabbed from a service miles away, all the while pictures of your friends are being downloaded to the phone and hundreds other big and small things are happening.
Even when you are not that busy, you expect at least those buttery smooth animations from your mega-core phone. The first reason for the need of concurrency is thus the fact that, even with static data, user interface is expected to be very dynamic. If you move a window around and the other content stays fixed and gets hidden, users will get annoyed. Honestly, who thinks that the metaphor of a busy desk is a good starting point for organizing a user interface? I know that I am writing this post on a pc with such overlapping windows interface, but I never have overlapping windows in practice. I would much prefer to have my two-three open documents resize when I drag this window around. Modern, phone-like user interfaces are like that. They do the work for you, resize this, layout that, and your screen is always nice and tidy. No overlapping paperwork, no partially hidden stuff. I am amazed that the messy multi-layer desktop has turned into the big success it has been until now. I am very happy that the trend has shifted, but it’s not all roses for software engineers.
All this keeping tidy work is expensive in terms of computing resources, because it needs to be fast and super smooth not be distracting. Animations help the user when they are smooth, as animated UIs can more easily drive the attention to the right place, and they can show content without being too “abrupt”. All this real-time work means that even applications that are static, in the sense that their data content does not change (much) over time, show a lot of dynamic behaviors and thus require concurrent programming to be developed.

How to implement this?

Super smooth UIs require that all lengthy operations are delegated to background workers. The issue is that lengthy in this context is actually quite short and fast. An interface that hangs or hesitates for a few milliseconds is not perceived as smooth anymore. It might still be perfectly usable, but the feeling changes completely.¬† So whatever the application does which is not showing a blistering fast UI must be delegated to the background. Reading a file from a fast drive? probably fast enough for data retrieval, but surely not fast enough if this loading step interrupts a ui animation. Network communication? It’s unpredictable, never do it from the main application thread. Lengthy cpu processing follows the same faith of being delegated.
In old single threaded applications, even graphical user interface ones, the main thread which dealt with the UI was also running the business logic. In modern applications, the main (UI) thread does nothign but gathering input from the user, dispatching that input, and requesting user interface controls to repaint. The message is clear: modern application development requires good parallel programming attitudes. This is especially true because of another reason apart from animations and smooth UIs. The average user does not have three documents overlapping one another on screen, and probably focuses on one-two applications max, but still expects the one-two applications on screen to show all content that is required without flipping through pages and pages of interface. Those two-three panels in the app should show all content that is relevant. Maybe it’s not MDI (multiple documents interfaces), but often large screens are expected to be filled with MDVI (multiple data views interfaces). Handling updates to the different views of data require again a good dose of parallelism. I really like this trend of lean and mean for the user, pushing the organizational work to the side of the application. As to the actual technical choices of parallelization techniques, there are many articles, books and posts around. It’s a topic I really like, so I’ll probably post about it soon.