I wish authors who write articles about C# async / await programming techniques would emphasize that two concepts they frequently discuss are actually separate features that provide separate benefits. Perhaps I’ve overlooked an explanation of the orthogonality of async and concurrent in the restaurant or cooking breakfast analogies. (Update 2020 Jul 30: The cooking breakfast article has been updated and draws a clearer distinction between async work and parallel / concurrent work). Nonetheless, it will benefit readers of this blog to disentangle the concepts here.
It’s easy for novice programmers to write incorrect code and believe they’re getting performance benefits when they’re not (if they don’t measure performance, they remain ignorant) or get frustrated when they do measure performance and find no speedup compared to non-async sequential code.
So, I’ll state the concepts explicitly and write code to demonstrate potential performance and code correctness issues (the dreaded non-deterministic, multi-threading bug) one may encounter if these distinct concepts are jumbled in one’s mental model of async / await concurrent programming. OK, here are the distinct concepts:
- Async is about being a good neighbor.
- Concurrent is about performing several steps simultaneously in order to complete a multi-step process faster.
Take caution:
- Adding the syntactic sugar of
async
andawait
keywords to method declarations and call sites does not magically transform your program from sequential execution to concurrent execution. - Pay very close attention to when you
await
code.
Let’s write some boilerplate code to generate a PC report. I’ll use this code to demonstrate how to retrieve data using async programming techniques. Then we’ll discuss how the async
keyword does not automatically make the code run concurrently. Next, we’ll examine how to refactor the code so it does run concurrently (and faster). Finally, we’ll discuss how to avoid obvious and not-so-obvious multi-threading bugs.
First, the boilerplate code:
Let’s write a thread-safe class that writes messages to the Console. The code writes colored text so we may quickly distinguish output from different methods.
Let’s write a class to hold the report data.
Let’s write methods that gather data related to a PC: the name of the principal user of the PC, file logs, event logs, and database logs. We’ll declare the methods as async
and use await Task.Delay
to simulate I/O latency involved in gathering data (reading disk, calling the event logs API, and querying a database).
Notice each method accepts a PcReport parameter, then adds content to that object. Now let’s write a method that calls these methods asynchronously.
Run the program, specifying a computer name, the last ten days of logs, and the “sequential” method name.
PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish> .\ErikTheCoder.Sandbox.PcReport.exe HomePC 10 sequential 00.003 Thread01 Adding primary user of HomePC. 00.094 Thread04 The primary user is HomePC User. 00.105 Thread04 Adding file logs for HomePC. 01.616 Thread05 File logs = File Log 1, File Log 2, File Log 3. 01.648 Thread05 Adding event logs for HomePC. 02.163 Thread05 Event logs = Event Log 1, Event Log 2, Event Log 3. 02.186 Thread05 Adding database logs for HomePC. 03.194 Thread05 Database logs = Database Log 1, Database Log 2, Database Log 3.
Wait a second. As the method name suggests, and the color coding reveals, it appears this code runs sequentially: the primary user name is added, then the file logs, then the event logs, then the database logs. The program invokes a method and waits for results before invoking the next method. No two methods ever are “in flight” at the same time. Nothing is running concurrently. But the await / async keywords are supposed to enable concurrency, right?
No. Well yes, but it’s incumbent on the programmer to write the code correctly. The above code is incorrect. Async / await prevents code from hoarding a thread while waiting for an I/O call to return. That’s what I meant when I said earlier that writing async code is like being a good neighbor. The async
keyword instructs the runtime to invoke a method’s continuations (code following an await
statement) on a ThreadPool thread. The await
keyword instructs the runtime to release the current thread back to the operating system while waiting for a disk read, database query, or other I/O call to return. Control is yielded to the calling method, which can make forward progress until it encounters an await
statement, where control is yielded to its caller, on and on recursively up the call stack. If control reaches the top of the call stack, other processes (or other threads in your process) may use the thread to complete their work while your program is blocked. Threads are a scare resource in an operating system. So by releasing them, rather than hoarding them while doing nothing useful (waiting for an I/O call to return), you’re being kind to your neighbors- the other processes running on the computer. As a consequence, the server can do more work per fixed unit of time when running programs written using async / await techniques than when running programs written using non-async techniques. This increases the scalability of the server. Another way to look at it is async / await programming techniques decrease the hardware budget required to run a fixed workload compared to non-async programming techniques.
To enable concurrency, don’t await
each method call individually. At each method call site, capture a Task
or generic Task<T>
, add it to a collection, then await
all tasks in the collection.
Run the program again, specifying the same computer name and last ten days of logs, but specify the “concurrent-race” method name.
PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish> .\ErikTheCoder.Sandbox.PcReport.exe HomePC 10 concurrent-race 00.003 Thread01 Adding primary user of HomePC. 00.019 Thread01 Adding file logs for HomePC. 00.021 Thread01 File logs = . 00.026 Thread01 Adding event logs for HomePC. 00.027 Thread01 Event logs = . 00.033 Thread01 Adding database logs for HomePC. 00.038 Thread01 Database logs = . 00.101 Thread04 The primary user is HomePC User.
OK, code is running concurrently now. Notice the cyan text interleaves with the other colors. However, the code does not function properly. It retrieves the primary user of the PC but it fails to retrieve any logs. What happened? Our code is vulnerable to a race condition. Because each Add method first checks if the PrimaryUser is not null before bothering to retrieve logs, and the AddUser and other Add methods run concurrently, and it takes 75 milliseconds to gather user data, the PrimaryUser is guaranteed to be null when the other methods check its value. Therefore, they return without adding any logs to the PcReport object.
This is what I labeled above as an obvious multi-threading bug. Let’s move the location of the Task.Delay
statement to expose a not-so-obvious multi-threading bug. Let’s simulate I/O latency prior to checking the value of the PrimaryUser property. This makes our code a little less realistic (why do all this work gathering logs only to throw them away?) but hey, it’s a demo.
Run the program again, specifying the same computer name, last ten days of logs, and “concurrent-race” method name.
PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish> .\ErikTheCoder.Sandbox.PcReport.exe HomePC 10 concurrent-race 00.003 Thread01 Adding primary user of HomePC. 00.019 Thread01 Adding file logs for HomePC. 00.025 Thread01 Adding event logs for HomePC. 00.030 Thread01 Adding database logs for HomePC. 00.106 Thread04 The primary user is HomePC User. 00.529 Thread05 Event logs = Event Log 1, Event Log 2, Event Log 3. 01.037 Thread05 Database logs = Database Log 1, Database Log 2, Database Log 3. 01.523 Thread05 File logs = File Log 1, File Log 2, File Log 3.
The code runs concurrently and retrieves all logs. It also runs faster than our sequential code, completing in 1.523 seconds instead of 3.194 seconds. Excellent. We’re being good neighbors and writing high-performance code. Now run the program specifying the same computer name and “concurrent-race” method name, but specify one day.
PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish> .\ErikTheCoder.Sandbox.PcReport.exe HomePC 1 concurrent-race 00.003 Thread01 Adding primary user of HomePC. 00.022 Thread01 Adding file logs for HomePC. 00.029 Thread01 Adding event logs for HomePC. 00.034 Thread01 Adding database logs for HomePC. 00.098 Thread04 Event logs = . 00.112 Thread04 The primary user is HomePC User. 00.139 Thread04 Database logs = Database Log 1, Database Log 2, Database Log 3. 00.185 Thread04 File logs = File Log 1, File Log 2, File Log 3. PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish>
Do you spot the bug? The program fails to retrieve event logs. Why? Well, all the Add methods run concurrently. The primary user is retrieved in 75 milliseconds. It takes 150 and 100 milliseconds to retrieve file and database logs, respectively. It takes only 50 milliseconds to retrieve event logs, so they’re retrieved prior to retrieving the primary user. So when the AddEventLogs method checks the value of the PrimaryUser, it’s null. Therefore, it returns without adding any event logs to the PcReport object. This illustrates the dangers of passing a common object to multiple methods that are executing concurrently (that is, simultaneously). If any of those methods mutate the state of the object, they may inadvertently affect logic in the other methods. This is known as a race condition. The situation in real-world programs is even more complicated than in my contrived demo here, in that I/O latency in real-world programs is not so predictable. It’s not a fixed constant, it varies based on server load and network congestion. This makes multi-threading race-condition bugs difficult to spot and frustrating to debug, because the timing of thread interleaving is non-deterministic.
How can we improve our code? We avoid passing a common object to multiple methods that execute concurrently. Let’s capture the results of method calls, then assemble the object after all method calls have returned.
Run the program again, specifying the same computer name, one day of logs, but specify the “concurrent” method name.
PS C:\GitHub\Personal\Sandbox\Async Concurrent\bin\Publish> .\ErikTheCoder.Sandbox.PcReport.exe HomePC 1 concurrent 00.004 Thread01 Adding primary user of HomePC. 00.021 Thread01 Adding file logs for HomePC. 00.027 Thread01 Adding event logs for HomePC. 00.033 Thread01 Adding database logs for HomePC. 00.080 Thread04 Event logs = Event Log 1, Event Log 2, Event Log 3. 00.107 Thread04 The primary user is HomePC User. 00.140 Thread04 Database logs = Database Log 1, Database Log 2, Database Log 3. 00.177 Thread05 File logs = File Log 1, File Log 2, File Log 3.
The program runs quickly and retrieves the primary user and logs concurrently. Sequential retrieval would take 375 milliseconds (75ms + 150ms + 50ms + 100ms) and this program completes in 177 milliseconds. Success!
You may review the full source code in the Async Concurrent folder of my Sandbox project in GitHub.