This post is part 2/5 of my Data-Driven Code Generation of Unit Tests series.
This blog post explains how I used CMake, Jinja2, and the Boost Unit Test framework to perform data-driven code generation of unit tests for a financial performance analytics library. If you haven’t read it already, I recommend starting with Part 1: Background.
All performance analytics metadata is stored in a single file called metadata.csv. This file contains the complete list of calculations and, for each calculation, its settings (i.e. how it differs from other calculations), including properties like:
This post is part 1/5 of my Data-Driven Code Generation of Unit Tests series.
At Morningstar, I created a multi-language, cross-platform performance analytics library which provides both online and offline implementations of a number of common financial analytics such as Alpha, Beta, R-Squared, Sharpe Ratio, Sortino Ratio, and Treynor Ratio (more on this library later). The library relies almost exclusively on a comprehensive suite of automated unit tests to validate its correctness. I quickly found that maintaining a nearly-identical battery of unit tests in three different programming languages was a chore, and I had a hunch that a common technique could deal with this problem: code generation.
I’d like to call out this particular assertion made by him way back in 2010:
[D]evelopers must move towards single-threaded programming models connected through message passing, optionally with provably race-free fine-grained parallelism inside of those single-threaded worlds.
Add “async/await everywhere” and you can sign me up!
Here’s why I try to avoid thread-based programming models for expressing concurrency:
The opponents of thread-based systems line up several drawbacks. For Ousterhout, who probably published the most well-known rant against threads [Ous96], the extreme difficulty of developing correct concurrent code – even for programming experts – is the most harmful trait of threads. As soon as a multi-threaded system shares a single state between multiple threads, coordination and synchronization become an imperative. Coordination and synchronization require locking primitives, which in turn bring along additional issues. Erroneous locking introduces deadlocks or livelocks, and threatens the liveness of the application. Choosing the right locking granularity is also a source of trouble. Too coarse locks slow down concurrent code and lead to degraded sequential execution. By contrast, too fine locks increase the danger of deadlocks/livelocks and increase locking overhead. Concurrent components based on threads and locks are not composable. Given two different components that are thread-safe, a composition of them is not thread-safe per se. For instance, placing circular dependencies between multi-threaded components unknowingly can introduce severe deadlocks.
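The lock-composition hazard described above can be made concrete with a small sketch. The `Account` and `Bank` types below are purely illustrative (not from any of these posts): two concurrent transfers that acquire the same pair of locks in opposite order can deadlock, and one conventional fix is to always acquire locks in a single global order so no wait cycle can form.

```csharp
using System.Runtime.CompilerServices;

class Account
{
    public decimal Balance;
    public readonly object Gate = new object();
}

static class Bank
{
    // Deadlock-prone composition: if thread A calls TransferUnsafe(x, y)
    // while thread B calls TransferUnsafe(y, x), each thread takes its
    // first lock and then blocks forever waiting for the other's.
    public static void TransferUnsafe(Account from, Account to, decimal amount)
    {
        lock (from.Gate)
        lock (to.Gate)
        {
            from.Balance -= amount;
            to.Balance += amount;
        }
    }

    // Fix: impose a single global acquisition order (here, by runtime
    // identity hash; collisions are ignored for brevity) so that no
    // cycle of waiting threads can form.
    public static void Transfer(Account from, Account to, decimal amount)
    {
        var (first, second) =
            RuntimeHelpers.GetHashCode(from) < RuntimeHelpers.GetHashCode(to)
                ? (from, to)
                : (to, from);
        lock (first.Gate)
        lock (second.Gate)
        {
            from.Balance -= amount;
            to.Balance += amount;
        }
    }
}
```

Note that the fix only works if every caller in the system follows the same ordering discipline — which is exactly the non-composability the quote is pointing at.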
This is part 17/17 of my Exploring the .NET CoreFX series.
Microsoft’s .NET Core team has posted a videotaped API review session where they show how they review API enhancement suggestions. I thought the video was quite educational.
This is part 16/17 of my Exploring the .NET CoreFX series.
While .NET has historically been limited to Windows machines, Mono notwithstanding, the cross-platform .NET Core runtime makes it possible to run .NET Core applications on Unix machines as well. With this possibility comes the need to write platform-specific code.
One way to write platform-specific code is:
1. Define a conceptual base class which will have an identical name and methods across all platforms. This does not need to be a C# interface, as we will be using compile-time rather than run-time polymorphism.
2. Provide an implementation of this class for each target platform.
3. Use build-time conditions to include the platform-specific class based on the target compilation platform.
An example from .NET Core is the System.Console.ConsolePal class from the System.Console library. The library includes two implementations of this class:
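The shape of the pattern can be sketched as follows. The two file names follow the corefx convention of one ConsolePal file per platform; the member shown is illustrative, not the actual corefx surface. Only one of the two files is included in any given build — for example via a `Condition` attribute on the `<Compile>` item in the project file — so both can declare a type with the identical name.

```csharp
// ConsolePal.Windows.cs -- compiled only into the Windows build
internal static class ConsolePal
{
    // Illustrative member; the real class wraps platform console APIs.
    internal static string NewLine => "\r\n";
}

// ConsolePal.Unix.cs -- compiled only into the Unix build.
// Note the identical type name and method surface, so calling code
// needs no #if blocks and no virtual dispatch.
internal static class ConsolePal
{
    internal static string NewLine => "\n";
}
```

Because the selection happens at build time, callers reference `ConsolePal` directly and pay no run-time polymorphism cost.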
This is part 14/17 of my Exploring the .NET CoreFX series.
Back in 2013, Immo Landwerth and Andrew Arnott recorded a Going Deep video called Inside Immutable Collections which describes how and why System.Collections.Immutable is built the way it is. It’s great background material to understand System.Collections.Immutable.
This is part 13/17 of my Exploring the .NET CoreFX series.
Most implementations of IList, including System.Collections.Generic.List, are dynamic arrays. System.Collections.Immutable.ImmutableList is different – it is an AVL tree. This results in significantly different performance characteristics:
| Operation | List | ImmutableList |
|---|---|---|
| Indexing | O(1) | O(log n) |
| Append | O(1) average, O(n) worst-case | O(log n) |
| Insert at arbitrary index | O(n) | O(log n) |
| Remove | O(n) | O(log n) |
| Memory layout | Contiguous for value types | Non-contiguous |
The data structure behind ImmutableList was likely chosen so that modifications to the list are non-destructive and require minimal data copying.
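The non-destructive behavior is easy to see with the real System.Collections.Immutable API (the `Demo` wrapper is just scaffolding). Each "modification" returns a new list and rebuilds only the path from the affected leaf to the root of the AVL tree; all other nodes are shared with the original.

```csharp
using System;
using System.Collections.Immutable;

class Demo
{
    static void Main()
    {
        var original = ImmutableList.Create(1, 2, 3);

        // Add and Insert are O(log n) and leave `original` untouched;
        // the new lists share most of their tree nodes with it.
        var appended = original.Add(4);
        var inserted = appended.Insert(0, 0);

        Console.WriteLine(original.Count); // 3 -- unchanged
        Console.WriteLine(inserted.Count); // 5
    }
}
```

This structural sharing is why an AVL tree beats a dynamic array here: a persistent array-backed list would have to copy all n elements on every modification.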
This is part 12/17 of my Exploring the .NET CoreFX series.
In C++, the inline keyword allows a developer to provide a hint to the compiler that a particular method should be inlined. C# has a similar ability but uses an attribute instead:
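The attribute in question is `MethodImplAttribute` with `MethodImplOptions.AggressiveInlining`, from System.Runtime.CompilerServices; the `Square` method below is just a placeholder to show the placement.

```csharp
using System.Runtime.CompilerServices;

static class MathHelpers
{
    // Asks the JIT to inline this method at its call sites. Like
    // C++'s `inline`, this is a hint: the runtime may still decline
    // (e.g. for methods it considers too large).
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int x) => x * x;
}
```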