Yay, I have a new job! I’m now an Open-Source Software Developer at TU Delft, where I’m going to be working on biomechanical simulation software.
Why do I mention this? Because I’m initially tasked with trying to make OpenSim faster. That beautifully ties together a few of my loves (research software, systems development, and low-level perf optimizations) with a few of my hates (software written by researchers, C++, and diagnosing cache misses), and I’ve been wanting to learn and write about performance for a while.
With that in mind, I would like to write a few short blog posts on any interesting performance topics I come across while working on OpenSim. These posts are mostly for my own record, or (at best) as a way of articulating my work to other people on the project. I figured that, because this work is going to be open-source, there’s little downside to sharing my notes publicly.
While reading my perf posts, it’s important to keep in mind who’s writing this (me), what OpenSim is, and what you should probably already know (basic C/C++):
I’m a general software developer, not an academic researcher. I have a background in research, but my professional expertise is in engineering stable products/systems.
Therefore, any perf posts will focus on established software techniques, rather than anything research-grade.
So, if you want to read about simple profiling techniques, you’ve come to the right place. If you want to read about novel collision detection algorithms, wrong place.
OpenSim is a large (>100 kLOC) C++ application written by clever people with PhDs in biomechanics:
Therefore, any performance changes have to “fit” into the existing codebase extremely cleanly. This can include (for example) reproducing buggy behavior or supporting technically-incorrect and legacy API usage patterns.
It also means that there is a lot of code in OpenSim that’s faaar too specialized for a generalist like me to feasibly learn and reimplement from scratch. There is code in OpenSim that was written years ago by experts in the field who have since moved on. It would be foolish for me to (e.g.) try and reimplement algorithms that took an expert 5 years to develop the first time.
So, if you want to read about performance-tuning a large application extremely incrementally without breaking too much, right place. If you want to read about performance-tuning a small, standalone application with no wider context, wrong place.
OpenSim is a (mostly) single-threaded non-distributed application:
Therefore, these performance-related posts are mostly going to be focused on single-process perf. optimization (reducing cache misses, minimizing memory use, cleaning up a single application), rather than distributed application optimization (logging events, measuring network bottlenecks, etc.).
So, if you want to read about finding performance hotspots in locally run applications, right place. If you want to read about performance-optimizing a distributed application (plus all of the other crap that might entail, like figuring out why your cloud servers are slower on Tuesday mornings), wrong place.
I’m going to assume you know C/C++ development and general coding principles (functions, IO, etc.) for the more general posts.
I prefer dumb, easy-to-debug patterns. Most of the C++ code I will be showing should fall into this category. You will not need to know the more advanced topics (SFINAE, virtual inheritance, etc.), but I might dip into those occasionally.
In lower-level posts, I might write statements like “the T.x’s are spread far apart in memory, so performance suffers due to L1 misses”, or something like that. You can probably ignore those posts, because they’re at a level that is faaar below most of the profiling work I do (most performance problems in large systems are much more boring).
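To make that kind of statement a bit more concrete, here is a minimal sketch of what “the T.x’s are spread far apart in memory” can look like. Everything in it (the `T` struct, its padding field, the `Ts` struct, and both `sum_x_*` functions) is made up for illustration and has nothing to do with OpenSim’s actual code. It assumes a 4-byte `float`, which holds on typical platforms.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for illustration: the hot loop below only needs `x`.
struct T {
    float x;
    float other[15];  // fields the loop never reads
};

// Array-of-structs layout: consecutive T.x values sit sizeof(T) bytes apart
// (64 bytes with 4-byte floats), so each cache line fetched for this loop
// carries only one useful float.
float sum_x_aos(const std::vector<T>& ts) {
    float sum = 0.0f;
    for (const T& t : ts) {
        sum += t.x;
    }
    return sum;
}

// Struct-of-arrays layout (also hypothetical): the x values are contiguous,
// so consecutive reads are only sizeof(float) bytes apart.
struct Ts {
    std::vector<float> x;
};

float sum_x_soa(const Ts& ts) {
    float sum = 0.0f;
    for (float v : ts.x) {
        sum += v;
    }
    return sum;
}
```

Both functions compute the same sum. The only difference is how far apart the values are in memory, which is the sort of thing that decides whether a loop over a large dataset keeps missing the L1 cache. Whether that matters in practice is something you measure, not something you assume.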
So, if you want to read about some C++ perf work I did without too much explanation about every implementation step, right place. If you want to learn C++, wrong place.
Ok, that’s the disclaimers sorted. Now to actually write some posts.