So Damn Close

So my latest interest has been trying to squeeze performance out of simple algorithms - mostly so I can understand the impact of branch misses, lookup strategies, etc.

I spent Sunday writing an optimized solution to the language benchmark game’s reverse-complement challenge. I ended up doing all kinds of hacky things I’d never recommend doing in prod, like writing a custom vector and writing tricksy algorithms. Repo here, submission here.

Well, for all my hard work, I managed to come… Second! To, of course, a much tidier Rust implementation (❤️). Why? Not because the Rust solution is more efficient (it’s not: it takes at least 2x more cycles and memory than my single-threaded C++ implementation), but because the Rust implementation throws threads at the problem, which is the true power of Rust (in addition to the fact that the Rust version could be made just as efficient as the C++ one by adding some SIMD and unsafe code).


Implementing Rust Async and Futures from Scratch

As is tradition for many developers stuck at the family home over xmas, I decided to go hack something.

Asynchronous programming is becoming more popular in all major languages. C++20 is going to get co_await and friends, Python has had async / await since 3.5, and Rust has async / .await. Rust’s implementation of Future<T> is unusual. It uses a “polling”-based interface, where the listener “polls” for updates but (and this is why I am making judicious use of quotation marks) polling only occurs when the asynchronous event source “wakes” the poller. Polling therefore happens only when a state change occurs, rather than continuously.
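To make the “poll only when woken” idea concrete, here is a minimal sketch using only the standard library. It is not the implementation from the post: `TimerFuture`, `Shared`, `ThreadWaker` and `block_on` are hypothetical names invented for illustration, and the "event source" is assumed to be a plain background thread that sleeps and then calls `wake()`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

/// State shared between the future and its event source (hypothetical).
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Hypothetical future that completes once a background thread sets `done`.
struct TimerFuture {
    shared: Arc<Mutex<Shared>>,
}

impl TimerFuture {
    fn new(delay: Duration) -> Self {
        let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
        let for_thread = Arc::clone(&shared);
        thread::spawn(move || {
            thread::sleep(delay);
            let mut s = for_thread.lock().unwrap();
            s.done = true;
            // The event source "wakes" the poller; this triggers the next poll.
            if let Some(w) = s.waker.take() {
                w.wake();
            }
        });
        TimerFuture { shared }
    }
}

impl Future for TimerFuture {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut s = self.shared.lock().unwrap();
        if s.done {
            Poll::Ready("timer fired")
        } else {
            // Not ready: remember who to wake, then report Pending.
            s.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Waker that unparks the thread running the executor loop.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor: poll, then sleep until woken.
/// Returns the output and how many times `poll` was called.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
        // Park instead of spinning; a wake (or a rare spurious wakeup) resumes the loop.
        thread::park();
    }
}

fn main() {
    let (out, polls) = block_on(TimerFuture::new(Duration::from_millis(50)));
    println!("{out} after {polls} poll(s)");
}
```

The executor here parks its thread between polls, so the future is normally polled only a couple of times (once up front, once after the wake) instead of in a busy loop. A real executor would manage many tasks and queues, but the contract between `poll`, `Waker` and the event source is the same.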


Demoing PetaSuite Protect at ASHG 2019

I went to Houston for ASHG 2019 with PetaGene to demo PetaSuite Protect, one of the products I’m helping to develop.

Giving tech demos is always a daunting task, especially because we gave our tech demos completely freeform - typing shell commands in front of clients is always fun ;). The demos were delivered without a hitch, though, so there’s something to be said about the effectiveness of writing bash scripts during a long-haul airplane journey.


igv.js: porting a large C/C++ codebase into browsers

One of the more interesting projects I’ve worked on recently is using emscripten to port PetaGene’s high-performance decompression suite to wasm so that it can run in a browser with no installation.

It required figuring out where to draw the line between having a fully async API (ideal for javascript) and using Emscripten’s asyncify to emulate synchronous IO (ideal for standard C/C++ applications). It also required an ill-thought-out optimization to igv.js, which prompted a much better fix by the maintainer. This is why I like the OSS model: even bad ideas can prompt a discussion about better ones.


PetaGene wins Bio-IT World 2019

PetaGene won best of show for their latest product, PetaSuite Protect (link, archive). I had a great time at the event: people were super interested to learn what compression and encryption can do for them. I am looking forward to helping develop the PetaSuite Protect product :)