Leaders on the business side of tech companies rightfully extol the virtues of focusing on minimum viable products, not letting perfect be the enemy of good, and shipping only what’s necessary as quickly as possible. These goals certainly resonate with me as an operational founder, and I emphasize them with my teams daily.
But at my core, I’m a software engineer, so my instincts are skewed towards a mindset with an opposite set of goals: striving for comprehensive solutions, building things that last, and mitigating problems before they arise…
…aka premature optimization.
A couple of years ago, I added the Plugin Framework to Concourse so developers could extend it with modular enhancements. The framework was designed to allow each plugin to run in a separate process and communicate with the database server over dedicated interprocess communication (IPC) channels.
While there are a lot of good native approaches to low-level IPC (e.g. domain sockets or named pipes), Concourse seeks to be as cross-platform as possible, so I was limited to higher-level constructs like anonymous network sockets or rolling my own solution from scratch.
I chose the latter.
The technical benefits were clear — building my own IPC over shared memory would minimize latency and facilitate a future improvement where plugins could communicate with each other directly using the same shared memory segment (anonymous sockets require binding to a distinct port for each unique plugin pair that exchanges messages).
But this was a foolhardy approach.
Before the feature was even working, I was trying to make it fast. Even worse, I was designing for possible future improvements before the core functionality was complete.
Needless to say, I wasted a lot of time going down this path.
First, I had to implement shared memory using memory-mapped files and a combination of spin-locks and file system notifications to facilitate efficient message consumption by a neighboring process.
View on GitHub: Premature optimization for reading shared memory messages
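To make the approach concrete, here is a minimal sketch of the core idea (the class and message layout are hypothetical illustrations, not Concourse’s actual implementation): a writer publishes a length-prefixed message into a memory-mapped file, and a reader spins on the length header until a message appears.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: a single length-prefixed message slot backed by a
// memory-mapped file. Bytes 0-3 hold the message length (0 = empty) and
// the payload follows. The writer stores the payload first and publishes
// the length last; the reader spins on the length header.
public class MappedMessageSlot {

    private final MappedByteBuffer buffer;

    public MappedMessageSlot(Path file, int capacity) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
        }
    }

    public void write(byte[] message) {
        buffer.position(4);
        buffer.put(message);
        buffer.putInt(0, message.length); // publish the length last
    }

    public byte[] read() {
        int length;
        while ((length = buffer.getInt(0)) == 0) {
            Thread.onSpinWait(); // spin until a writer publishes a message
        }
        byte[] message = new byte[length];
        buffer.position(4);
        buffer.get(message);
        return message;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ipc", ".shm");
        MappedMessageSlot slot = new MappedMessageSlot(file, 1024);
        slot.write("hello".getBytes());
        System.out.println(new String(slot.read())); // prints "hello"
    }
}
```

A real version needs much more: multiple slots, wraparound, and a way to park the reader instead of spinning forever, which is where the file system notifications came in.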
But I quickly ran into a problem where the underlying memory-mapped files grew too large and the off-heap memory usage exploded. So, I had to add compaction logic to the shared memory implementation.
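The compaction idea itself can be sketched in a few lines (hypothetical names; the real logic also has to coordinate with concurrent readers): once a prefix of the buffer has been consumed, shift the unread suffix back to the start so the mapped region does not have to keep growing.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: discard the consumed prefix of a buffer by
// shifting the unread bytes back to position 0.
public class BufferCompactor {

    // Drops the first `consumed` bytes and moves the next `remaining`
    // unread bytes to the front of the buffer.
    public static void compact(ByteBuffer buffer, int consumed, int remaining) {
        buffer.position(consumed).limit(consumed + remaining);
        buffer.compact(); // copies [consumed, consumed + remaining) to index 0
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put("msg1msg2".getBytes());
        compact(buffer, 4, 4); // "msg1" was consumed; keep "msg2"
        System.out.println((char) buffer.get(0)); // prints "m"
    }
}
```

The same call works on a `MappedByteBuffer`, but in a shared-memory setting the shift is only safe while no other process is reading or writing the region, which is exactly the concurrency problem described next.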
This implementation worked for Concourse instances with only one plugin! But chaos ensued with multiple plugins installed because the system lacked concurrency controls to regulate the different communication streams.
So, I had to build synchronization logic from scratch. This is hard enough on its own, but the compaction functionality compounded the problem even further.
In the end, I developed an intricate system that uses a combination of both JVM and file locks to coordinate readers, writers, and compaction. I was certainly proud of myself and impressed by the feat, but my work ultimately proved unusable due to a MAJOR limitation inherent in most file systems: notifications about file changes only surface about once per second, so it is impossible to reliably detect changes that happen in 999ms or less.
View on GitHub: another attempt to cover all the corner cases in FileOps#awaitChange
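The two-level locking scheme looks roughly like this (a hypothetical simplification, not the framework’s actual code): a JVM lock serializes threads inside one process, and a `FileLock` then excludes other processes. Both are needed because OS file locks are held on behalf of the whole process, not an individual thread.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of two-level locking: a JVM lock coordinates
// threads within this process, then a FileLock coordinates across
// processes sharing the same file.
public class CrossProcessLock implements AutoCloseable {

    private static final ReentrantReadWriteLock jvmLock =
            new ReentrantReadWriteLock();

    private final FileChannel channel;
    private final FileLock fileLock;

    private CrossProcessLock(FileChannel channel, FileLock fileLock) {
        this.channel = channel;
        this.fileLock = fileLock;
    }

    public static CrossProcessLock acquireWrite(Path file) throws IOException {
        jvmLock.writeLock().lock(); // exclude other threads in this JVM first
        FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = channel.lock(); // then exclude other processes
        return new CrossProcessLock(channel, lock);
    }

    @Override
    public void close() throws IOException {
        fileLock.release(); // release in reverse order of acquisition
        channel.close();
        jvmLock.writeLock().unlock();
    }
}
```

Even this simplified version hints at the fragility: lock ordering, release-on-failure paths, and read vs. write modes all multiply once compaction joins the party.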
In total, I spent a month developing this system and another two weeks trying to debug intermittent issues. Eventually, I was fed up and decided to scrap it in favor of anonymous network sockets.
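For contrast, the socket approach is almost trivially simple. A rough sketch (hypothetical helper, not the framework’s actual code): binding a `ServerSocket` to port 0 lets the OS assign a free ephemeral port, which one side then advertises to the other.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: an anonymous loopback socket. Port 0 asks the OS
// to pick any free port; the chosen port is shared out of band.
public class AnonymousSocketChannel {

    public static String roundTrip(String message) throws IOException {
        try (ServerSocket server =
                new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = server.getLocalPort(); // OS-assigned ephemeral port
            try (Socket client =
                    new Socket(InetAddress.getLoopbackAddress(), port)) {
                client.getOutputStream().write(message.getBytes());
                client.shutdownOutput(); // signal EOF to the reader
                try (Socket accepted = server.accept()) {
                    ByteArrayOutputStream received = new ByteArrayOutputStream();
                    accepted.getInputStream().transferTo(received);
                    return received.toString();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping")); // prints "ping"
    }
}
```

The OS handles buffering, signaling, and cross-process synchronization, which is precisely the machinery I had spent six weeks rebuilding by hand.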
And two years later, Concourse’s plugin framework is VERY reliable and hasn’t been plagued by any bugs related strictly to the communication infrastructure.
Sure, the pros of using shared memory still stand and the use of anonymous sockets does cause Concourse plugins to need more operating system resources. But the feature works and provides value. And that is the most important thing.
As a developer, I understand that the urge to optimize prematurely is a habit for some of the best software engineers and architects. But, it’s imperative to keep that inclination in check to ensure you focus on quickly delivering value, getting feedback from users and making the right trade-offs as your software evolves.