Published on: 17 August 2022 | Last updated: 17 August 2022

There is this mantra that I have been hearing too much in my life: “don’t reinvent the wheel”. People mean it as a metaphor to discourage you from redoing something that has already been done; what they actually want is for you to use some existing application or library instead of coding your own, as if that automatically saved time and resources. It doesn’t.

First of all, despite its apparent common sense, this statement is incredibly ignorant: the wheel has been reinvented many times, typically for the sake of performance. Wheels have been all about performance from day one anyway: their basic purpose is to carry heavy loads along roads with less friction, which reduces the energy needed or lets you carry more with the same energy.

Bicycles used to use wooden wheels. Then they moved to steel ones with spokes. And later to double-walled aluminium ones, lighter while being more rigid and less likely to warp. Now, professional road cycling uses solid lenticular wheels for time trials, and deep-section carbon-fiber wheels for regular races. And that’s only talking about the rim. Tires have evolved a lot too, and can now be run without an inner tube to cut energy losses even further.

So there are many types of wheels, even for the same application, each of them designed for specific needs with specific trade-offs. While the very basic principle holds (a circular thing rolling on a surface so as to reduce friction), the implementations differ, and that is actually very welcome. The whole “don’t reinvent” attitude would have prevented all that. One could argue that wheels were only improved, not reinvented. But just looking at an aero carbon wheel and at a primitive wooden wheel shows they have so little in common (the circular outer shape…) that it’s pretty much a redesign from scratch.

Now, moving on to software development, what does “not reinventing”, a.k.a. reusing, actually entail?

If we are talking about end-user applications, there are 2 main problems:

  1. the fact that apps may not be exactly tailored to your needs,
  2. the fact that apps may already be riddled with a bloated list of features you don’t need.

#1 means you have to work around whatever your app lacks, maybe hack it into doing things it was never designed for, and maybe your hack will not survive the next update. It will surely be inefficient and quite frustrating. This usually happens with Microsoft Excel, which people hack in all possible ways to avoid using proper database, accounting, data-mining, statistics, project-management, scheduling or contact-management software, or even simple HTML tables to present data. For literally everything Excel can do, there is a specialized app that does it better.

#2 will force you to use the app with a compass to find where the subset of features you actually use is hidden. This will be tiring and the learning curve may be steep. On top of that, all those features running in the background will probably load your computer or phone with unnecessary computations that drain your battery and make the GUI sluggish for nothing, forcing you to invest in expensive hardware to do the exact same thing you did 15 years ago with a Pentium 4. This commonly happens with Adobe Photoshop, a piece of software you can use to retouch photographs, draw from scratch, lay out websites, etc., and where even the most skilled users will never use more than 20 to 25% of the features.

For libraries, it’s mostly the same, but with some additions:

  3. libraries need to be general, which prevents many optimizations and makes them slow in general,
  4. the library documentation may be cryptic, outdated or even non-existent,
  5. the library maintainer may change the API at some point, forcing you to rewrite all your interfaces and to handle user-side lib versions manually,
  6. bugs and regressions may appear with lib updates, over which you have no control and typically no warning,
  7. the end-user may have a lib conflict on their OS, with some app requiring some particular version, and yours requiring a newer/older one,
  8. the Linux distro of your end-user may compile the lib with exotic features on or off, which may prevent it from working at all with your app.

#1 for libs means you may need to write extra layers on top of the lib to adapt it to your needs, which deepens your dependency on the lib API, and that API can get nuked at any major update. All in all, going this way might take even more time and resources than developing your own lib from scratch.

#2 is a typical problem with all those stupid JavaScript frameworks: jQuery, Angular, Backbone, Electron, React, etc. Ever since Web 2.0, they have enabled mediocre, abstraction-disabled developers to whip up websites and apps in no time, by allowing them to just stack pre-built functions without having to think about actual algorithms. So, the typical use case is: load a 200 MB lib client-side, use just a couple of methods from the lib, drain the device’s battery, memory and the user’s data plan doing so, and sing the praises of the magical dev-friendly world we live in. Nonsensical.

#3 is very annoying in high-performance computing, like real-time image processing. Knowing the memory layout of your data enables several computational optimizations, because accessing memory contiguously is more efficient and, since the most expensive task is moving data from memory to the CPU registers, once it is there you might as well squeeze in as many arithmetic operations as you can. Libraries tend to embed the pixel loops at the method level, so each method call reloads the whole picture pixel by pixel, which means the memory-access overhead is paid again at every step. Writing your own methods lets you collapse loops, enable SIMD vectorization, parallelization and such. Of course, it requires more skills and more maths.
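
As a minimal sketch of what collapsing loops means, here is a made-up three-step pipeline (gain, offset, clamp) applied to a flat `float` buffer. The function names and the pipeline itself are hypothetical and not taken from any real library; the first variant only imitates the one-loop-per-method style described above, and the second does the same work in a single pass.

```c
#include <stddef.h>

/* Library-style steps (hypothetical): each one owns its pixel loop,
   so the whole buffer is read and written once per step. */
static void scale(float *px, size_t n, float gain)
{
  for(size_t i = 0; i < n; i++) px[i] *= gain;
}

static void offset(float *px, size_t n, float bias)
{
  for(size_t i = 0; i < n; i++) px[i] += bias;
}

static void clamp01(float *px, size_t n)
{
  for(size_t i = 0; i < n; i++)
  {
    if(px[i] < 0.0f) px[i] = 0.0f;
    if(px[i] > 1.0f) px[i] = 1.0f;
  }
}

/* Three passes over the buffer: three loads and three stores per pixel. */
void pipeline_separate(float *px, size_t n, float gain, float bias)
{
  scale(px, n, gain);
  offset(px, n, bias);
  clamp01(px, n);
}

/* Fused variant: each pixel is loaded once, fully processed while it
   sits in a register, then stored once. The loop body has no
   dependency between iterations, which leaves the compiler free to
   vectorize it. */
void pipeline_fused(float *restrict px, size_t n, float gain, float bias)
{
  for(size_t i = 0; i < n; i++)
  {
    float v = px[i] * gain + bias;
    v = v < 0.0f ? 0.0f : v;
    v = v > 1.0f ? 1.0f : v;
    px[i] = v;
  }
}
```

Both variants are meant to give the same pixels; the fused one just touches memory a third as often. How much that buys in practice depends on the buffer size, the cache and the compiler flags, so it is something to measure rather than assume.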

#4 is something that happens to me all the time. I have reached a programming level such that coding my own libs in C to perform tasks I perfectly understand takes me less time than trying to navigate through shitty documentation or to reverse-engineer uncommented open-source code that uses only 2-letter variable names and unclear function names. I’m too old for that shit.

#5 has happened countless times with Gtk, and sometimes with Lensfun. As an app dev, the responsible thing to do would be to subscribe to the mailing lists of all the libs your app uses, but who has the time for that? API changes get noticed when a user reports them, which is already too late. Again, I have grown weary of relying on the (disputable) decisions of third-party dev teams, and my personal definition of happiness is: $\text{happiness} = \dfrac{1}{\text{number of people upon which you depend}}$.

#6 happens all the time with GPhoto2, an I/O lib heavily used to connect digital cameras to Linux filesystems in order to access their internal memory and SD card. Some of my cameras worked 1-2 years ago; since some update they no longer do, and nothing in the release notes mentions any work on those particular cameras, with good reason because they are 8 years old. I call these the “Schrödinger libs”: you never know whether the cat is alive or dead, so you try your luck every time. You really don’t want them in your app. Since the app is the user-facing front-end, the app dev will be the one looking like a fool unable to write stable code, and saying “it’s not me, it’s the lib” will only make you look like an irresponsible fool trying to blame someone else.

#7 has led to the rise of sandboxed and containerized environments, like Docker, Snap and Flatpak. To solve the issue of lib version incompatibilities, which used to be the whole point of Linux distributions (ensuring all software is compatible with each other and with the same libs by providing distro snapshots), they simply ship the libs they need in their own package and run them alongside the general system. So the same libs can be re-installed several times, as part of different packages, in different versions. This is a great waste of hard drive space, but it also diverts CPU cycles into the environment manager running as a daemon in the background, to connect with the desktop and the other “native” apps. Another great piece of bullshit solving bloat with more bloat, and don’t get me started on their futuristic marketing. From where I sit, the future looks like a generalized waste of CPU cycles and disk space made acceptable through bullshit buzzwords.

#8 happened to me with Exiv2 on Fedora 36. darktable needs Exiv2 v0.27.5 to be able to read the Canon CR3 raw file format, which is based on the ISOBMFF container. Exiv2 provides it, but there is a long-lasting and completely unfounded belief in open-source circles that ISOBMFF is patented and that supporting it may infringe on imaginary patents. So the Exiv2 devs made the feature optional, to be enabled with a compilation flag, and the Fedora packagers went more Catholic than the Pope and shipped it without ISOBMFF support. Even as a dev, it took me 2 h to figure out why CR3 files wouldn’t work in my own app, and it took a couple of messages on the Exiv2 Matrix/IRC channel. Forget it for a typical user. One more point for the stupid bloated Snap/Flatpak.
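
To make the failure mode concrete, here is a rough sketch of a feature that lives or dies at compile time, and of the kind of check an app could run to at least tell the user why a file is rejected. Everything named here is hypothetical: `IMGLIB_ENABLE_BMFF`, `imglib_has_bmff()` and `explain_unreadable()` are invented for illustration and are not the actual Exiv2 or darktable API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build switch: whoever compiles the lib (here, the distro
   packager) decides whether ISOBMFF parsing exists in the binary,
   e.g. by passing -DIMGLIB_ENABLE_BMFF=1. Off unless asked for. */
#ifndef IMGLIB_ENABLE_BMFF
#define IMGLIB_ENABLE_BMFF 0
#endif

/* Lib side (hypothetical): report what this particular build can do. */
bool imglib_has_bmff(void)
{
  return IMGLIB_ENABLE_BMFF != 0;
}

/* App side (hypothetical): return NULL when nothing is known to be
   wrong, or a human-readable reason when the installed lib build
   cannot handle this file extension. */
const char *explain_unreadable(const char *extension)
{
  if(strcmp(extension, "cr3") == 0 && !imglib_has_bmff())
    return "CR3 files need ISOBMFF support, and the installed metadata "
           "library was built without it";
  return NULL;
}
```

The point of the sketch is that the app’s source code is identical on every distro, yet its behaviour depends on a decision made by someone else at packaging time. A check like this does not fix anything; it only turns 2 h of head-scratching into an explicit message.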

There are 2 things that piss me off the most here.

The first one is that this seemingly wise mantra is completely stupid, and the people who preach it the most are the least technical ones. Which is telling. Those guys really need a brutal internship at some software maintenance firm, to get their perceptions aligned with reality. Tech sucks. Tech is led by stupid people who don’t necessarily know what they are doing and can easily be tricked into ideologies.

The second one is how acceptable it has become to simply waste CPU cycles all around. Libs and bloated apps are just a symptom of this global trend. I’m sorry, but my old Thinkpad P51 with a brand-new 90 Wh battery runs at best 5 to 6 hours in web use with the most aggressive power-saving settings, while any Intel-based Macbook Pro can last 12 to 14 h under similar use. That’s twice to almost three times as long; we are not talking about a 10 or 15% bonus. How come? Even if computers are not that power-hungry now, 14 h is 2 days of productivity, 6 h is barely one. In what universe is that remotely acceptable?
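
Taking those figures at face value, and assuming the full nominal 90 Wh is actually usable (an idealization), the Thinkpad’s average draw and the runtime ratio work out roughly as:

$$
P_\text{Thinkpad} \approx \frac{90\ \text{Wh}}{6\ \text{h}} = 15\ \text{W}
\quad\text{to}\quad
\frac{90\ \text{Wh}}{5\ \text{h}} = 18\ \text{W},
\qquad
\frac{t_\text{Macbook}}{t_\text{Thinkpad}} \approx \frac{12\ \text{h}}{6\ \text{h}} = 2
\quad\text{to}\quad
\frac{14\ \text{h}}{5\ \text{h}} = 2.8
$$

The Macbook’s battery capacity is not given here, so its average draw cannot be derived the same way.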

Remember, energy efficiency and performance were the whole point of the wheel. And, just the same, they are the whole point of coding your own base layers, instead of relying on disputable libs.

Whenever a lib shits the bed, app devs need to fix things that are out of their control. That may mean reporting bugs to the lib maintainer, but that’s assuming development is still active (for GPhoto2 and Lensfun, mentioned above, it’s not). Then you need to convince the lib maintainer that it’s actually a bug on their side, and not an implementation mistake on yours. Typically, they will ask for a minimal reproducer, which you won’t be able to provide because the bug will affect some line of code in the middle of a part you barely understand, written 10 years ago by someone who is not around anymore. Then you need to gently steer the maintainer toward a solution that benefits you, because you don’t want to rewrite your own interface with their API. If some other dev pops up to push the solution in their own favour, you may be screwed.

And, of course, if the development is stale, then you need to patch the problem as best you can. Maybe force the use of an outdated lib version that will not be around forever. Maybe fork the lib, at least internally. Maybe code a nasty workaround that will add to the technical debt of your own app, because in 2 years, everybody will have forgotten it was there, and in 4 years, bugs will start showing up and someone will spend a week narrowing them down to that ugly fix.

And if all that is done for just a couple of methods inherited from the lib, then the long-term cost of using an external lib will be much higher than simply coding your own minimal one from scratch. Which is what most people don’t understand. There is a world of difference between prototyping code ASAP and writing code meant to be maintained long-term, let alone to run efficiently.
