Abstraction and Dependency: Why Software is Inherently Unstable
Writing software has never been easier due to vast ecosystems and comprehensive abstractions. However, no benefit is without its tradeoff…
The world runs on software, a peculiar medium of technology, bridging abstraction and practicality. Software has become yet another layer of human social infrastructure upon which our world runs. Our news comes from software, we communicate through software, it's in our medical system, soon it may drive our cars. We have adopted this technological medium at the most fundamental levels.
It makes sense, too. Software is cool: it lets us do things that we've never been able to do before. By its very nature, software is “soft”; it's malleable. It can be molded into various shapes and sizes to fit a huge array of use-cases. And because software can be represented simply as a sequence of numbers in the right order, transferring copies between computers is also relatively simple.
But while the utility of our programs is without question, there are hazardous implications of such a reliance. As those of you who write software are likely aware, modern software carries a significant degree of instability. Creating an application involves many layers of abstraction between the developer and the digital operations of a CPU. Ease of use comes at a cost.
Today, we'll be talking about that cost.
Understanding Abstraction
It's worth defining some terms. First, one term that I definitely use too much is “abstraction.” I like to define abstraction as taking some discrete process or structure and extracting from it a reusable pattern. In practice, abstraction becomes quite intuitive, tending to occur when a developer notices the repetition of some logic.
Let's take an example. A common element in user interfaces is a table. A table is a grid with rows and columns. The top row describes the type of data in each column, and each row below represents a specific item, showing the relevant data for each column. For our example, we'll say that we want to display a table of all users in the application, with columns for name, role, etc., as well as some buttons to delete or suspend users if need be.
In practice, this means writing some code. First, get all the users. Once you have the users, display the table, and each user as a row under the table headers, with every column defined for its specific user. Now you have a table of users to place anywhere in your application. This high-level series of steps is the process that I'm talking about.
Now, your boss would like a new page added to your app. Another table. This time, instead of users, we'd like to list all food products–this example app is going to be a grocery store app now, by the way. This table should display each product with some information and buttons to delete or view the specific item. As a wise computer science teacher once told me, “a good developer is smart and lazy.”
So, in the smart and lazy spirit, to build your new table you decide that you'll borrow some logic from the users table you just finished. You go through, copy the code, paste it where it goes on the new page, change some variable names, bing bang boom, and voilà, the food items table is done. But there's a problem. You broke the DRY principle: Don't Repeat Yourself! You have duplicated the logic from another method, and now you have two different versions of the same logic to maintain. Complexity!
For the sake of our example, this is where abstraction comes in. To manage complexity, instead of repeating the logic for a table, we'll abstract it. That is, extract the repeated pattern into a reusable form. For our particular case, the pattern that is repeated is as follows:
Displaying a variable number of header columns, each defining a specific data point on an entity
Displaying a list of entities containing data points for each aforementioned column
It's worth noting that there is also variability baked into this pattern. This is where the “function” comes in. A function is just a way to repeat some logic for a variable set of parameters. So in our case, we'd make a createTable function, or something of the sort, and any time that we want a table, we call createTable with the data that we'd like displayed.
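To make that concrete, here is a minimal sketch of what such a function might look like. Everything in it is an assumption for illustration: the Column shape, the User and Product types, and the plain-text output are all hypothetical, and the delete and suspend buttons are left out to keep it short. A real app would render to the screen rather than build a string.

```typescript
// Hypothetical column definition: the header text, plus how to read
// that column's value from one entity.
interface Column<T> {
  header: string;
  value: (item: T) => string;
}

// Builds a plain-text table: one header row, then one row per entity.
function createTable<T>(columns: Column<T>[], items: T[]): string {
  const headerRow = columns.map((c) => c.header).join(" | ");
  const bodyRows = items.map((item) =>
    columns.map((c) => c.value(item)).join(" | ")
  );
  return [headerRow, ...bodyRows].join("\n");
}

// Hypothetical entities for the two tables in the example.
interface User { name: string; role: string; }
interface Product { name: string; priceCents: number; }

const users: User[] = [
  { name: "Ada", role: "admin" },
  { name: "Grace", role: "member" },
];
const products: Product[] = [{ name: "Apples", priceCents: 199 }];

// The table logic is written once; only the parts that change are passed in.
const userTable = createTable<User>(
  [
    { header: "Name", value: (u) => u.name },
    { header: "Role", value: (u) => u.role },
  ],
  users
);

const productTable = createTable<Product>(
  [
    { header: "Product", value: (p) => p.name },
    { header: "Price", value: (p) => `$${(p.priceCents / 100).toFixed(2)}` },
  ],
  products
);

console.log(userTable);
console.log(productTable);
```

The users table and the products table now share one implementation, so a fix to the table logic only has to happen in one place.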
For the non-programmers among us, that may sound a bit confusing. However, you can essentially think of this like a recipe for a cake, where you get to pick what kind of icing you want. Once we have this function, instead of defining specific instructions for each instance of some logic, we only have to define it once. We only have to define specifics for the parts that change. In doing this, we've moved to a higher level of abstraction.
Fostering Dependency
In a single codebase, abstraction often looks like a more complicated version of what I described above. However, duplication and reusable patterns are not only found within discrete codebases. While functions may abstract repeated logic from a single application, libraries abstract repeated logic from across applications. A library is a set of utilities and patterns for building entire sets of applications.
I won't bore you with another drawn out example, but a library is like a toolbox. A carpenter does not create new tools for each new piece of furniture he creates, but rather keeps a set of common tools handy for every project. In software, libraries are developed as patterns are found between multiple projects. Oftentimes these libraries start at an organizational level, but move out to the wider industry.
React.js is a library (or framework, depending on who you ask), which provides a way to build user-interfaces intuitively. Beginning as a Facebook project, React has become the most popular set of tools for building user interfaces. This is because the problems that React solves are very common, and the solutions it provides are sound.
These libraries can then be used in your application with relative ease, largely due to their open source nature. React, for example, is available in its entirety on GitHub. Simply download the library to your codebase and import its functionality. Super easy, and super helpful. However, with such a convenience comes an inconvenient truth. Your project doesn't just use this library's functionality, it depends on it.
Leaky abstraction
If your project depends on another project to function, this is called a dependency. A dependency is a relationship between two projects where one uses the functionality of the other, and therefore relies on it. With each imported function, your project becomes further attached to the dependency. A “leaky abstraction” is one where the abstraction itself determines the shape of the dependent, not just its own.
You can see how this might become a problem. The aforementioned React library is a common example of a leaky abstraction. This is because in order to use React to its full extent, you must use it how React wants you to use it. To go back to our cake recipe example from above, this is where the recipe wants you to use a specific set of tools to bake your cake.
Abstractions are always leaky to some extent. This is because every library exposes an interface, or a way of interacting with its underlying functionality. For example, Moment, a popular date time formatting library, has a fairly simple interface. To use the library, it looks something like this:
import moment from 'moment';

function formatDate(d: Date): string {
  // Format a Date using Moment's format tokens
  return moment(d).format('MMMM Do YYYY, h:mm:ss a');
}

console.log(formatDate(new Date())); // e.g., July 2nd 2024, 10:16:18 am
Where formatDate is a function wrapping around Moment's functionality. In this case, we are doing things the way that Moment expects, as we should. However, an abstraction begins to leak as the scope of its effects expands. For Moment, the scope is fairly narrow: tasks involving date formatting. However, for something like React, the scope of its effects is your entire user interface and project structure.
You can get a feel for the leakiness of some abstraction by asking yourself:
“How hard would it be to remove this dependency if I wanted to?”
In our above example, if I wanted to remove Moment, it would be as simple as modifying the formatDate function. I'll spare you the details, because it turns out Moment is actually pretty useful and replicating the exact functionality is tedious. However, you likely don't have to do much to remove a library like Moment from your codebase.
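For the curious, here is a rough sketch of what that swap could look like, using the built-in Intl.DateTimeFormat in place of Moment. It assumes an "en-US" locale and the UTC time zone are acceptable, and it does not try to reproduce Moment's exact output (ordinals like "2nd" are not included, for instance). The point is only that the change stays inside formatDate.

```typescript
// Assumption: "en-US" and UTC are fine for this app. Both are illustrative choices.
const formatter = new Intl.DateTimeFormat("en-US", {
  month: "long",
  day: "numeric",
  year: "numeric",
  hour: "numeric",
  minute: "2-digit",
  second: "2-digit",
  hour12: true,
  timeZone: "UTC",
});

// Same name and signature as the Moment-backed version, so callers do not change.
function formatDate(d: Date): string {
  return formatter.format(d);
}

// Exact punctuation depends on the runtime's locale data, so no output is shown here.
console.log(formatDate(new Date(Date.UTC(2024, 6, 2, 10, 16, 18))));
```

Because the rest of the codebase only ever calls formatDate, nothing else needs to know that the library underneath was replaced.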
React, on the other hand, leaks into every single aspect of your client-side codebase. As a library used to build user interfaces, React is likely going to be used everywhere that you have a user interface. On one hand, this is the inherent utility of a dependency: you outsource the boilerplate and project infrastructure. On the other, this dependency relationship just became a lot more set in stone.
Gangrene and Infection
When you import a library, you import its problems. A third-party codebase is prone to the same issues as any other. Performance issues arise due to poor practices or other inefficiencies. The project can be poorly maintained, or even not maintained at all, and quickly becomes dated relative to the pace of software. In fact, it's more common than you might expect to have dependencies which are abandoned.
When a project reaches its end of life, its authors will normally deprecate the package, meaning it will not be supported any longer. This is often done formally, and migration steps are published by the author pointing to a better alternative. If you rely on a deprecated package, you'll either have to support it yourself, or move to an alternative. Even the courtesy of a formal deprecation isn't guaranteed, and oftentimes packages are abandoned with no notice to their dependents.
An outdated package is a problem. Besides performance issues or compatibility failings, outdated projects can become proxies for malicious intentions. A security vulnerability is a weakness, flaw, or error in a security system that could be exploited by a threat agent to compromise a network. These vulnerabilities are discovered by security researchers and other entities over time, who then publish their findings for developers to patch.
An outdated project is a project which has not been patched. Security vulnerabilities don't just go away. They persist, and without maintenance and updates, virtually any project could eventually become vulnerable.
I'd like to give you an idea of this problem's scope. NPM, or Node Package Manager, is used to manage dependencies for JavaScript development. NPM's main package registry lists more than 3 million packages, all available for instant download. When you download an NPM package, on average, that package has 4.39 additional dependencies.
Therefore, your number of dependencies is actually much higher than the number of packages that you've installed. For each dependency you download, an average of 4.39 others come with it. If you install 10 packages, roughly 44 more come along, putting your total closer to 54. Researchers have recently estimated that users of NPM download deprecated packages approximately 2.1 billion times per week.
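To see how the hidden count grows, here is a small sketch that walks a dependency graph and counts everything reachable from your app. The graph and every package name in it are made up for illustration; a real project would read this information from its lockfile instead.

```typescript
// Hypothetical dependency graph: package name -> packages it depends on directly.
type DependencyGraph = Record<string, string[]>;

const graph: DependencyGraph = {
  "my-app": ["ui-kit", "date-utils"],
  "ui-kit": ["color-helpers", "string-pad"],
  "date-utils": ["string-pad"],
  "color-helpers": [],
  "string-pad": [],
};

// Collects every package reachable from `root`, excluding `root` itself.
function collectDependencies(graph: DependencyGraph, root: string): Set<string> {
  const seen = new Set<string>();
  const stack = [...(graph[root] ?? [])];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    // Skip the root and anything already visited, so shared packages count once.
    if (name === root || seen.has(name)) continue;
    seen.add(name);
    stack.push(...(graph[name] ?? []));
  }
  return seen;
}

const direct = graph["my-app"].length;
const total = collectDependencies(graph, "my-app").size;
console.log(`direct: ${direct}, total: ${total}`); // direct: 2, total: 4
```

Even in this tiny example, installing two packages means depending on four, and each of those four is a codebase someone has to keep patched.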
This problem doesn't even account for intentional vulnerabilities and backdoors baked into open-source projects. However, it does account for a large percentage of the security issues faced by software companies today. This is the software that you're using right now to read this post.
A dependency is like a limb: if it starts to die, either you cut it off or it makes you sick.
Takeaways
Things aren't all bad. However, dependencies are often taken for granted, when their tradeoffs should be accounted for. Managing complexity is a difficult task. Productivity is rewarded, and best practices often are not. Dependencies are a simple way to offload repeatable logic to third parties.
There are ways to use dependencies wisely. Your level of thoroughness should correspond with the level of effect you expect your dependency to have. For example, if you're going to use a library that will be applied to your entire codebase, you really want to make sure that's a good idea. A smaller dependency can still be problematic, so don't write it off.
Issues with dependencies, such as deprecation, compatibility, and security problems, are all manageable with some better habits. An ounce of prevention is worth a pound of cure.
Author's Note
You may have noticed that this post is out late. I was moving this weekend, and without the time to ensure quality, I chose to delay. This won't be a normal occurrence, and next Monday's post can be expected on time. Still, I would like to formally apologize for my tardiness.
Credits
Thumbnail
Bilal Azhar - https://substack.com/@intelligenceimaginarium
Music
Track: Marshmallow by Lukrembo, Source: https://freetouse.com/music, Copyright Free Background Music