A Philosophy of Software Design

Why did I pick up this book?

I first came across this book in a YouTube video by Matt Pocock. His point was that even though software development as a whole has changed with the rise of AI agents, loops, orchestrators, harnesses, and whatever else, the core software principles and the underlying philosophy on which most of these systems are built remain the same.

What is the author's big claim?

Despite our best efforts, complexity will still increase over time unless each individual action is done carefully. Each task we assign to AI agents should have built-in controls and strong guardrails. When we ask an agent to finish a job, always verify the result with a red/green TDD approach. Also try to write as little code as possible (using ponytail, for example). If these actions are not tested properly, maintenance gets harder over time, and we end up with code that is unusable and hard to change.

With smarter models available, we may be tempted to fall into a waterfall-ish approach: give the agent a broad direction and let it take over. Yet every action has a thousand micro-decisions embedded in it, and we lose control of those.

The extreme of this approach is called the waterfall model, in which a project is divided into discrete phases such as requirements definition, design, coding, testing, and maintenance. In the waterfall model, each phase completes before the next phase starts; in many cases different people are responsible for each phase. The entire system is designed at once, during the design phase. The design is frozen at the end of this phase, and the role of the subsequent phases is to flesh out and implement that design.

Complexity can appear in various ways. It can come from how tightly intertwined the modules and systems are, so that it becomes difficult to isolate the change we want to make. In a complex system, it can take a lot of work to make even small improvements.

Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system. Complexity can take many forms. For example, it might be hard to understand how a piece of code works; it might take a lot of effort to implement a small improvement, or it might not be clear which parts of the system must be modified to make the improvement; it might be difficult to fix one bug without introducing another. If a software system is hard to understand and modify, then it is complicated; if it is easy to understand and modify, then it is simple. You can also think of complexity in terms of cost and benefit. In a complex system, it takes a lot of work to implement even small improvements. In a simple system, larger improvements can be implemented with less effort.

A seemingly small change shouldn’t require code modifications in many different places. If we ever find ourselves in that situation, we can safely say the codebase has become quite complex.

Change amplification: The first symptom of complexity is that a seemingly simple change requires code modifications in many different places. For example, consider a Web site containing several pages, each of which displays a banner with a background color. In many early Web sites, the color was specified explicitly on each page, as shown in Figure 2.1(a). In order to change the background for such a Web site, a developer might have to modify every existing page by hand; this would be nearly impossible for a large site with thousands of pages.
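To make change amplification concrete for myself, here is a minimal Python sketch. The page functions, the banner markup, and the color value are all made up for illustration; they are not from the book. Version (a) repeats the color on every page, while version (b) defines it once so that a color change is a single edit.

```python
# Hypothetical page renderers; names, markup, and color are illustrative only.

# (a) The color is written out on every page: changing it means editing each function.
def home_page_explicit():
    return '<div class="banner" style="background:#336699">Home</div>'


def about_page_explicit():
    return '<div class="banner" style="background:#336699">About</div>'


# (b) The color is defined once: changing it is a one-line edit.
BANNER_COLOR = "#336699"


def banner(title):
    """Render a banner using the single shared color."""
    return f'<div class="banner" style="background:{BANNER_COLOR}">{title}</div>'


def home_page():
    return banner("Home")


def about_page():
    return banner("About")
```

Both versions produce the same HTML today. The difference only shows up on the day the color has to change.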

The author describes a C function that allocates memory, returns a pointer to it, and assumes the caller will free that memory once the operation is done. That need not happen: the caller might never free the memory, which leads to a memory leak.

“Known unknown” issues are still relatively easy to diagnose, because you at least know what to look for. “Unknown unknown” issues are much trickier to solve.

The C memory leak is an example of tactical programming stretched to the max, without considering the larger strategy behind how and why the codebase is built. This is the archetype of a “tactical tornado”.

Almost every software development organization has at least one developer who takes tactical programming to the extreme: a tactical tornado. The tactical tornado is a prolific programmer who pumps out code far faster than others but works in a totally tactical fashion. When it comes to implementing a quick feature, nobody gets it done faster than the tactical tornado. In some organizations, management treats tactical tornadoes as heroes. However, tactical tornadoes leave behind a wake of destruction. They are rarely considered heroes by the engineers who must work with their code in the future. Typically, other engineers must clean up the messes left behind by the tactical tornado, which makes it appear that those engineers (who are the real heroes) are making slower progress than the tactical tornado.

This was also one of the core reasons I was kept from touching the codebase as a non-technical person, even though I was confident that I could ship improvements. The larger team of developers hesitated because I might turn out to be a tactical tornado, pushing a horde of PRs without clearly explained rationales. Strategic programming is required here, and that takes an investment mindset with the longer-term view in mind. It’s hard for non-technical folks to develop this viewpoint quickly, as it requires spending hundreds or thousands of hours with the codebase and understanding how its systems connect to each other.

Conversely, if you program tactically, you will finish your first projects 10–20% faster, but over time your development speed will slow as complexity accumulates. It won’t be long before you’re programming at least 10–20% slower. You will quickly give back all of the time you saved at the beginning, and for the rest of system’s lifetime you will be developing more slowly than if you had taken the strategic approach. If you haven’t ever worked in a badly degraded code base, talk to someone who has; they will tell you that poor code quality slows development by at least 20%.

In fact, Facebook suffered so much from this that it changed its original motto from “Move fast and break things” to “Move fast with solid infrastructure”.

Facebook has been spectacularly successful as a company, but its code base suffered because of the company’s tactical approach; much of the code was unstable and hard to understand, with few comments or tests, and painful to work with. Over time the company realized that its culture was unsustainable. Eventually, Facebook changed its motto to “Move fast with solid infrastructure” to encourage its engineers to invest more in good design.

Modules should be deep

A module can be either shallow or deep. Each module has two parts: the interface and the implementation. Shallow modules expose a lot of complexity in the interface and don’t abstract it away into the implementation. Deep modules do the opposite: they hide as much as possible within the implementation and place little on the interface.

Ousterhout advocates deep modules over shallow modules. Why?

Say, for example, that the linked list abstraction is about as complex as its implementation. What’s the point of having such a shallow module? It’s not that useful if it doesn’t hide the implementation and protect us from further complexity.

From the standpoint of managing complexity, this method makes things worse, not better. The method offers no abstraction, since all of its functionality is visible through its interface. For example, callers probably need to know that the attribute will be stored in the data variable. It is no simpler to think about the interface than to think about the full implementation. If the method is documented properly, the documentation will be longer than the method’s code. It even takes more keystrokes to invoke the method than it would take for a caller to manipulate the data variable directly.
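Here is a rough Python sketch of the contrast, as I understand it. The class and function names are my own hypothetical stand-ins, not the book's code. The first method is shallow in the sense described above: its interface tells you everything its body does. The second function is a deeper counterpart: one small call hides normalization, tokenizing, counting, and ranking.

```python
import re
from collections import Counter


class AttributeStore:
    """Hypothetical holder for a data mapping (illustrative only)."""

    def __init__(self):
        self.data = {}

    # Shallow: the name and signature already say everything the body does,
    # and callers still need to know the value ends up in self.data.
    def add_null_value_for_attribute(self, attribute):
        self.data[attribute] = None


# Deeper: a small interface that hides lowercasing, tokenizing,
# counting, and ranking behind one call.
def top_words(text, n):
    """Return the n most frequent words in text as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)
```

Calling `store.add_null_value_for_attribute("x")` saves nothing over writing `store.data["x"] = None` directly, whereas `top_words(text, 3)` spares the caller several decisions they never have to think about.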

The other school of thought holds that modules should be small, for example that functions should not exceed N lines of code. Ousterhout frames it differently and advocates deeper modules.

Unfortunately, the value of deep classes is not widely appreciated today. The conventional wisdom in programming is that classes should be small, not deep. Students are often taught that the most important thing in class design is to break up larger classes into smaller ones. The same advice is often given about methods: “Any method longer than N lines should be divided into multiple methods” (N can be as low as 10). This approach results in large numbers of shallow classes and methods, which add to overall system complexity.

We can also test the idea by arguing the other side: why not deep modules? One answer is information leakage, which is the opposite of information hiding.

Example: imagine a database module exposes this nice interface:
getUser(id)
saveUser(user)
Looks deep and clean. But then callers discover:
getUser(id) is fast only if id is indexed
saveUser(user) silently fails if the user object has nested arrays
getUser(id) returns stale data unless called after refreshCache()
saveUser(user) must not be called inside a transaction from another module
Now the caller has to understand hidden implementation facts: indexes, cache invalidation, object shape limitations, transaction boundaries. The abstraction looked deep, but its internals are leaking into usage decisions.

The subtle point is this: deep modules are not bad because they hide complexity. They are bad when they hide complexity that the caller actually needs to reason about.
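The stale-cache part of that example can be sketched in Python. Everything here is hypothetical: the class names are mine, and a plain dict stands in for the real database. The first store leaks its caching into the caller's workflow. The second keeps cache invalidation inside the module, as long as all writes go through it.

```python
class LeakyUserStore:
    """Callers must remember to call refresh_cache() before get_user()."""

    def __init__(self, backend):
        self._backend = backend  # a plain dict standing in for the database
        self._cache = {}

    def refresh_cache(self):
        self._cache = dict(self._backend)

    def get_user(self, user_id):
        # Returns stale (or missing) data unless refresh_cache() ran first.
        return self._cache.get(user_id)


class SealedUserStore:
    """Same job, but cache invalidation stays inside the module."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def save_user(self, user_id, user):
        self._backend[user_id] = user
        self._cache.pop(user_id, None)  # invalidate the cached copy on write

    def get_user(self, user_id):
        # Fill the cache on demand, so callers never manage it themselves.
        if user_id not in self._cache:
            self._cache[user_id] = self._backend.get(user_id)
        return self._cache[user_id]
```

With the leaky store, a caller who forgets `refresh_cache()` gets `None` back for a user that exists. With the sealed store there is nothing to forget.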

Always build a general-purpose module and push specialisation upwards. It’s much harder to make something specific more generic than the other way round.

Push specialization upwards (and downwards!) Most software systems must inevitably have some code that is specialized. For example, applications provide specific features for their users; these are often highly specialized. Thus it isn’t usually possible to eliminate specialization altogether. However, specialized code should be cleanly separated from general-purpose code. This can be done by pushing the specialized code either up or down in the software stack. One way to separate specialized code is to push it upwards. The top-level classes of an application, which provide specific features, will necessarily be specialized for those features. But this specialization need not percolate down into the lower-level classes that are used to implement the features. We saw this in the editor example earlier in this chapter. The original student implementation leaked specialized user-interface details such as the behavior of the backspace key down into the implementation of the text class. The improved text API pushed all of the specialization upwards into the user interface code, leaving only general-purpose code in the text class.
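The editor example can be sketched roughly like this in Python. This is my own reconstruction under assumptions, not the book's code: the `Text` and `Editor` names, the method signatures, and the half-open range convention are all hypothetical. The text class only knows about inserting and deleting ranges, and the backspace behavior lives in the user-interface layer.

```python
class Text:
    """General-purpose text storage: knows nothing about keys or cursors."""

    def __init__(self, content=""):
        self._content = content

    def insert(self, position, new_text):
        self._content = self._content[:position] + new_text + self._content[position:]

    def delete(self, start, end):
        """Remove the characters in the half-open range [start, end)."""
        self._content = self._content[:start] + self._content[end:]

    def __str__(self):
        return self._content


class Editor:
    """User-interface layer: the specialized key behavior lives here."""

    def __init__(self, text, cursor=0):
        self.text = text
        self.cursor = cursor

    def backspace(self):
        # Delete the character before the cursor, if there is one.
        if self.cursor > 0:
            self.text.delete(self.cursor - 1, self.cursor)
            self.cursor -= 1

    def delete_key(self):
        # Delete the character under the cursor; the cursor stays put.
        self.text.delete(self.cursor, self.cursor + 1)
```

Adding another key binding later means touching only `Editor`; `Text` stays the same.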
