Friday, 7 February 2025
Having recently re-watched “Stop Writing Dead Programs” by Jack Rusher, the argument for text and against plain text had me thinking about a particular pet peeve of mine: the abuse and waste of colour in computer interfaces.
Be it in editor themes or terminal prompts, I have grown to dislike the popular tendency of using excessive colour without semantic significance. Jack’s presentation reminded me of this, as he refers to the work of graphic designers and data visualisers who do try to use visual information to convey meaning.
People are surprised when I make this argument, since when they see me in front of a computer I am either sitting in front of a pale white Emacs window or a beige/yellow terminal window (or my current browser of choice). If they catch me using a shell, I have enjoyed more than one disgusted reaction at my plain
$
prompt.1
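For what it is worth, such a prompt takes a single line of shell configuration. A minimal sketch for bash or POSIX sh (the variable name PS1 is standard; keeping it this bare is merely my habit):

```shell
# A deliberately plain, colourless prompt: no escape codes, no
# semantically empty colour. Typically set in ~/.bashrc or ~/.profile.
PS1='$ '
printf '%s\n' "$PS1"
```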
To make the point that text can be a rich medium of expression, Jack reminds us of the insufficiency of plain text in this respect. By reducing text to a sequence of (finite, predetermined) symbols we certainly have a good, practical heuristic for many use cases that arise in day-to-day computing. And the past half-century has certainly demonstrated that there are many ways to embed more expressive languages inside plain text, though we have not arrived at a consensus (and there is no clear winner when comparing TeX, HTML, …).
I want to argue that the reason we nevertheless take “plain text” to be natural or reliable is twofold. First, the language of Unix, our most influential programming system, has sequences of bytes (or glyphs, when raised to UTF-8) as its structuring, primitive “data type”, and we hear the echo of this fact throughout the system: files, pipes, editors, executable scripts, etc. all fit the mould. This is not surprising; after all, “everything is a file” is a purported strength of this design. But just as you can simulate a tagged file system using a hierarchical file system with links, any complex structure within a file can only be simulated: we cannot ensure its invariants the way the external system ensures its own (each directory has a . and .. entry, no hard links to other directories, etc.).
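That simulation is easy to sketch in shell (the paths below are invented for the example): tags become directories, membership becomes a symlink, and the system enforces nothing about either.

```shell
# Tags simulated on top of a hierarchical file system: a tag is a
# directory, and tagging a file means linking to it.
mkdir -p /tmp/tagdemo/docs /tmp/tagdemo/tags/unix
echo 'notes on pipes' > /tmp/tagdemo/docs/pipes.txt
ln -sf ../../docs/pipes.txt /tmp/tagdemo/tags/unix/pipes.txt

# The simulation works: the tag entry resolves to the file.
cat /tmp/tagdemo/tags/unix/pipes.txt

# But the invariant is only simulated: deleting the file silently
# leaves a dangling tag entry, something a real tagged file system
# could rule out by construction.
rm /tmp/tagdemo/docs/pipes.txt
```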
The second point is that plain text finds itself to be the greatest lower bound of all the alternatives. I mean this in the sense that you can always lose information by “dropping down” to plain text, which in turn can encode the richer format while remaining human-usable.
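A toy illustration of this “dropping down” (the file name and key:value convention are invented for the example): a richer record flattened into plain text remains usable by generic text tools, even though its structure is now merely conventional, encoded in the bytes rather than guaranteed by the system.

```shell
# A structured record "dropped down" to plain text.  Nothing enforces
# the key:value convention; it only exists by agreement.
printf 'title: Plain text\ntags: unix, emacs\n' > /tmp/record.txt

# Generic plain-text tooling still works on it:
grep '^title:' /tmp/record.txt | cut -d' ' -f2-
```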
I don’t know if there is a name for this point of being just expressive enough to represent more complex systems, without having to formally encode their properties. Unix and the WWW are both instances of this phenomenon, and both have enjoyed success because of it. This flexibility has allowed them to adapt to changing and unforeseen needs.
The price these systems pay is that they tend to grow unnatural and uncomfortable extensions that betray whatever initial, simple elegance some might like to preserve.
Though one has to be careful not to conflate this with a matter of simplicity vs. complexity. We know from programming languages that simple languages are often insufficient and breed diverging, mutually incompatible extensions (not for any good reason, but simply because programming languages don’t tend to encourage canonical, obviously correct solutions to design problems, or fail to seamlessly translate between equally fit ones), while complex languages still end up lacking functionality one might want, or, worse yet, accumulate issues and then struggle to improve on them due to backwards-compatibility concerns.
It therefore seems to me that the issue is not one of being small or large, simple or complex, but of being expressive: capable of expressing unforeseen constraints, yet also capable of expressing knowledge intuitively, while staying as “external” as possible about it. To give another example from programming languages, C++-style OOP languages have traditionally given birth to an “internal” language of design patterns, to express common ideas that the “external” language couldn’t implement. Yet the existence of these patterns was indicative of a recurring need, an intended path that required a little ritual each time it was put into practice (just like having to jump over a little fence each time you want to tread an actually intended path).
I find it difficult to argue that this is objectively or necessarily a bad thing. While sketches of arguments easily come to mind, they can just as easily be relativised.
The next question I have is whether a system expressive enough to satisfy me can persist, even if we grant its existence.
My fear is that this is not the case, and worse still, that there is a legitimate case that over time systems “internalise” information. In other words, the proportion of constructs encoded within a system to those outside of it tends to increase, especially when lacking (orthodox) oversight2.
Now consider a hypothetical programming system that is expressive and has rich semantic capabilities, and compare it to a very internalising system like Unix: few primitives, no straightforward way to add something as primitive as directories or environment variables, yet capable of encoding various little languages and conventions that don’t have to be coherent with one another but can be composed with sufficient will (e.g. Makefiles that update themselves using sed), at the cost of an ever-increasing difficulty in maintaining and extending the system, as the programmer has to be aware of all the implicit connections and intricacies that are not encoded into the system, at best mentioned in a semantically opaque comment. My hypothesis is that all systems tend towards a Unix-like fate as more and more inadequacies arise. It is here that “complexity” comes back into play: as over time all we can rely on staying around is the skeleton of (ironically) external rules and expectations, it becomes a simple question of usability whether one wants to struggle with a misappropriated user interface, or whether the Unix-like idea of dealing with the lowest common denominator of just plain text is simply easier.
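The self-rewriting Makefile is a nice specimen of such composition by convention, so here is a sketch (file and variable names invented; in a real Makefile the sed call would live inside a rule, here it is run directly):

```shell
# A Makefile that a rule could rewrite in place with sed, e.g. to bump
# a version number.  The connection between the sed script and the
# Makefile's own syntax exists only in the programmer's head; make
# itself knows nothing about it.
printf 'VERSION = 1.0\nall:\n\techo $(VERSION)\n' > /tmp/Makefile.demo
sed -i.bak 's/^VERSION = .*/VERSION = 2.0/' /tmp/Makefile.demo
grep '^VERSION' /tmp/Makefile.demo
```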
This point of view doesn’t defend Unix’s notion of plain text as something inherently noble or inspired. The success it enjoys is a consequence of our collective failure to find an alternative that is more expressive without paying the inadvertent interest of complexity.
I now come to my last question, which I will leave open as I don’t want to commit to any specific answer at this point: what are the properties of computerised text systems whose complexity we can excuse, without having to fear their becoming inadequate? Hypertext? Bi-directional hypertext? Colour? Programmable text? Semantic annotations of intention?
Fun fact: I wrote this text over a few weeks, on my phone (Termux and some vi) while on public transport.
If I seem not to have a coherent point, or to ramble away on some tangent and then jump to an entirely different point, it is because I probably forgot my previous train of thought and had to extrapolate from the text I had written up until then. Setting aside a spell checker, the entire text was written down as is, without changes. Nice experiment; not sure I want to do it again.
But this is really just a consequence of my preference for Emacs’ shell interface, which, similarly to Plan 9’s “terminal”, allows me to interact with a shell session like a “two-dimensional” file. Inside Emacs this means that I can highlight text matching some regular expressions with specific colours, which can conflict with the colours that CLI tools inject into their output.↩︎
The only instances of such “orthodoxy” that I can think of are well planned-out software projects, where maintainers make an effort to ensure the system doesn’t sell out to hacks and ad-hoc patches.↩︎