Building Systems That Will Fail

This 1991 paper discusses the author's experience building two early time-sharing systems and the lessons he drew about managing the reliability of such large software projects.

To be honest, I paid less attention to the reliability parts of the paper than I should have. What really grabbed my attention was how the author's work seemed to have inspired many features common to operating systems today. For example:

Multi-tasking:

The supervisor program, which was always in main memory, would commutate among the user programs, running each in turn for a brief interval with the help of an interval timer
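To make the idea concrete for myself, here is a minimal sketch of that kind of time slicing as a round-robin simulation. It is not how the original supervisor was implemented; the function name `round_robin`, the `quantum` parameter, and the job names are all made up for illustration, and "ticks" stand in for whatever the interval timer measured.

```python
from collections import deque


def round_robin(jobs, quantum):
    """Simulate a supervisor that runs each job in turn for a short slice.

    jobs: dict mapping a (hypothetical) job name to its remaining work in ticks.
    quantum: ticks a job may run before the simulated interval timer fires.
    Returns the list of (name, ticks_run) slices in execution order.
    """
    queue = deque(jobs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)  # the timer interrupts after `quantum` ticks
        timeline.append((name, ran))
        if remaining > ran:
            # Unfinished jobs go to the back of the line and wait their turn.
            queue.append((name, remaining - ran))
    return timeline


schedule = round_robin({"alice": 5, "bob": 2, "carol": 3}, quantum=2)
print(schedule)
```

Each user gets a short turn, so everyone sees steady progress even though only one program runs at a time.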

Swap files / virtual memory:

The key difficulty was that main memory was in short supply and not all the programs of the active users could remain in memory at once. Thus the supervisor program not only had to move programs to and from the disk storage unit, but it also had to act as an intermediary for all I/O initiated by user programs.
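A rough sketch of the swapping part, under my own assumptions: memory holds a fixed number of whole programs, and the supervisor evicts the least recently run one when it needs room. The paper excerpt does not say which eviction policy was used, so the least-recently-run choice, the class name `SwappingSupervisor`, and its methods are all hypothetical. The sketch also ignores the I/O intermediary role mentioned in the quote.

```python
from collections import OrderedDict


class SwappingSupervisor:
    """Toy model: keep at most `memory_slots` programs resident, the rest on disk."""

    def __init__(self, memory_slots):
        self.memory_slots = memory_slots
        self.memory = OrderedDict()  # name -> program image, least recently run first
        self.disk = {}               # name -> program image
        self.swap_log = []           # record of ("in" | "out", name) transfers

    def admit(self, name, image):
        # New programs start out on disk.
        self.disk[name] = image

    def run(self, name):
        if name in self.memory:
            self.memory.move_to_end(name)  # mark as most recently run
            return self.memory[name]
        if len(self.memory) >= self.memory_slots:
            # Assumed policy: swap out the least recently run program.
            victim, image = self.memory.popitem(last=False)
            self.disk[victim] = image
            self.swap_log.append(("out", victim))
        self.memory[name] = self.disk.pop(name)
        self.swap_log.append(("in", name))
        return self.memory[name]


sup = SwappingSupervisor(memory_slots=2)
for user in ("alice", "bob", "carol"):
    sup.admit(user, f"{user}'s program")
for user in ("alice", "bob", "carol", "alice"):
    sup.run(user)
print(sup.swap_log)
```

With three users and room for only two programs, every switch to a non-resident program costs a swap out and a swap in, which is exactly the pressure the quote describes.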

Isolating memory for each process:

As a further complication, the supervisor program had to prevent user programs from trampling over one another. To do this required special hardware modifications to the processor such that there were memory bound registers that could only be set by the supervisor
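Here is a small simulation of how I picture bounds registers working. It is only a sketch: the `Machine` class, the `supervisor_mode` flag, and the half-open `[lower, upper)` range are my assumptions, and real hardware enforces the mode switch itself rather than trusting a Python attribute.

```python
class MemoryBoundsError(Exception):
    """Raised when a user program touches memory outside its bounds."""


class Machine:
    def __init__(self, size):
        self.memory = [0] * size
        self.lower = 0             # bounds registers: user access allowed
        self.upper = size          # only in the range [lower, upper)
        self.supervisor_mode = True

    def set_bounds(self, lower, upper):
        # Only the supervisor may change the bounds registers.
        if not self.supervisor_mode:
            raise PermissionError("only the supervisor may set the bounds registers")
        self.lower, self.upper = lower, upper

    def _check(self, address):
        if not self.supervisor_mode and not (self.lower <= address < self.upper):
            raise MemoryBoundsError(f"address {address} is outside [{self.lower}, {self.upper})")

    def store(self, address, value):
        self._check(address)
        self.memory[address] = value

    def load(self, address):
        self._check(address)
        return self.memory[address]


machine = Machine(size=16)
machine.set_bounds(4, 8)         # supervisor gives the user program cells 4..7
machine.supervisor_mode = False  # hand control to the user program

machine.store(5, 42)             # inside the bounds: allowed
try:
    machine.store(9, 1)          # outside the bounds: blocked
    trampled = True
except MemoryBoundsError:
    trampled = False
print(trampled, machine.memory[5], machine.memory[9])
```

The important part is the asymmetry: the user program can read and write freely inside its own range, but it cannot widen that range, because changing the registers is a supervisor-only operation.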

Also, this interesting anecdote about how these projects inspired Unix stood out to me:

The Unix system was a reaction to Multics. Even the name was a joke. Ken Thompson was part of the Bell Laboratories' Multics effort, and, frustrated with the attempts to bring a large system development under control, decided to start over. His strategy was clear: start small and build up the ideas one by one as he saw how to implement them well. As we all know, Unix has evolved and become immensely successful as the system of choice for workstations.

Ken Thompson explicitly set out to build Unix simply and incrementally after his unpleasant experience on a complicated mega-project.