A lot has been written about how complex
computers and electronics have become. I
think a lot of this complexity derives from the conflict between a particular
engineering principle and a psychological phenomenon. As far as I know no one has explicitly discussed
this phenomenon, so I’ll do it here:
Pete’s Rule:
Every layer of indirection is exponentially mind-boggling.
I think this rule is at
least partly responsible for the vastly diminishing returns in productivity
that new software technology brings to users and programmers. At the least, indirection severely inhibits
casual new users who are trying to join the online community; and if designers
don’t curtail their use of it, indirection could force baby-boomers to give up
their cars in a decade or so.
When you press a button that is labeled with what you
are trying to do, like print a document, you may find that pressing it reveals
another layer of choices, one of which actually prints the document. This style of “drill-down” menu is one layer
of indirection.
Or
consider a much more elaborate example: the Domain Name System (DNS). In the 70’s and early 80’s, computers on the
internet were known by IP address numbers.
There was a file called hosts.txt that gave the IP addresses of
many of the more important computers by name.
If you ran a computer you could get a copy of the global hosts.txt file
and even add more names to your copy of it for your own private use. This was a new level of indirection: instead
of using IP addresses, you could use names, which some programs understood as
needing to be looked up in the file and converted to addresses. In the 80’s this system was replaced with the
hierarchical Domain Name System. The old
central administration, which used to maintain the global hosts.txt address
file, instead started maintaining a global list of addresses of name
servers. The name servers in turn stored
localized versions of something like the old hosts.txt file to map from name to
IP address. To get at a particular
address book which might contain the name of a computer you wanted to reach,
you had to give the name of the domain, like IBM, after the name of the
computer, like ‘research1’. The domains
themselves were grouped according to purpose, such as commercial, educational,
government, etc.
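The difference between the two lookup schemes can be sketched in a few lines of Python. This is a toy model only: the host names and the 192.0.2.x addresses are invented for illustration, and real DNS involves name servers, caching, and a network protocol that are all ignored here.

```python
# Toy model only: every name and address below is invented for illustration.

# One layer of indirection: a flat hosts.txt-style table, name -> address.
HOSTS = {"research1": "192.0.2.10"}

def lookup_flat(name):
    return HOSTS[name]

# Hierarchical scheme: the central list no longer maps names to addresses.
# It maps each top-level domain to its domains, and each domain keeps its
# own localized name -> address table.
ROOT = {
    "com": {"ibm": {"research1": "192.0.2.10"}},
    "edu": {"mit": {"ai": "192.0.2.20"}},
}

def lookup_hierarchical(full_name):
    # Names are written most-specific first, so walk the labels right to left.
    table = ROOT
    hops = 0
    for label in reversed(full_name.split(".")):
        table = table[label]
        hops += 1
    return table, hops

address, hops = lookup_hierarchical("research1.ibm.com")
print(address, hops)
```

The flat table answers in one step. The hierarchical one takes a step per label, which is why the name has to be read from right to left to be resolved.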
It’s
hard to remember how mind-boggling this was.
People wondered why the name of the computer came first, and then the
name of the company, and finally the name of the abstract high-level domain. Shouldn’t it be the other way around? And how could companies and schools be counted on to keep their address books in good
order? Wouldn’t the whole system fill up
with garbage once it was out of the hands of the central organization?
Instead,
the whole system has worked amazingly well.
However, it is far more complex for computers to look up names using DNS
instead of a local file. And now another
layer has been added: instead of configuring your computer to know the address
of a reliable DNS name server, your computer likely gets the name server’s
address from a DHCP server.
I’m
not saying that DNS or DHCP are bad. I’m
just pointing out that they are mind-boggling, and that it took quite a
while for those of us who lived through the change to adapt to it, including
rewriting a whole lot of software and changing how IT organizations ran. Normally, you no longer see this
discomfort, but if you work in a corporate or consumer software support organization,
you know that troubleshooting problems with DNS is beyond the average
user. In this case, the value provided
by DNS is worth the complexity to the software developer and end user because
it enabled the explosion of internet connectivity.
In
any case, DNS is not long for the world, in its current visibly hierarchical
form. It’s becoming less necessary for
anybody to know about domain names and hierarchies, especially obscure details
like ‘.com’ vs. ‘.net’. Sometime not too
long from now, a better method will be developed for finding web sites and
sending emails. Already I no longer
remember any DNS names because I just use Google
to go straight to the website I want based on what is at the site, not based on
its name. And for email, I use the
auto-generated address book that accumulates in my mailer as people send mail
to me. I don’t know the email addresses
for the majority of people I correspond with.
Pretty soon, DNS names will be a hidden layer of the internet, the
way DNS hid IP addresses in the 90’s.
Where was I, before that
digression on the evolution of addresses, names, DNS, and the sub-digression on
the future of naming vs. searching?
Let’s get back to the thesis: levels of indirection are exponentially
mind-boggling. To take a diversion while
you are trying to reach somewhere else is a type of indirection. To navigate it mentally you need to take note
of where you are, tuck it away, deal with the diversion, and then return to
where you were and continue on. In
computer science this is a fundamental operation called pushing and then
popping the stack. It is natural for
humans and computers to do this. However,
computers are WAY better at it. If a
conversation with someone goes off the track by one or two topics you might
never get back on track, and if you go three levels deep and then get all the
way back your brain will probably react by laughing when it notices what
happened. That’s its way of saying that
you’re reaching its limit.
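For readers who have not met the term, pushing and popping can be sketched as follows. This is a toy model of a conversation, not a description of how any real program or brain works, and the function names are made up for illustration.

```python
# Toy model of a conversation that keeps digressing.
stack = []            # topics that have been set aside and must be returned to
current = "thesis"

def digress(new_topic):
    global current
    stack.append(current)    # push: remember where we were
    current = new_topic

def resume():
    global current
    current = stack.pop()    # pop: go back to the most recently set-aside topic

digress("history of DNS")
digress("future of naming vs. searching")
depth = len(stack)           # how many topics are waiting for us
resume()
resume()
print(depth, current)
```

The computer does this without effort at any depth. The human version of `stack` is short-term memory, which is the scarce resource in everything that follows.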
A
computer has no inherent limit to how deep it can go in dealing with
sub-tasks. Programmers, who are usually
better at breaking down subtasks than most people, are comfortable when
computers do it too. The problem is that
this has a high cost in terms of mental complexity for the end user, because we
have to push and pop our mental stacks to keep track of what is going on as we
navigate through layers of indirection. The software development community, and
increasingly the entire design community, is not paying enough attention to
this cost because they personally are comfortable coping with it and because
they see how well it solves so many of their problems.
How does indirection arise
in software? Indirection lets a designer
get around a limitation without utterly redoing the original solution, by
adding another layer on top of the original solution. This might make the system able to handle
more operations, or work in new situations that weren’t envisioned earlier. For example, early file systems kept all the
files organized by name, like a big phonebook.
But, like a phone book, the list could get unwieldy, and if the same
name appeared more than once it could be confusing. A solution was to add a layer of
indirection. The master directory would contain not the names of files but the names of other
directories. This would let you organize
the many filenames into separate compartments, just like having a cabinet full
of folders, and the folders full of documents.
Some
of the greatest computer scientists ever eventually recognized that if you can
put directories under the master directory there is no reason why you couldn’t
put other directories inside of those directories as well. To anyone reading this article, this idea is
obvious, but it is usually acknowledged that Multics
in 1969 was the first OS to use the concept.
To me this is proof by example of Pete’s
rule: the concept of filename indirection was so mind-boggling that
hundreds of software geniuses missed it for years. And if you still think that hierarchical file
systems are obvious and inevitable, try to walk a true novice computer user
through the Microsoft Office “Save as” dialog over the phone. The program will help you by proposing an
obscure location to start. You can then
try to explain how to “navigate up and down” through folders and disks. Or, you can just accept the default, which
puts all the documents together into a single directory, like the first
operating systems.[pnp1]
The trajectory of software
menu development shows how levels of indirection can be great at solving
technical problems but carry a growing cost that is being ignored. Everyone would agree that software menus,
pioneered by Xerox and Apple in the late 70’s and early 80’s, have hugely
empowered the average person to use computers productively. Instead of being forced to learn and remember
the syntax of specific commands and codes, you could look in a menu of commands
and select the one that looked like it would do what you wanted to do. You could begin using a computer productively
with very little orientation.
As
the underlying programs became more powerful, more menus were necessary and
each menu became bigger. Where should
all these menus and items go? One
solution: menus that contain other menus, known as “walking menus”. Once programmers made this jump they took it
to infinity by nesting menus inside of menus of menus. Each menu pops up to the right of the parent
menu until they reach the far right of the screen, and then … the menus start
walking back the other way?
One
problem with this is the dexterity required to control a mouse as you travel
down all these paths, without letting the menus disappear, especially when you
make the wrong selection somewhere in the process and want to back up partially
without starting all over.
The
real problem, however, is that the underlying complexity of the program is not
overcome by placing the operations into hierarchical menus. A user who wants to change a particular
setting in Microsoft Outlook now has to search all the menus not for the
setting but for a sub-menu that might contain another menu or set of tabbed
panels for controlling the setting. The
search space literally grows exponentially as the depth of the menus and tabbed
panels increases, and the brain is correspondingly unable to cope with it. The underlying exponential growth in
alternatives introduced by each layer is what gives rise to Pete’s rule.
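The arithmetic behind that growth is easy to work through. The sketch below assumes, purely for illustration, that every menu has the same number of entries (eight) and that the tree is perfectly regular; real programs are messier, but the shape of the curve is the same.

```python
def settings_reachable(items_per_menu, depth):
    # Each extra layer multiplies the number of places a setting could hide.
    return items_per_menu ** depth

def menus_to_open(items_per_menu, depth):
    # Worst case for an exhaustive hunt: open every menu at every level.
    return sum(items_per_menu ** level for level in range(depth))

for depth in range(1, 5):
    print(depth, settings_reachable(8, depth), menus_to_open(8, depth))
```

With one level there are 8 places to look and a single menu to open. With three levels there are 512 places and up to 73 menus to open along the way.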
What
happens is that feature-creep in programs like Word and Outlook outstrips our
ability not only to understand or intuit the behavior of the features but even
to keep track of them in a depth-first hunt through the trees of settings and
panels. The problem is that our brains
can no longer confront such complexity directly by simply looking at all the
options at once, but must rely on very limited short-term memory to hunt
through an exponential, non-linear forest of options. Or, we can train ourselves to remember all
the relevant options, but this is a costly long-term option. There seems to be almost no industry
competition based on reducing the underlying complexity itself, however.
Instead,
features are creeping into other electronic appliances, not just PC software,
to the point that the designers are moving to menus and other levels of
indirection like shift-keys and mode-toggle switches. My universal remote control has a 5-way
switch to control different devices using the same buttons, plus a blue shift
button to activate alternate meanings of the buttons for a given appliance.
Cell-phones
have already blossomed with menus and soft-keys that make it harder to use
basic features like the phone’s directory.
But what I dread most is the introduction of menus to car electronics. The temptation is strong, because there just
isn’t a lot of room for more direct-action buttons and knobs, and the features
keep exploding: climate control, cruise-control, ride-control, seat position,
radio, phone, navigation, security, and on and on. The electrical engineers and designers are
finding that once the dash is covered in buttons and displays, each button,
knob, and display must do double and triple-duty at least, and menus are the
best-known way to do this.
With
short-term memory already in heavy demand just to drive a car, how deep can our
brains go into these menus while driving?
Probably not more than one level deep. I know that this sounds like an old man
complaining about how hectic modern life is.
I’m sure that children growing up now will master the complexity, even
if it means that driving in the future will be like playing an intense video
game and we old timers will have to stay on the side roads if we venture out at
all. But I also think that there is
opportunity for the electronics and auto industries to compete by developing
simpler systems and especially by avoiding levels of indirection in user
interfaces of devices that already demand as much attention to operate as we
can give.
The
software community, which is so quick to embrace indirection, is finding it to
be increasingly irritating. Looking at a
large program, programmers ask themselves: when is this forsaken code going to
actually do something? When you trace
down the calls from a function like write() in C or,
worse, Java, it seems like the data never gets written to disk. At the lowest level, if you can find it, you
will see some layers that represent relatively concrete things like the disk
hardware registers and the file system.
Those might be hidden under representations of the operating system and,
on top of that, things like text streams, data buffers, and ultimately the
platonic ideal of “output” itself.
It’s
reached the point where new programmers don’t need to spend time learning how
to use a programming language to write a particular algorithm. What they need to learn is how to understand
and extend a vast existing software “framework”. Software frameworks are the current trend in
the continuing process of shrouding the functional nuggets in ever more
elaborate cocoons. Frameworks, APIs,
toolkits, application servers; they all promise to improve productivity by
doing almost all of “the work” for you.
But all this work is mainly just the overhead introduced by the
framework itself.
The
terminology of these frameworks (packages, struts, cocoon, foundation classes,
etc.) suggests things that contain other things or prop them up, not things that
actually do anything themselves. Largely
they perform functions that are crucial to the existence of the framework
itself, not to the operation of the application using it.
A
huge learning curve presented by commercial software environments follows from
the growth of the frameworks.
Productivity is not rocketing up the way it should be when the framework
“does all the work for you.” Instead,
productivity goes down while you master the framework, and then just when you
start to get productive, the framework is revolutionized on you, as has happened, for
example, with server-side Java about five times in the first 8 years of its
existence.
Pete’s rule
could explain why the mental circumscription provided by object-oriented
software classes provides relief, but the open-ended, hidden multiple layers of
indirection from class inheritance feel more like a burden.
Brooks
showed us how the complexity of communications in a team grows quadratically with the size of the team and leads to
diminishing and then negative returns when the team grows into the thousands.
The exponential complexity growth
introduced by adding more layers of indirection will overwhelm an engineer’s
ability to understand a system fully in only a dozen layers or so. And if you are trying to use and learn such a
system, even two layers will use up a good chunk of your short-term capacity.
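To put the two growth rates side by side, here is a small worked comparison. The pairwise-channel count is the standard n(n-1)/2 formula; the model for layers is an assumption made for illustration, namely that every layer offers the same number of choices, and only two at that.

```python
def communication_paths(team_size):
    # Pairwise channels among n people: n * (n - 1) / 2, quadratic growth.
    return team_size * (team_size - 1) // 2

def alternatives(layers, choices_per_layer=2):
    # Assumed model: every layer offers the same number of choices.
    return choices_per_layer ** layers

for n in (2, 6, 12):
    print(n, communication_paths(n), alternatives(n))
```

A team of twelve has 66 pairwise channels. Twelve layers of indirection, even with only two choices at each, give 4096 possible paths through the system.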
There are at least two ways
to avoid boggling the minds of users and programmers with levels of
indirection. Both of these ideas are
being explored but they haven’t made much of an impact yet. The first idea is the harder one: keep it
simple. It’s not impossible. For example, a small business like a dry
cleaner or oil change shop can handle all of its IT needs, including keeping a
good size customer database, with a 70’s vintage DOS-style program that has at
most two layers, implemented with different screen layouts that you switch in
and out of with a button, and a special purpose keyboard template. Why does a retiree with a PC at home need a
hierarchical file system and a professional-style WYSIWYG publishing system in
order to read email from her family and write short letters?
We
could also use technology to try to overcome Pete’s Rule. Just as Google
eliminates the need to know the web address of any particular site, we could
use search to end the need for explicitly naming almost every item in a
computer. We don’t give names to our
shoes or clothes. We shouldn’t have to
name everything in our computers. We
also don’t need to define a particular place where every household item is going
to stay. Similarly, every computer item
should just be in the computer, not nested into a hierarchical structure, and
when we want one, we can look for it. If I can find a particular song lyric on
the web using search I should be able to find any paper or picture on my
computer the same way.
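A minimal sketch of what that could look like, assuming a flat collection of items with no names and no folders. The items and the matching rule (every query word must appear in the text) are hypothetical; a real system would need ranking, indexing, and a way to describe pictures.

```python
# Hypothetical items: no file names, no folders, just content and a kind.
ITEMS = [
    {"kind": "letter", "text": "Dear Anna, thank you for the birthday photos"},
    {"kind": "paper", "text": "Notes on layers of indirection in menus"},
    {"kind": "picture", "text": "caption: the family at the lake, July"},
]

def search(query):
    # Return every item whose text contains all of the query words.
    words = query.lower().split()
    return [item for item in ITEMS
            if all(word in item["text"].lower() for word in words)]

hits = search("indirection menus")
print([hit["kind"] for hit in hits])
```

Nothing here asks where an item lives or what it is called. The only question the user answers is what the item is about.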
Layers
of indirection are to software and product design as alcohol is to Homer
Simpson: the cause of, and solution to, all of life’s problems. A professor of mine told me that his professor liked to say that the
solution to every problem in computer science has always been another layer of
indirection. It remains just as alluring
today. But now both the public and the
engineers are at the point of diminishing returns. When the answer to your technical problem is
obviously another layer of indirection, maybe you have a problem. Spare yourself, your colleagues, and your
users and at least consider taking the harder path of simplifying or even
starting over.
Copyright 2003 Peter
Prokopowicz
[pnp1]Symlinks are another indirection that is still challenging
the experts and boggling everyday users.
See Rob Pike's
"Lexical file names in Plan 9: Getting .. right"