Computer science
is largely the study of naming. The addresses in memory
systems, the variables in programs, the pathnames in filesystems,
and the URLs printed in magazine ads are all names. Giving
something a name gives us power over it, enabling us to
save it for later. Or tell someone else about it.
The designers
of early timesharing systems soon realized that they could
do something better than simply allowing multiple users
to timeslice an expensive computer---they could create
an environment where people could share information by
naming it and then telling each other about it. Thirty
years later, a similar revolution is occurring in computer
networking with the introduction of a naming mechanism---the
URL---that again allows people to easily name and share
information.
But this power
to share things does not come for free. There is always
a small amount of overhead, roughly proportional to the
number of outstanding copies of the name, required to
support sharing. As an example, consider a popular Web
site---the more copies of the site's URL that get printed
in newspapers, recited on radio, and scribbled on napkins
over lunch, the more CPU power and network bandwidth will
be required to support that site.
The costs of
sharing are incurred by any naming mechanism, but are
often small enough to be ignored when all computation
takes place locally. But in a distributed computing environment,
where processes and data are spread out over a network
of processing elements, the costs of sharing can be particularly
pernicious.
These observations
lead us to investigate a restricted form of naming "linear"
naming as a basis for distributed computing. A linear
name cannot be duplicated, so it requires only a small
fixed amount of overhead to implement. We have implemented
network protocols to support linear names that are simple
and cheap, and that allow processes and data to gracefully
move from processing element to processing element. Linear
names also give us a notion of "locality" that we have
used to heuristically minimize network traffic. In short,
linear names are a practical light-weight glue for holding
a networked computation together.