I’ve been talking for years about “the internet operating system”,
but I realized I’ve never written an extended post to define what I
think it is, where it is going, and the choices we face. This is that
missing post. Here you will see the underlying beliefs about the future
that are guiding my publishing program as well as the rationale behind conferences I organize like the Web 2.0 Summit and Web 2.0 Expo, the Where 2.0 Conference, and even the Gov 2.0 Summit and Gov 2.0 Expo.
Ask yourself for a moment, what is the operating system of a Google or
Bing search? What is the operating system of a mobile phone call? What
is the operating system of maps and directions on your phone? What is
the operating system of a tweet?
On a standalone computer, operating systems like Windows, Mac OS X, and
Linux manage the machine’s resources, making it possible for
applications to focus on the job they do for the user. But many of the
activities that are most important to us today take place in a
mysterious space between individual machines. Most people take
for granted that these things just work, and complain when the daily
miracle of instantaneous communications and access to information breaks
down for even a moment.
But peel back the covers and remember that there is an enormous,
worldwide technical infrastructure that is enabling the always-on future
that we rush thoughtlessly towards.
When you type a search query into Google, the resources on your local
computer – the keyboard where you type your query, the screen that
displays the results, the networking hardware and software that connects
your computer to the network, the browser that formats and forwards
your request to Google’s servers – play only a small role. What’s more,
they don’t really matter much to the operation of the search – you can
type your search terms into a browser on a Windows, Mac, or Linux
machine, or into a smartphone running Symbian, or PalmOS, the Mac OS,
Android, Windows Mobile, or some other phone operating system.
The resources that are critical to this operation are mostly somewhere
else: in Google’s massive server farms, where proprietary Google
software farms out your request (one of millions of simultaneous
requests) to some subset of Google’s servers, where proprietary Google
software processes a massive index to return your results in
milliseconds.
Then there’s the IP routing software on each system between you and
Google’s data center (you didn’t think you were directly connected to
Google, did you?), the majority of it running on Cisco equipment; the
mostly open source Domain Name System, a network of lookup servers that
not only allows your computer to connect to google.com in the first
place (rather than requiring you to type an IP address like 74.125.19.106), but also
steps in to help your computer access whatever system out there across
the net holds the web pages you are ultimately looking for; the
protocols of the web itself, which allow browsers on client computers
running any local operating system (perhaps we’d better call it a bag of device drivers) to connect to servers running any other operating system.
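To make the Domain Name System step concrete, here is a minimal Python sketch that asks the local resolver for a host's addresses. It uses only the standard library's socket.getaddrinfo; the helper name resolve is hypothetical, and the addresses returned for any real host will vary by time and location.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver
    reports for hostname. Raises socket.gaierror if the lookup fails."""
    # proto=IPPROTO_TCP keeps the result to one entry per address.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("google.com") on a networked machine returns whatever addresses the resolver currently reports. An IP literal such as "127.0.0.1" comes back unchanged, without any DNS query.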
You might argue that Google search is just an application that happens
to run on a massive computing cluster, and that at bottom, Linux is
still the operating system of that cluster. And that the internet and
web stacks are simply a software layer implemented by both your local
computer and remote applications like Google.
But wait. It gets more interesting. Now consider doing that Google
search on your phone, using Google’s voice search capability. You speak
into your phone, and Google’s speech recognition service translates the
sound of your voice into text, and passes that text on to the search
engine – or, on an Android phone, to any other application that chooses
to listen. Someone familiar with speech recognition on the PC might
think that the translation is happening on the phone, but no, once
again, it’s happening on Google’s servers. But wait. There’s more.
Google improves the accuracy of its speech recognition by comparing what
the speech algorithms think you said with what its search system (think
“Google Suggest”) expects you were most likely to say. Then, because your phone knows
where you are, Google filters the results to find those most relevant to
your location.
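The idea of checking what the recognizer heard against what people are likely to search for can be sketched in a few lines of Python. This is an illustration only, not Google's actual method: the function name pick_transcript, the acoustic scores, and the query counts are all made-up assumptions.

```python
import math


def pick_transcript(hypotheses, query_counts, weight=1.0):
    """Pick the hypothesis with the best combined score.

    hypotheses: list of (text, acoustic_log_prob) pairs from a recognizer.
    query_counts: how often each query text appears in (hypothetical) logs.
    """
    total = sum(query_counts.values())
    vocab = len(query_counts)

    def score(item):
        text, acoustic_log_prob = item
        # Add-one smoothing so unseen queries get a small nonzero prior.
        prior = (query_counts.get(text, 0) + 1) / (total + vocab + 1)
        return acoustic_log_prob + weight * math.log(prior)

    return max(hypotheses, key=score)[0]


# Hypothetical data: the recognizer slightly prefers the wrong transcript.
hypotheses = [("pete's a near me", -2.1), ("pizza near me", -2.3)]
query_counts = {"pizza near me": 9000, "weather": 1000}
best = pick_transcript(hypotheses, query_counts)
```

With these invented numbers, the popularity of the query outweighs the small acoustic preference, so the more plausible transcript is chosen. With no query history at all, the acoustic score decides on its own.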
Your phone knows where you are. How does it do that? “It’s got
a GPS receiver,” is the facile answer. But if it has a GPS receiver,
that means your phone is getting its position information by reaching
out to a network of satellites originally put up by the US military. It
may also be getting additional information from your mobile carrier that
speeds up the GPS location detection. It may instead be using “cell
tower triangulation” to measure your distance from the nearest cellular
network towers, or even doing a lookup from a database that maps Wi-Fi
hotspots to GPS coordinates. (These databases have been created by
driving every street and noting the location and strength of every Wi-Fi
signal.) The iPhone relies on the Skyhook Wireless
service to perform these lookups; Google has its own equivalent,
doubtless built while it was collecting the imagery for Google Street View.
But whichever technique is being used, the application is relying on
network-available facilities, not just features of your phone itself.
And increasingly, it’s hard to claim that all of these intertwined
features are simply an application, even when they are provided by a
single company, like Google.
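The Wi-Fi lookup described above can be sketched as a signal-weighted average over known hotspot positions. This is a rough illustration, not how Skyhook or Google actually compute a fix: the function name estimate_position, the database layout, and the weighting scheme are assumptions, and averaging latitude and longitude directly is only reasonable over short distances.

```python
def estimate_position(scan, hotspot_db):
    """Estimate (lat, lon) from a Wi-Fi scan.

    scan: {hotspot_id: signal_strength_dbm} as seen by the phone.
    hotspot_db: {hotspot_id: (lat, lon)} from a (hypothetical) survey.
    Returns None if no scanned hotspot is in the database.
    """
    weighted_lat = weighted_lon = total_weight = 0.0
    for hotspot_id, dbm in scan.items():
        if hotspot_id not in hotspot_db:
            continue  # unknown hotspot: no surveyed position to use
        # Convert dBm to linear power so stronger signals count for more.
        weight = 10 ** (dbm / 10)
        lat, lon = hotspot_db[hotspot_id]
        weighted_lat += weight * lat
        weighted_lon += weight * lon
        total_weight += weight
    if total_weight == 0.0:
        return None
    return (weighted_lat / total_weight, weighted_lon / total_weight)
```

The point of the sketch is that the phone contributes only the scan; the database that turns hotspot identifiers into coordinates lives on the network.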
Keep following the plot. What mobile app (other than casual games)
exists solely on the phone? Virtually every application is a network
application, relying on remote services to perform its function.
Where is the “operating system” in all this? Clearly, it is still
evolving. Applications use a hodgepodge of services from multiple
different providers to get the information they need.
But how different is this from PC application development in the early
1980s, when every application provider wrote their own device drivers to
support the hodgepodge of disks, ports, keyboards, and screens that
made up the still-emerging personal computer ecosystem? Along came
Microsoft with an offer that was difficult to refuse: We’ll manage the
drivers; all application developers have to do is write software that
uses the Win32 APIs, and all of the complexity will be abstracted away.
It was. Few developers write device drivers any more. That is left to
device manufacturers, with all the messiness hidden by “operating system
vendors” who manage the updates and often provide generic APIs for
entire classes of device. Those vendors who took on the pain of managing
complexity ended up with a powerful lock-in. They created the context
in which applications have worked ever since.
This is the crux of my argument about the internet operating system. We
are once again approaching the point at which the Faustian bargain will
be made: simply use our facilities, and the complexity will go away.
And much as happened during the 1980s, there is more than one company
making that promise. We’re entering a modern version of “the Great Game”, the rivalry to control the narrow passes to the promised future of computing. (John Battelle calls them “points of control”.)
This rivalry is seen most acutely in mobile applications that rely on
internet services as back-ends. As Nick Bilton of the New York Times
described it in a recent article comparing the Google Nexus One and the iPh