Is HTML a protocol or a format? What does “Layer 7” mean? What are RFCs?
What is the difference between the Web and the Internet?
For the vast majority of people, even many people in technology, the answer to all those questions will be “I don’t know.”
And that’s the way it should be. Most of us drive cars, but very few of us could pick apart an engine, much less build one from scratch.
But everyone who owns a car knows the difference between the car and the engine: engines come in different forms, they can be used for vehicles and machines other than cars, and so forth.
Most people, when told “the car is not the engine” would probably say “obviously,” while very few if any would be surprised by the statement.
But when dealing with information technologies, the situation is reversed.
The Browser is not the Web. An App is not the Internet. Word isn’t Windows. Google is not your Internet connection. Many people would be surprised, if not confused, by some or all of these statements. Only a minority would think “obviously.”
If you are in that minority, you may be surprised that this is the case, but it is.
Cars and engines are physical; the fact that they are two different things — albeit tightly interconnected, even interdependent to different degrees — is almost axiomatic. We have used, seen, or heard of, or studied instances in which engines exist without cars, which is why the concepts are easy to perceive as separate. They are physical constructs, they can be touched, while the fact that they typically break down in nearly binary terms (ie., the car either starts or doesn’t, the fridge either works or doesn’t) is also useful.
On the other hand, information technologies seem to blend into each other, are intertwined in ways invisible to the layperson. The multitude of layers, software, services, providers, is a key element in the Internet’s strength, but it also makes it more opaque for people who use it. The massive efforts of engineers over the last few decades have resulted in a multitude of recovery and error states in which it’s hard to tell what, exactly, is wrong. When a short video is “buffering” for ten minutes, is it because the Internet is slow? Is it because of a router rebooting somewhere? The cable modem? The PC? The Browser? The Browser’s plugin? Or any of the ten other possible problems we haven’t thought of? And that’s even when you have a reasonable understanding on the myriad of components involved along the path for delivering one particular service.
Until a few years ago, most people didn’t experience “the Internet” through anything other than a Web browser and their email app. Even now, as everything from your scale to your security system to your gaming console starts to connect to (and often, depend on) the Internet, the exact nature of what’s going on remains elusive. “Reboot the router” has become the new “Reboot the PC.”
To add to the confusion, hardware itself has become more integrated, more difficult to separate into discrete components.
The browser becomes the operating system, the operating system becomes the display, The display becomes the computer. The computer becomes the Internet.
The Web, Browsers, Email, Operating Systems, not only they all seem linked to varying degrees that are hard to pull apart; they have also evolved in an apparently seamless progression that has had the unintended effect of blurring the lines between them. You can see web pages in your email, you can check your email inside a browser, and so on.
It’s as if combustion engines existed only as part of, say, four door cars. After decades of only seeing engines in four-door cars, it would be natural to mix them up. Only mechanics would be left to argue that an engine could be used to power other vehicles, or other machinery.
Why does this matter?
This matters because over time it seems hard to imagine that parts of the whole can be repurposed or used for anything but what we use them for.
With software, it’s far too easy to conflate form and function. And when the software in question involves simple (but not simplistic), fundamental abstractions, it’s easy for those abstractions to swallow everything else.
The Web browser paradigm, for example, has been so spectacularly successful that it seems nothing could exist in its place, particularly when Web browsers, designed primarily to support HTTP and HTML, subsumed Gopher or FTP early on, and entire runtime environments and applications later.
At the dawn of the Web, browsers implemented various protocols, and someone that accessed, for example, an FTP site understood that the browser was (in a sense) “pretending” to be an FTP client. Today, if the average user happens to stumble, via hyperlinks, onto an FTP server, they will be unlikely to recognize the fact that they’re not looking at a website. They probably won’t notice that the URL starts with “ftp://” (most modern browsers hide the protocol portion of URLs, some hide URLs altogether) and if they do, they probably won’t know what it means. They may even think that the site in question is poorly designed.
Over time, it has become harder to conceive of anything that doesn’t rely on this paradigm and we try to fit all new concepts into it. One implementation of the idea of hypertext has achieved in a sense a form of inevitability. It seems inescapable that this is what hypertext should be.
But of course it isn’t.
The Web, Email, Calendars, Address Books, Filesystems: they are all in a sense elemental. Their primal nature and scale of deployment makes it easy to mix up their current incarnation with their essential form and function.
When thinking of the information we use in various tasks and contexts, it’s nearly impossible to avoid putting information not just in categories but in whatever is the most common way in which we use them. Computing has exposed the true nature of data as an ethereal, infinitely malleable construct, but we think in analogies and physical terms. A document becomes a web page, or a file. A message becomes an email, or a ‘text’. An appointment becomes a calendar entry. The phrase “following a hyperlink” will create in a lot of people the mental image of clicking or tapping on blue underlined text within a Web page, in a Web browser, even if the concept of a hyperlink is not tied uniquely to HTML, or the World Wide Web, underlined text, or, perhaps most obviously, the color blue.
Avoiding these categories is, admittedly, difficult. The software that, in our collective psyche, has come to be associated so strongly to those categories has remained nearly unchanged in its basic behavior and function since it was first developed, decades ago, which makes it seem even more like unchangeable features of our information landscape, rather than the very first step in a long evolutionary process.
If we are ever going to advance beyond the initial paradigms and metaphors of data rooted in the physical world, we need to reclaim some of these terms.
The future of the Internet should not be tied down to the idea of the Web. A lot of it will happen through interfaces other than a web browser. New protocols could supplement old ones, new formats will coexist with those that dominate.
We just have to keep in mind that none of it is fixed, and constantly question the assumptions behind it, looking for a better way.
That’s why I often use the term “Internet” to describe not just one aspect of it but the whole that comprises it. Protocols, servers, routers, your home computer or tablet, a Web page, all of these are parts of the Internet Experience. Using “Internet” in this way is in strict terms inaccurate, but it is also necessary.
It’s a way to escape the confines of terms that are overloaded and overused to such a degree that they restrict our imagination.
Because, maybe, by embracing the notion that the Internet is “everything” we can, hopefully, rediscover the fundamental truth that the Internet can be anything.
Data is more than just bits.
Information is more than Data.
The Internet is more than the Web.