diego's weblog

there and back again

Category Archives: software

encryption is bad news for bad guys! (and other things we should keep in mind)

Once again, a senseless act of violence shocks us and enrages us. Prevention becomes a hot topic, and we end up having a familiar “debate” about technology, surveillance, and encryption, more specifically, how to either eliminate or weaken encryption. Other topics are mentioned in passing (somehow, gun control is not), but ‘controlling’ encryption seems to win the day as The Thing That Apparently Would Solve A Lot Of Problems.

However, as of now, there is zero indication that encryption played any part in preventing security services from stopping the Paris attacks. There wasn’t a message with a date and names and a time, sitting in front of a group of detectives, encrypted.

I feel obligated to mention this, even if it should be obvious by now. “If only we could know what they’re saying,” sounds reasonable. It ignores the fact that you need incredibly invasive, massive non-stop surveillance of everyone, but setting that tiny detail aside it comes back to the (flawed) argument of “you don’t need encryption if you have nothing to hide.”

First off, needing to hide something doesn’t mean you’re a criminal. Setting aside our own intelligence and military services, this is what keeps Chinese dissidents alive (to use one of myriad examples), and I’m sure there are a few kids growing up in ISIS-controlled areas who are using encrypted channels to pass along books, movies (plus, probably, some porn), or to discuss how to get the hell out of there. In less extreme territory, hiding is instrumental in many areas of everyday life, say, planning surprise parties. Selective disclosure is a necessary component of human interaction.

There’s only one type of debate we should be having about encryption: how to make it easier to use and more widespread. How to make it better, not how to weaken it.

Because encryption can’t be uninvented, and, moreover, widespread secure communications don’t help criminals or terrorists; they hurt them.

(1) Encryption can’t be uninvented

A typical first-line-of-defense argument for encryption goes: “eliminating or weakening encryption does nothing to prevent criminals or terrorists from using encryption of their own.” Any criminals or terrorists (from now on, “bad guys”) with minimal smarts would know how to add their own encryption layer to any standard communication channel. The only bad guys you’d catch would be either lazy or stupid.

“Aha!” says the enthusiastic anti-encryption advocate. “That’s why we need to make sure all the algorithms contain backdoors.” What about all the books that describe these algorithms before the backdoors? Would we erase the memory of the millions of programmers, mathematicians, or anyone who’s ever learned about this? And couldn’t the backdoors be used against us? Also, get this: you don’t even need a computer to encrypt messages! With just pen and paper you can effectively use any number of ciphers, some of them quite strong (e.g., one-time pads or multilayered substitution ciphers). Shocking, I know.
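To make that last point concrete, here is a minimal sketch of a one-time pad in Python (the pen-and-paper version is the same idea, with coin flips or dice standing in for the random key). Nothing in it depends on an algorithm anyone could backdoor: XOR the message with a truly random, never-reused key of the same length and the result is, in the information-theoretic sense, unbreakable.

import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    # The key must be truly random, as long as the message, used once, and kept secret.
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    # XOR is its own inverse, so decryption is the same operation.
    return bytes(c ^ k for c, k in zip(ciphertext, key))

ciphertext, key = otp_encrypt(b"meet at the usual place")
print(otp_decrypt(ciphertext, key))  # b'meet at the usual place'

The catch, of course, is getting the key to the other side, which is what modern cryptography is largely about; but none of it can be uninvented.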

The only way to “stop” encryption from being used by bad guys would be to uninvent it. Which, hopefully, we can all agree is impossible.

Then there’s the positive argument for encryption. It’s good for us, and bad for bad guys.

(2) Herd immunity, or, Encryption is bad for bad guys

Maybe we in technology haven’t done a good job of explaining this to law enforcement, politicians, or the public at large, but there’s a second, more powerful argument that we often fail to make: widespread secure, encrypted communications and data storage hinder, not help, criminals, terrorists, and other assorted psychos.

That’s right. Secure storage and communications hurt bad guys.

Why? Simple: because to operate, prepare, obtain resources, and plan, bad guys need three things: money, time, and anonymity. They obtain these by leeching off their surroundings.

More and more frequently, terrorists finance their activities with cybercrime: stealing identities and credit cards, phishing attacks, and so forth. If everyone’s communications and storage (not just individuals’ but also banks’, stores’, etc.) were always encrypted and more secure, criminals would have a much harder time financing their operations.

That is, to operate with fewer restrictions, bad guys need to be able to exploit their surroundings. The more protected their surroundings are, the more exposed they are. More security and encryption also mean it’s harder to obtain a fake passport, create a fake identity, or steal someone else’s.

Epidemiologists have a term for this: herd immunity. Vaccination works only when it is widespread, for two reasons: the higher the percentage of immune individuals, the fewer avenues a disease has to spread, and, as importantly, the lower the probability that a non-immune individual will come into contact with an infected one.

More advanced encryption and security also helps police agencies and security services. If the bad guys can’t get into your network or spy on your activities, you have more of a chance of catching them. The first beneficiaries of strong encryption are the very agencies tasked with defending us.


Dictatorships and other oppressive regimes hate encryption for a reason. Secure, widespread communication also strengthens public discourse. It makes communication channels harder to attack, allowing the free flow of information to continue in the face of ideologies that want nothing more than to shut it down and lock everyone into a single way of thinking, acting, and behaving.


(Postscript) Dear media: to have a real conversation we need your help, so get a grip and calm down. 

The focus on encryption is part of looking for quick fixes when there aren’t any. In our fear and grief we demand answers and “safety,” even to a degree that is clearly not possible. We cannot be 100% safe. I think people in general are pretty reasonable, and know this. But it’s kind of hard to stay that way when we are surrounded by news reports that have all the subtlety and balance of a chicken running around with its head cut off. We are told that the “mastermind” (or “architect”) of the attack is still at large. We hear of “an elaborate international terror operation.” On day 3, the freakout seems to be intensifying, so much so that a reporter asks the President of the United States: “Why Can’t We Take Out These Bastards?”

The Paris attacks were perpetrated by a bunch of suicidal murderers with alarm clocks, a few rifles, bullets, and some explosives. Their “plan” amounted to synchronizing their clocks and then opening fire and/or blowing themselves up on a given date at roughly the same time, a time chosen for maximum damage.

“Mastermind”? Reporters need to take a deep breath and put things in context. This wasn’t complicated enough to be “masterminded.” We’re not dealing with an ultra-sophisticated criminal organization headed by a Bond villain ready to deploy a doomsday device. This is a bunch of thugs with wristwatches and Soviet-era rifles. They are lethal, and we need to fight back. But they are not an existential threat to our civilization. We are stronger than that.

With less of an apocalyptic tone to the reporting we could have a more reasonable conversation about the very real and complex reality behind all of this. Naive? Maybe. Still, it doesn’t hurt to mention it.



all your tech are belong to us: media in a world of technology as the dominant force

Pop quiz: who held the monopoly on radio equipment production in the US in 1918?

General Electric? The Marconi Company?

Radio Shack? (Jk!) :)

How about the US Military?

The US entered World War I “officially” in early April 1917. Determined to control a technology of strategic importance to the war effort, the Federal Government took over radio-related patents owned by companies in the US and gave the monopoly on manufacturing radio equipment to the Armed Forces — which at the time included the Army, the Navy, the Marine Corps, and the Coast Guard.

This takeover was short-lived (ending in late 1918) but it would have profound effects on how the industry organized in the years and decades that followed. The War and Navy departments, intent on keeping the technology under some form of US control, arranged for General Electric to acquire the American Marconi company and secure the patents involved.

The result was the Radio Corporation of America, RCA, a public company whose controlling interest was owned by GE.

Newspapers had been vertically integrated since their inception. The technology required for printing presses and the distribution networks involved in delivering the product were all “proprietary,” in that they were controlled and evolved by the newspapers themselves. Even if the printing press had other uses, you couldn’t easily repurpose a newspaper printing press to print books, or vice versa, and even if you could secure a printing press for newspapers (a massive investment) you could not hope to easily recreate the distribution network required to get the newspaper into the hands of consumers.

This vertical integration resulted in a combination of natural and artificial barriers to entry that would let a few key players, most notably William Randolph Hearst, leverage the resulting common economic, distribution, and technological foundation to effect a consolidation in the market without engendering significant opposition. Later, movie studios relied on a similar set of controls over the technology employed — they didn’t manufacture their own cameras, but by controlling creation and distribution, and with their aggregate purchasing power, they could dictate what technology was viable and how it was to be used.

Radio, early on, presented the possibility of a revolution in this regard. It could have allowed consumers to also be creators (at least on a small scale). The ability to broadcast was restricted only by the size and power of the transmitter at your disposal, and you could start small. It was the first opportunity for a new medium to have the evolution of the underlying technology decoupled from the content it carried, but WWI and the intervention of the US government ensured this would not come to pass. The deal that resulted in the creation of RCA created, in effect, the same kind of vertical integration in radio as in other mediums (in Britain, a pioneer of broadcast radio and later TV, the government had been largely in control from the beginning through the BBC, so radio was already “vertically integrated”).

This way of thinking became embedded in how media companies operated.

RCA went on to be at the center of the creation of the two other major media markets of the 20th century, music and television, and in both cases it extended the notion of technology as subservient to the content it carried.

For every major new medium that appeared until late in the 20th century, media companies could control the technology that they depended on.

Over time, even as technology development broke off into its own path and started to evolve separately from media, media companies retained control of both the standards and the adoption rate (black and white to color, vinyl to CD, SD to HD, etc.). Media companies selected new technologies when and how they wanted, and they set the terms of use, the price, and the pace of its deployment. Consumers could only consume. By retaining control of the evolution of the technology through implicit control of standards, and explicit control of the distribution channels, they could retain overall control of the medium. Slowly, though, the same technology started to be used for more than one thing, and control started to slip away.

Then the Internet came along.

The great media/technology decoupling

TV, radio, CDs, even newspapers are all “platforms” in a technical sense, even if closed ones, in that they provide a set of common standards and distribution channels for information. In this way, the Internet appears to be “just another platform” through which media companies must deliver their content. This has led to the view that we are simply going through a transition not unlike that of, say, Vinyl to CDs, or Radio to TV.

That media companies can’t control the technology as they used to is clear. What is less clear is that this is a difference of kind, not of degree.

CNN can have a website, but it can neither control the technology standards and software used to build it nor ensure that the introduction of a certain technology (say, Adobe Flash) will be followed by a period of stability long enough to recoup the investment required to use it. NBC can post shows online, but it can’t prevent millions of people from downloading the show without advertisement through other channels. Universal Studios can provide a digital copy of a movie six months after its release, but in the meantime everyone who wanted to watch it has, often without paying for it. These effects and many more are plainly visible, and as a result, prophecies involving the death of TV, the music industry, newspapers, movie studios, or radio are common.

The diagnoses are varied and they tend to focus, incorrectly, on the revenue side of the equation: it’s the media companies’ business models that are antiquated. They don’t know how to monetize. Piracy is killing them. They can’t (or won’t) adapt to new demands and are therefore too expensive to operate. Long-standing contracts get in the way (e.g., premium channels and cable providers). The traditional business models that supported mass media throughout their existence are being made increasingly ineffective by the radically different dynamics created by online audiences, the ease of copying, and the inability to create scarcity, all of which drive down prices.

All of these are real problems but none of them is insurmountable, and indeed many media concerns are making progress in fits and starts in these areas and finding new sources of revenue in the online world. The fundamental issue is that control has shifted, irreversibly, out of the hands of the media companies.

For the first time in the history of mass media, technology evolution has become largely decoupled from the media that use it, and, as importantly, it has become valuable in and of itself. This has completely inverted the power structure in which media operated, with media relegated to just another actor on a larger stage. For media companies, lack of control over the information channel used is behind each and every crack in the edifice that has supported their evolution, their profits, and their power.

Until the appearance of the Internet it was the media companies that dictated the evolution of the technology behind the medium and, as critically, the distribution channel. Since the mid-1990s, media companies have tried and generally failed to insert themselves as a force of control in the information landscape created by the digitalization of media and the Internet. Like radio and TV, the Internet includes a built-in “distribution channel,” but unlike them it does not lend itself to government-apportioned natural monopolies over that channel. Like other media, the Internet depends on standards and devices to access it, but unlike other media, those standards and devices are controlled, evolved, and manufactured by companies that see media as just another element of their platforms, not as the driver of their existence.

This shift in control over technology standards, manufacturing, demand, and evolution is without precedent, and it is the central factor driving the ongoing crisis media has found itself in since the early ’90s.

Now what?

Implicitly or explicitly, what media companies are trying to do with every new initiative and every effort (DRM, new formats, paywalls, apps) is to regain control of the platform. Given the actors that now control technology, it becomes clear why they are not succeeding and what they must do to adapt.

In the past, they may have attempted to purchase the companies involved in technology, fund competitors, and the like. Some of this is going on today, the foremost examples being Hulu and Ultraviolet. As with past technological shifts, media companies have also resorted to lobbying and the courts to try to maintain control, but this too is a losing proposition long-term. Trying to wrest control of technology through lawsuits aimed at whatever the offending technology is at any given moment, when technology itself is evolving, advancing, and expanding so quickly, is like trying to empty the ocean with a spoon.

These attempts are not effective because the real cause of the shift in power that has occurred is beyond their control. It is systemic.

In a world where the market capitalization of the technology industry is an order of magnitude or more larger than that of the media companies (and where, incidentally, a single company, Apple, has more cash on hand than the market value of all traditional media companies combined), it should be obvious that the battle for economic dominance has been lost. Temporary victories, if any, only serve to obfuscate that fact.

The media companies that survive the current upheaval will be those that accept their new role in this emerging ecosystem: that of an important player, but not a dominant one (this is probably the toughest part). There is still, and will continue to be, demand for content that is professionally produced.

Whenever people in a production company, or a studio, or magazine, find themselves trying to figure out which technology is better for the business, they’re having the wrong conversation. Technology should now be directed only by the needs of creation, and at the service of content.

And everyone needs to adapt to this new reality, accept it, and move on… or fall, slowly but surely, into irrelevance.

When All You Have Is A Hammer

(x-post to medium)

Today’s software traps people in metaphors and abstractions that are antiquated, inefficient, or, simply, wrong. New apps appear daily and with a few exceptions they simply cement the dominance of these broken metaphors into the future.

Uncharacteristically, I’m going to skip a digression on what the causes of this are and leave that for another time (*cough*UI guidelines*cough*), and go straight to the consequences, which must begin with identifying the problem. I could point at the “Desktop metaphor,” including its influence on the idea of “User Interfaces,” as the source of many problems (people, not users!), but I won’t.

I’ll just focus on a simple question: Can you print it?

If You Can Print It…

Most of the metaphors and abstractions that deal with “old” media, content, and data simply haven’t evolved beyond being a souped-up digital version of their real-world counterparts. For example: you can print the homepage of the New York Times and it wouldn’t look much different from the paper version.

You could take an email thread, or a calendar and print it.

You can print your address book.

Consider: if you showed any of these printouts to someone from the early 20th century, they would have no problem recognizing them, and the only thing they might find hard to believe would be how good the typesetting is (or they would be surprised by the pervasive use of color).

Our thinking in some areas has advanced little beyond just creating a digital replica of industrial-era mechanisms and ideas: not only can they be printed, but little to no information would be lost in printing them. Some data would disappear (e.g., IP routing headers), but with few exceptions (say, alarms built into the calendar), they can be printed without significant loss of fidelity or even functionality.

On the flip side, you could print a Facebook profile page, but once put in these terms, we can see that the static, printed page does not really replicate what a profile page is: something deeply interactive, engaging, and more than what you can just see on the surface.

Similarly, you could actually take all these printouts and organize them in basically the same way as you would organize them on your computer or online (with physical folders and shelves and desks) and you’d get basically the same functionality. Basic keyword searching is the main feature you’d lose, but as we all know from our daily experience, for anything that isn’t Google, keyword search can be a hit-and-miss proposition.

This translation of a printed product into a simple digital version (albeit sprinkled with hyperlinks) has significant effects on how we think about information itself, placing constraints on how software works from the highest levels of interaction to the lowest levels of code.

These constraints express themselves as a pervasive lack of context: in terms of how pieces of data relate to each other, how they relate to our environment, when we did something, with whom, and so on.

Lacking context, and because the “print-to-digital” translation has been so literal in so many cases, we look at data as text or media with little context or meaning attached, leading modern software to resort to a one-word answer for anything that requires finding a specific piece of information.


Apparently, There Is One Solution To All Our Problems

Spend a moment and look at the different websites and apps on your desktop, your phone, or your tablet. Search fields, embodied in the iconic (pun intended) magnifying glass, are everywhere these days. The screenshot below includes common services and apps, used daily by hundreds of millions of people.

Search is everywhere

On the web, a website that doesn’t include a keyword-based search function is rare, and that’s even considering that the browser address bar has by now turned into a search field as well.

While the screenshot is of OS X, you could take similar screenshots on Windows or Linux. On phones and tablets, the only difference is that we see (typically) only one app at a time.

Other data organization tools, like tags and folders, have also increased our ability to get to data by generally flattening the space we search through.

The fact that search fields appear to be reproducing at an alarming rate is not a good sign. It’s not because search is inherently bad (or inherently good, for that matter). It’s a feature that is trying to address real problems, but it cures a symptom rather than the underlying cause. It’s like a pilot wearing a helmet one size too small and taking aspirin for the headache instead of getting a bigger helmet.

Whether on apps or on the web, these search engines and features are good at returning a lot of results quickly. But that’s not enough, and it’s definitely not what we need.

Because searching very fast through a lot of data is not the same as getting quickly to the right information.

Et tu, Search?

Score one for iconography: Search behaves, more often than not, exactly like a magnifying glass: as powerful and indiscriminate in its amplification as a microscope, only we’re not all detectives looking for clues or biologists zooming in on cell structure. What we need is something closer to a sieve than a magnifying glass: something that naturally gives us what we care about while filtering out what we don’t need.

Inspector Clouseau, possibly searching for a document

Superficially, the solution to this problem appears to be “better search”, but it is not.

Search as we understand it today will be part of the solution, a sort of escape hatch to be used when more appropriate mechanisms fail. Building those “appropriate mechanisms,” however, requires confronting the fact that software is, by and large, utterly unaware of either context or the concept of time beyond their most primitive forms — more likely to try to impose on us whatever it thinks is the “proper” way to do something than to adapt to how we already work and think, and frequently focused on recency at the expense of everything else. Today’s software and services fail both to associate enough contextual and chronological information and to leverage effectively the contextual data that is available once we are in the process of retrieving or exploring.

Meaning what, exactly? Consider, for example, the specific case of trying to find the address of a place where you’re supposed to meet a friend in an hour. You created a calendar entry but neglected to enter the location. With today’s tools, you’d have to search through email for your friend’s name, and more likely than not you’d get dozens of email threads that have nothing to do with the meeting. If software used context even in simple ways, you could, as a primitive example, just drag the calendar entry and drop it on top of a list of emails, which the software would interpret as wanting to filter emails to those from around the date the entry was created. The number of emails matching that condition would be relatively small. Drag and drop your friend’s avatar from a list of contacts, and more often than not you’d end up staring at an email thread from around that date with the information you need, no keyword search necessary.
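To show how little machinery that interaction actually needs, here is a toy sketch in Python (the Email type, field names, and the three-day window are made up for illustration, not any mail client’s real API): the “drop a calendar entry on the mail list” gesture is just a date-window filter, optionally narrowed by sender.

from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Email:
    sender: str
    subject: str
    sent_at: datetime

def emails_around(emails: list[Email], anchor: datetime,
                  window_days: int = 3, sender: str | None = None) -> list[Email]:
    # Keep emails sent within a few days of the anchor (e.g., the date the
    # calendar entry was created), optionally from a specific person, and
    # sort them by how close they are to that date.
    window = timedelta(days=window_days)
    hits = [e for e in emails if abs(e.sent_at - anchor) <= window]
    if sender is not None:
        hits = [e for e in hits if e.sender == sender]
    return sorted(hits, key=lambda e: abs(e.sent_at - anchor))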

In other words — search is a crutch. Even when we must resort to exhaustive searching, we don’t need a tool to search as much as we need a tool to find. It is very clear that reducing every possible type of information need to “searching,” mostly using keywords (or whatever can be entered into a text field), is an inadequate metaphor to apply to a world that is increasingly being digitized and dumped wholesale into phones, laptops, and servers.

We need software that understands the world and that adapts, most importantly, to us. People. It’s not as far-fetched or difficult as it sounds.

2 idiots, 1 keyboard (or: How I Learned to Stop Worrying and Love Mr. Robot)

I’d rename it “The Three Stooges in Half-Wits at Work” if not for the fact that there are four of them. We could say the sandwich idiot doesn’t count, though he does do a good job with his line (“Is that a videogame?”), while extra points go to the “facepalm” solution of disconnecting a terminal to stop someone from hacking a server. It’s so simple! Why didn’t I think of that before!?!?!

Mr. Robot would have to go 100 seasons before it starts to balance out the stupidity that shows like NCIS, CSI and countless others have perpetrated on brains re: programming/ops/etc.

An alternative for writers who insist on not doing simple things like talking to the computer guy who keeps your studio from imploding: keep the stupid, but make it hilariously, over-the-top funny, like so:

We’ll count it even if it’s unintentional. That’s how nice we computer people are.

PS: and, btw, this, this, is why no one gets to complain about Mr. Robot’s shortcomings.

not evenly distributed, indeed

The future is already here – it’s just not very evenly distributed.

— William Gibson (multiple sources)

The speed at which digital content grows (and at which non-digital content has been digitized) has quickly outpaced the ability of systems to aid us in processing it in a meaningful way, which is why we are stuck living in a land of Lost Files, Trending Topics and Viral Videos.

Most of those systems use centuries-old organizational concepts like Folders and Libraries, or rigid hierarchical structures that are perfectly reasonable when everything exists on paper, but that are grossly inadequate, not to mention wasteful and outdated, in the digital world. Once digital, information is infinitely malleable, easily changeable, and can be referenced with a degree of precision and at scales that are simply impossible to replicate in the physical world, and we should be leveraging those capabilities far more than we do today outside of new services.

Doing this effectively would require many changes across the stack, from protocols, to interfaces, to storage mechanisms, maybe formats. This certainly sounds pretty disruptive, but is it really? Or is there a precedent for this type of change?

What We Can Learn From The Natives

“Digital native” systems like social networks and other tools and services created in the last decade continue to evolve at an increasingly rapid pace around completely new mechanisms of information creation and consumption, so a good question to ask is whether it is those services that will simply take over as the primary way in which we interact with information.

Social media, services, and apps are “native” in that they are generally unbounded by the constraints of old, physical-world paradigms — they simply could not exist before the arrival of the Internet, high-speed networks, or powerful, portable personal computing devices. They leverage (to varying degrees) a few of the ideas I’ve mentioned in earlier posts: context, typically in the form of time and/or geographical location; an understanding of how people interact and relate to each other; a strong sense of time and semantics around the data they consume and create.

Twitter, Facebook, and others include the concept of time as a sorting mechanism and, at best, as another way to filter search. While this type of functionality is part of what we are discussing, it is not all that we are talking about, and just as “time” is not the only variable we need to consider, neither will social media replace all other types of media. Each social media service is focused on specific functionality, needs, and wants. Each has its own unique ‘social contract.’

Social media is but one example of the kind of qualitative jump in functionality and capabilities that is possible when we leverage context, even in small ways. These services are proof positive that people respond to these ideas, but they are also limited — specific expressions of the use of context within small islands of functionality in the larger world of data and information that we interact with.

Back on topic, the next question is, did ‘digital natives’ succeed in part because they embraced the old systems and structures? And if so, wouldn’t that mean that they are still relevant? The answer to both questions is: not really.

Post Hoc, Ergo Propter Hoc

Facebook and Twitter (to name just two) are examples of wildly successful new services that, when we look closely, have not succeeded because of the old hierarchical paradigms embedded into the foundations of computers and the Internet, but in spite of them. To be able to grow they have in fact left behind most of the recognizable elements on which the early Internet was built. Their server-side infrastructures are extremely complex and not even remotely close to what we’d have called a “website” barely a decade ago. On the client side, they are really full-fledged applications that don’t exist in the context of the Web as a mechanism for delivering content. New services use web browsers as multi-platform runtime environments, which is also why as they transition to mobile devices more of their usage happens in their own apps, in environments they fully control. They have achieved this thanks to massive investments, in the order of hundreds of millions or billions of dollars, and enormous effort.

This has also carried its cost for the rest of the Web in terms of interconnectivity. These services and systems are in the Web, but not of it. They relate to it through tightly controlled APIs, even as they happily import data from other services. In some respects, they behave like a black hole of data, and they are often criticized for it.

This is usually considered to be a business decision — a need to keep control of their data and thus control of the future, sometimes with ominous undertones attached, and perhaps they could do more to open up their ability to interface with other services in this regard.

But there is another factor that is often overlooked and that plays a role as important, or more so. These services’ information graphs, structures, and patterns of interaction are qualitatively different from, and far removed from, the basic mechanisms that the Web supports. For example, some of Facebook’s data streams can’t really be shared using the primitive mechanisms available through the hierarchical, fixed structures that form the shared foundation of the Internet: simple HTML, URLs, and open access. Whereas before you could attach a permalink to most pieces of content, some pieces of content within Facebook are intrinsically part of a stream of data that crumbles if you start to tease it apart, or that requires you to be signed in to verify whether you have access to it, how it relates to other people and content on the site, and so on. The same applies to other modern services. Wikipedia and Google have both managed to straddle this divide to some degree, Wikipedia by retaining extremely simple output structures, and Google by maintaining some ability to reference portions of its services through URLs, but this is quickly changing as Google+ is embedded more deeply throughout the core service.

Skype is an example of a system that creates a new layer of routing to deliver a service in a way that couldn’t be possible before, while still retaining the ability to connect to its “old world” equivalent (POTS) through hybrid elements in its infrastructure. Because Skype never ran in a web browser, we tend not to think of it as “part of the Web,” something we do for Facebook, Twitter, and others, but that is a mere historical accident of when it was built and the capabilities of browsers at the time. Skype has as much of a social network as Facebook does, but because it deals mostly with real-time communication we don’t put it in the same category as Facebook, even though there’s no real reason for that.

Bits are bits, communication is communication.

Old standards become overloaded and strained to cover every possible need or function *coughHTML5cough*. Fear drives this: fear that new systems, instead of helping, would end up being counterproductive; concerns about balkanization, incompatibilities, and so forth. Those concerns are misplaced.

The fact is that new services have to discard most of the internal models and technology stacks (and many external ones) that the Internet supposedly depends on. They have to frequently resort to a “low fidelity” version of what they offer to connect to the Web in terms it can “understand.” In the past we have called these systems and services “walled gardens.” When a bunch of these “walled gardens” are used by 10, 20, or 30% of the population of the planet, we’re not talking about gardens anymore. You can’t hold a billion plants and trees in your backyard.

The balkanization of the Internet has already happened.

New approaches are already here.

They’re just not evenly distributed yet.

scenario #1

“I just know I’ve seen it before.”

You’re meeting Mike, who waits patiently while you mumble this. Browsing, navigating through files, searching. Something you were looking at just yesterday, something that would be useful… You remember telling yourself this is important, then getting sidetracked following up on the last in the list of emails you needed to exchange to set the time for the meeting, switching between checking your spam folder for misplaced messages and your calendar for available times, then a phone call… but that doesn’t help… you know you won’t find it. You briefly consider checking the browser on your laptop, but the thought of wading through two-dozen-plus spinning tabs as they load data you don’t need while trying to find something you can’t even describe precisely doesn’t sound like an inviting prospect. You give up.

The meeting moves on. You start to take some notes. Suddenly, a notification pops up but it goes away too quickly for you to see it. You don’t know what it is, so you load the app, disrupting the conversation and your note-taking. It’s a shipment tracking notification. You close the app and go back to your notes, now stuck at mid-sentence.

The flow of the conversation moves to a blog post Mike forwarded to you recently, but you can’t remember seeing it. You find the email, eventually, but after clicking on the link in the results page the window is blank and the page doesn’t finish loading. You wait five seconds. Ten. You give up, close the tab, and keep going.

Hours later, you are at home, reading through the news of the day, and you suddenly remember that blog post again. While it’s loading, you get an alert. Twin beeps. You glance at it. Meeting with Mike, 8 pm, it says. A second later, the phone beeps.

Meeting with Mike, 8 pm.

Two rooms away, you hear your tablet, beeping. You don’t need to go look at it. You know what it says.

Meeting with Mike, 8 pm.

It turns out that the time you set in the calendar entry when you originally created it was incorrect, the fixed one was a duplicate, and all your calendars are now happily notifying you of an upcoming meeting that actually happened hours ago. You dismiss the alert on your laptop, but this doesn’t do much for the alerts on your other devices.

In fact, an hour or so later, when you start using the tablet, the alert is still there, even though it’s two hours after the meeting should have happened. Now you’d like to finish reading what you had started earlier in the day, but the list of “cloud tabs” seems endless, and when you finally find what you want to read, you can’t remember exactly where you were in the article. You don’t want to read it all again, not now. You mark it to “read later” … and give up.

Oh, well. Maybe there’s something good on TV that you can watch on the phone.

they’re moving to agile any day now

Great story: the most obsolete infrastructure money could buy. If you know the meaning of words/acronyms like RCS, VAX, VMS, Xenix, Kermit and many others, and have been waiting anxiously for a chance to see them used in actual sentences once more, here’s your chance. Choice quotes:

[…] on my first day I found that X was running what was supposedly largest VAXcluster remaining in the world, for doing their production builds. Yes, dozens of VAXen running VMS, working as a cross-compile farm, producing x86 code. You might wonder a bit about the viability of the VAX as computing platform in the year 2005. Especially for something as cpu-bound as compiling. But don’t worry, one of my new coworkers had as their current task evaluating whether this should be migrated to VMS/Alpha or to VMS/VAX running under a VAX emulator on x86-64


After a couple of months of twiddling my thumbs and mostly reading up on all this mysterious infrastructure, a huge package arrived addressed to this compiler project. […]

Why it’s the server that we’ll use for compiling one of the compiler suites once we get the source code! A Intel System/86 with a genuine 80286 CPU, running Intel Xenix 286 3.5. The best way to interface with all this computing power is over a 9600 bps serial port. Luckily the previous owners were kind enough to pre-install Kermit on the spacious 40MB hard drive of the machine, and I didn’t need to track down a floppy drive or a Xenix 286 Kermit or rz/sz binary. God, what primitive pieces of crap that machine and OS were.

Go read it, and try to avoid laughing, crying, shuddering, or shaking your head (possibly all at the same time).

who adapts to whom?

Ten years ago, the PalmPilot trained millions of people to write in primitive scribbles so the device could “understand handwriting.” Today, social networks ask that you partition the entire world into “Friends” and “Not Friends.” Note-taking software may require you to use tags to organize information. Others give you folders. Others, both.

The idea that we should adapt to how machines work or how they model data isn’t new. In fact it goes all the way back to what we could call the “origin document” of the modern information age (if there was one): “As We May Think” by Vannevar Bush.

It starts off with promise. Early on, talking about access to information, Bush puts forward the view that we should build systems that “work like the human mind:”

“The real heart of the matter of selection, however, goes deeper than a lag in the adoption of mechanisms by libraries, or a lack of development of devices for their use. Our ineptitude in getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.

The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.

Man cannot hope fully to duplicate this mental process artificially, but he certainly ought to be able to learn from it. In minor ways he may even improve, for his records have relative permanency. The first idea, however, to be drawn from the analogy concerns selection. Selection by association, rather than indexing, may yet be mechanized. One cannot hope thus to equal the speed and flexibility with which the mind follows an associative trail, but it should be possible to beat the mind decisively in regard to the permanence and clarity of the items resurrected from storage.

By now it’s already clear that he’s not talking about something that adapts well to the human mind, but that attempts to mimic it, or, rather, mimic a specific reductionist view of the process of recall that mixes up a mental process (association) with a physical one (‘some intricate web of trails carried by the cells in the brain’). Bush then builds on this idea of trails to propose a system. He moves on to describe his memex in some detail:

There is, of course, provision for consultation of the record by the usual scheme of indexing. If the user wishes to consult a certain book, he taps its code on the keyboard, and the title page of the book promptly appears before him, projected onto one of his viewing positions.


All this is conventional, except for the projection forward of present-day mechanisms and gadgetry. It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is the important thing.

When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. Before him are the two items to be joined, projected onto adjacent viewing positions. At the bottom of each there are a number of blank code spaces, and a pointer is set to indicate one of these on each item. The user taps a single key, and the items are permanently joined. In each code space appears the code word. Out of view, but also in the code space, is inserted a set of dots for photocell viewing; and on each item these dots by their positions designate the index number of the other item.

Thereafter, at any time, when one of these items is in view, the other can be instantly recalled merely by tapping a button below the corresponding code space. Moreover, when numerous items have been thus joined together to form a trail, they can be reviewed in turn, rapidly or slowly, by deflecting a lever like that used for turning the pages of a book. It is exactly as though the physical items had been gathered together from widely separated sources and bound together to form a new book. It is more than this, for any item can be joined into numerous trails.


And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outraged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.

Considering that he wrote this in 1945 and that his entire system design was based on analog technology, it’s fascinating to imagine how foreign the idea of “linking” two documents must have sounded to his contemporaries, not to mention the notion that you could (in essence) copy and share a whole section of documents. He also describes a feature that we haven’t even achieved today: the ability to extract a section of a sequence of hyperlinked documents from our set and have it seamlessly join into someone else’s document collection.

Even as Bush starts with the idea of associative memory, he immediately turns to relying on indexes and code spaces and code words. Partially this is due to the boundaries of the technology he was operating within, but even within them he could have suggested, for example, that the codes be imprinted automatically by the machine to associate documents by time of access. The subtle requirements machines place on how we operate were present even for him. For example, earlier he discusses speech recognition:

“To make the record, we now push a pencil or tap a typewriter. Then comes the process of digestion and correction, followed by an intricate process of typesetting, printing, and distribution. To consider the first stage of the procedure, will the author of the future cease writing by hand or typewriter and talk directly to the record? He does so indirectly, by talking to a stenographer or a wax cylinder; but the elements are all present if he wishes to have his talk directly produce a typed record. All he needs to do is to take advantage of existing mechanisms and to alter his language.”

(Emphasis added)


“Our present languages are not especially adapted to this sort of mechanization, it is true. It is strange that the inventors of universal languages have not seized upon the idea of producing one which better fitted the technique for transmitting and recording speech. Mechanization may yet force the issue, especially in the scientific field; whereupon scientific jargon would become still less intelligible to the layman.”

Translation: we must change how we speak so the machine can understand us.

Technology has advanced, but we still try to adapt people to software rather than the other way around. Often this is required to accommodate technological limitations, but it is also frequently done to provide a “better” way of doing things, or even out of sheer inertia.

In the meantime, we haven’t spent enough time trying to get fundamental platforms and technologies to adapt to people’s patterns. It’s time for that to change.

the ‘engagement’ trap

Continuing to connect the dots between some of what I’ve been writing about people vs. users and map vs. territory, today’s topic is “engagement” and its kin: stats that we often hear about or discuss and, in my view, frequently misuse. I’ll discuss common problems that emerge as a result, and some alternatives.

To measure how an app or service is doing, we often set our sights on, say, “engagement.” Maybe this is carefully defined, maybe it’s not; either way, the effects of reductionism are already present. People aren’t just clicks, or ‘time spent.’ “Engagement” could result from enjoyment, but it can also result, as it often does, from cheap or underhanded manipulation of our baser instincts.

As I mentioned before, software agents are ‘users’ as well, as far as software is concerned. This level of abstraction makes it easier to stop thinking of ‘users’ as people. Bots are also users, which only becomes a problem if someone chooses to care. Maybe mice could be users too. You can get them to click a button all day long any number of ways. The mice will be really “engaged.” The charts are going to look great… for the most part.


Abstractions are both useful and undeniably necessary. Overusing them, however, is dangerous. Using them incorrectly, without context, even more so. (This is along the lines of “the only thing worse than no documentation is bad documentation.”) It’s common to talk about “engagement” when what we’re really talking about is getting people to spend as much time as possible, click on as many links as possible, and post as many things as possible.

The example of using mice may sound like a cheap analogy, but I’m serious. I’m not focusing on whether triggering addictive behaviors is a good idea, or arguing about dopamine rushes or the like. That’s a worthy discussion to have, but I’m focusing on a different aspect.

Like a Turing test for sites, if you can manipulate your stats (in this case, “engagement”) by having small rodents perform a task in a box, then you are doing it wrong.

One example: you report that user ‘engagement,’ which you define as the average number of full page loads per user per month, is 10. Great. But that could be achieved any number of ways: MySpace’s signup page was at one point a crazy number of pages, which was either the result of bad design/engineering or something done to artificially inflate its pageview numbers. So maybe the user signs up, loads 10 pages in the process, and then never returns. Or maybe they sign up and then return every 3 days and read a new page. “Engagement” is the same, but the second example shows a qualitatively different result. Oh, but that’s why we measure “user retention” and “return visits”! someone may say. All too frequently, though, these three metrics aren’t cross-referenced, which again makes them meaningless, since the ‘users’ that dominate each metric may be different sets. Averages are used without looking at the standard deviation, which also makes them close to meaningless. We separate users into ‘cohorts’ that are different across different stat sets. A ‘user’ is at best an account, and while we have soft ways of extrapolating when multiple accounts are really one person, we don’t usually look at that. Bots are users too, unless they’re so well known or so high in traffic that you can’t get away with ignoring them.
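To make the “same average, different reality” problem concrete, here is a small sketch with made-up numbers: two services both report “10 page loads per user per month,” but looking at the spread and at how many people come back at all tells completely different stories.

import statistics

# Hypothetical monthly page loads per user. Both services report an
# average "engagement" of 10 pages per user per month.
service_a = [50, 50, 0, 0, 0, 0, 0, 0, 0, 0]       # two heavy users (or bots); nobody else returns
service_b = [9, 11, 10, 8, 12, 10, 9, 11, 10, 10]  # everyone comes back and reads a little

for name, loads in [("A", service_a), ("B", service_b)]:
    mean = statistics.mean(loads)
    spread = statistics.pstdev(loads)
    returning = sum(1 for x in loads if x > 0) / len(loads)
    print(f"service {name}: mean={mean:.1f}, stdev={spread:.1f}, returning={returning:.0%}")

# service A: mean=10.0, stdev=20.0, returning=20%
# service B: mean=10.0, stdev=1.1, returning=100%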

But there’s more!

When you use an abstraction like “user” it’s also easier to go down the wrong path. Getting a “user” to “discover content” by inserting “targeted paid results” sounds much better than describing how you’re getting your grandmother to unwittingly click on a piece of advertising that looks much like the real content she wanted to look at but says “advertisement” in 5-point font. While you may or may not think (like I do) that this is morally wrong, my larger point is that it is the wrong thing to do for the business too.

You’re essentially depending on people not understanding what they’re doing, or being manipulated, and therefore you’re kidding yourself. When you start thinking of motivation, you may also realize that as long as you don’t have the equivalent for your company of the card “As the CEO of Trinket Software I want to keep shareholders happy by delivering X product with Y revenue and Z profits,” you’re just kidding yourself. Making goals explicit, from the small and tactical to the large and strategic, is critical.

Even the best companies take years to refine how they analyze their business, and massive amounts of work to patch together a set of abstractions that start to reflect what the business is really like.

What’s the alternative?

“No credit for predicting rain” is always present in my mind. Ben is talking about some specific situations, and he is not saying that you always have to know the answer before you can ask the question or criticize/point out flaws. I have, however, adopted this mode of thinking when I’m going after something specific or whenever I’m questioning anything longstanding. I always try to come up with alternatives; even if I can’t lay out the alternative in detail right here, I can point in a direction. If I’m saying that X is wrong, I generally try to have at least one alternative in mind.

So in this case, among other things, I’m calling bullshit on the vast majority of simple metrics used with abandon, like “user engagement,” “time spent,” “user retention,” and “churn.” These measures require careful definition, proper parameters and boundaries, and accurate correlation to high-level goals. They require cross-referencing. They should always be labeled “handle with care: numbers in stats may be more meaningless than they appear.”

So, what, then, is a possible alternative? What is a better way? For example, while they may measure pageviews or time spent, what Amazon really cares about is sales of products. Typical metrics may be looked at in service of that (e.g., we just did a release and average pageviews are down with a high standard deviation; did we screw up the release in some geographic region?). I’m sure that if they could retain the same level of sales by serving a single page a day in some magical way, they’d take it.

In being able to point clearly at product sales, Amazon is in the minority, but my argument is that every product and service has something equivalent; even if it is less tangible and harder to define, it can be defined and quantified in one or more ways.

If you are a communications app, you may want to know if people really ‘use’ your app. But you shouldn’t care about the number of messages sent. That invents causality where there is none. Just because a message was sent doesn’t mean it was read, and it doesn’t mean it was, um, communication. Even if it was read and replied to, what qualifies as the type of “use” you envision? 5 messages per thread? 4? 100? Over what timeframe?

Is this harder to model and measure? You bet. But it’s necessary, and letting go of abstractions helps.

When you think of people and not users it’s easier to see why pageviews, clicks, “time spent,” and many other commonly discussed metrics are pretty much smoke and mirrors. Most of us already know this, and we keep using them not because we think they’re great but because they’re readily accessible, and our over-reliance on abstractions lets us get away with it.

Suppose the goal of your service is enabling group communication. You shouldn’t care about the number of messages sent, something frequently touted in press releases. This invents causality where there is none.

Regardless of number of messages, or pageviews, or clicks, or signups or any of this other stuff that is more readily measurable, what really matters is whether people are communicating or not, right?

So we can say that, for this service, ‘engagement’ = ‘frequent group communication.’ A definition of ‘person engagement’ (which would be different from ‘group engagement’) in this context could be a combination of a) frequency of participation in threads with a group of at least 2 other people (meaning, for example, correlated reply-response sequences of at least 5 messages that involve at least 3 people, including the originator) and b) frequency of thread generation, i.e., starting a conversation that calls out to others and then tapers off. If you’re looking for growth-related metrics you could look at things like the frequency of invitations to others that actually result in someone creating an account. This could be further enhanced by measuring whether the conversation more closely matches real communication patterns, like recurrent use of the names of the other people involved in the group, variance in vocabulary between participants, and many others.
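As a rough sketch of what part (a) of that definition might look like in code (the Message type and the thresholds are the hypothetical ones from the paragraph above, not anything standard), the point is simply that “engagement” becomes an explicit, inspectable function instead of a raw counter:

from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Message:
    thread_id: str
    sender: str
    sent_at: datetime

def engaged_thread_count(messages: list[Message], person: str) -> int:
    # A thread counts as real group communication for `person` if they took
    # part in it, it has at least 5 messages, and at least 3 distinct people.
    by_thread: dict[str, list[Message]] = {}
    for m in messages:
        by_thread.setdefault(m.thread_id, []).append(m)
    count = 0
    for msgs in by_thread.values():
        participants = {m.sender for m in msgs}
        if person in participants and len(msgs) >= 5 and len(participants) >= 3:
            count += 1
    return count

def person_engagement(messages: list[Message], person: str, days: int = 30) -> float:
    # Frequency of participation: qualifying threads per week over a recent window.
    cutoff = datetime.now() - timedelta(days=days)
    recent = [m for m in messages if m.sent_at >= cutoff]
    return engaged_thread_count(recent, person) / (days / 7)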

Again, people not users: they don’t just “click” or “post”, they have intention, motivation. They may be showing off, they may be trying to share their happiness or grief. They may be just shallow and obsessed and repeating whatever they think is popular. They may be trying to help. And so on. And in their motivation and intent lies a key element that will either help the product or hurt it.

One person just blindly posts photos of celebrities for two hours a day, and they have 5 “followers” and two “friends.” Another person has just two “friends” and no “followers” and sends one joke every day to the two friends, one of whom is in the hospital; they exchange a couple of LULz and check in briefly. When you think of “users,” the first person could easily look “better” on paper than the other. But when you think of people, you realize that the second person and their friends are the ones really engaging in something meaningful enabled by your service. At least for me, the first case (which would rank higher in nearly any of the common metrics) would not be nearly as important or valuable as the second group. The second group will be more loyal, their interactions with the product more meaningful and lasting, and they will be better champions for your product.

These more meaningful metrics also enable us to focus on what matters. Do you want to help group 1 or group 2? They have different costs associated, and different growth dynamics. Common reductionist abstractions would either not give you enough information, or mislead you.

And that’s something we should all want to avoid. :)

affordances matter post #96482828

Great article: How The Ballpoint Pen Killed Cursive. Ok maybe the title tries to be a bit too flashy, given the topic — plus ballpoint pens aren’t murderers… I keep thinking that if cars were invented tomorrow we’d see headlines like “How Cars Killed The Horse,” or “The Death Of The Carriages: Gas vs. Grass.” Anyway.


[…M]y own writing morphed from Palmerian script into mostly print shortly after starting college. Like most gradual changes of habit, I can’t recall exactly why this happened, although I remember the change occurred at a time when I regularly had to copy down reams of notes for mathematics and engineering lectures.

During grade school I wrote largely using a fountain pen, but as I started high school I switched to ballpoint. Not sure why, but cost probably had something to do with it. By the middle of the first year I was writing mostly print. I didn’t even notice I was doing it until our “literature” teacher started berating me, and then threatening to fail me if I didn’t write in cursive. It should be noted that the following year, when requesting books for the class to read, she scoffed at my suggestion, The Lord of the Rings. “Some fantasy garbage,” she said. Everyone laughed and moved on. So, yeah, she wasn’t very enlightened.

The result of this pressure was that by the end my handwriting was a complete mess, a print-cursive hybrid that even I had trouble reading at times. Over time I switched over to more readable print, but by then I was doing most of my writing on keyboards anyway, and that was that.

Back then I wondered why I had switched to print. My younger self decided that the reason was some form of underhanded rebellion at the backwardness of cursive (note: the nerd rebellion: that book was great, but I’ll write that book report any way I want to!). I remember thinking that writing print hurt, but I was damned if I was going to relent.

At times in later years I would occasionally wonder about that switch from cursive to print, thinking that perhaps technical drawings (drafting — hand-drawn plans for engines, houses, and the like), math, physics, etc., had played a role too. I hadn’t thought about this for years, until the article this morning. Now I’ve got a much better explanation: it wasn’t writing in print that hurt; print was in fact the least uncomfortable writing style for the new device, the ballpoint pen. From the article:

Fountain pens want to connect letters. Ballpoint pens need to be convinced to write, need to be pushed into the paper rather than merely touch it. The No.2 pencils I used for math notes weren’t much of a break either, requiring pressure similar to that of a ballpoint pen.


[…] the type of pen grip taught in contemporary grade school is the same grip that’s been used for generations, long before everyone wrote with ballpoints. However, writing with ballpoints and other modern pens requires that they be placed at a greater, more upright angle to the paper—a position that’s generally uncomfortable with a traditional pen hold. Even before computer keyboards turned so many people into carpal-tunnel sufferers, the ballpoint pen was already straining hands and wrists

As usual, there’s more than one factor at play. Drafting requires print. Working with equations and their references contributed as well. And perhaps even rebellion. But the ballpoint’s affordances were surely a big factor, perhaps the determining factor. Affordances matter.

PS: yeah, I used a semicolon. Deal with it.

