The slow implosion of a searchable web

Could it be that the last thirty years have been a Goldilocks-like ‘sweet spot’ between a world where information was too difficult to access, and a world where it has become impossible to locate any information at all?

To understand why that might be the case, we need to look back at the origins of our modern culture of information: there we sowed the wind, and now we can feel the first great gusts of the whirlwind that could soon make our ‘Information Age’ seem like a lost golden age.

But first – to the Time Machine!

Thirty years ago this month, at SIGGRAPH, the big computer graphics conference, I sat down for a demo before an expensive computer workstation. I moused around idly on its screen, noticing that when I clicked on some text underlined in blue – BAM! – another page loaded. I’d traversed a hyperlink, the basic element of all hypertext systems – a sort of informational wormhole directly connecting two pages of information.

I knew all about hypertext. I’d been fascinated by it ever since reading Ted Nelson’s amazingly prophetic Literary Machines, which described a universal, hyperlinked library of all knowledge, and his own effort – Project Xanadu – to bring that dream into reality. I’d even built a Xanadu ‘browser’ of my own for the Macintosh. Yes, I knew all about hypertext. I also knew that Xanadu had never made it over the line from dream to real, shipping software – a failed dream that left hypertext languishing across a dozen unsuccessful commercial products, none of which interoperated with one another. Oh, how I longed for a single solution to link them all…

That single solution turned out to be exactly what I’d just experienced at SIGGRAPH: a brand-new hypertext system styling itself the World Wide Web. I have to admit my response was fairly lukewarm; I clicked around just long enough to see that while the World Wide Web did all the hypertext-y things, there wasn’t much on offer. No there there. Yeah, whatevs. I thought very little of it, and moved on to the next demo.

A month later, one of my email newsletters arrived with a weird bit of text embedded in one of its articles, something that began with http://then-continued-on-with-a-lot-of-unrecognisable-stuff.html. This, the newsletter assured me, was a hyperlink to more information about the topic – a link accessible through that World Wide Web demo I’d dismissed as a toy. Fine, I thought, as I downloaded the ‘browser’ software to my computer and fired it up: Let’s see how all this works. I was stunned by what I found – in just a month, the World Wide Web had hit an inflection point, entering an extended period of exponential growth.

Finally – a global-scale hypertext system! I can’t tell you how much I bored my friends with my excitement as I rabbited on and on about ‘the Web’. I made it my personal goal to get them all set up with the tools they needed to browse the Web – then built my first Website. As far as we can determine, mine may have been the very first Website running out of a server in someone’s home – all the others were running at universities or big commercial organisations.

Once I had launched my site, I wanted folks to visit it. But how? Back in the middle of 1993, the only way to attract visitors was to put yourself on the list of sites maintained at CERN (the birthplace of the Web) or at UIUC (birthplace of NCSA Mosaic, the browser that popularised the Web). I remember coming across that list on UIUC’s website – a comprehensive list of every known site on the Web.

I spent a week, that October, working my way through that list, methodically visiting every site. By the end of the week, I’d finished. I’d surfed the entirety of the World Wide Web. I submitted my own site to that list, so others could find it – and as October turned into November and December I checked that list almost every day, doing my best to visit all the new sites on the Web as they appeared. But the whole thing kept accelerating. Every day saw a few more sites launch than the day before, until (as 1993 turned to 1994) the list simply became too big to maintain – or to surf. The folks behind that list gave it up.

Now, how was I to find out what was on the Web?

Fortunately, at just this moment a website came along to address the problem. “Jerry and Dave’s Guide to the World Wide Web” offered a library-catalogue-like experience: a sensible, detailed taxonomy of subject areas. Click through subjects, sub-categories, sub-sub-categories – even sub-sub-sub-categories – and find just that website about gardening in the steppes of Central Asia, or archaeology in the Pacific Northwest, or the hydrostatics of a supernova explosion, or, or, or… If it was on the Web, it was in Jerry and Dave’s guide, and the guide became incredibly popular. So popular that it soon left its home on a Stanford University computer and went commercial – as Yahoo!

For pretty much all of 1994, Yahoo! was more than good enough to help anyone find anything they needed on the Web. Even as the number of sites mushroomed into the thousands, then tens of thousands, the taxonomy of Yahoo! held up. But there comes a point when ferreting your way down into a sub-sub-sub-sub-sub-category of knowledge carries too much cognitive load – when the travail of finding what you need on the Web outweighs the value of the information sought. Yahoo! had its own ‘Goldilocks moment’ – when the Web was big enough to need a catalogue, but not yet too big to catalogue – and when that moment passed, Yahoo! quickly became pointless.

Fortunately, another technological solution emerged to meet the needs of an ever-growing population of ‘Web surfers’: AltaVista, the Web’s first proper ‘search engine’. A search engine digests the content of the Web, reading every page and organising what it finds into a massive, very high-speed database. If you typed a reasonable topic or description into AltaVista, chances were it could find you the right page. That’s how it worked at the beginning, for AltaVista thrived in another ‘sweet spot’: when the Web was large enough to be worth indexing and searching, but before it became so big, so unwieldy, that any search would turn up pages and pages of hits, few of them on target.
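Under the hood, that ‘digestion’ rests on an inverted index: a map from every word to the set of pages containing it, so that a query becomes a fast lookup rather than a scan of the whole Web. Here’s a minimal sketch in Python – the pages, URLs, and the simple match-every-word semantics are all invented for illustration:

```python
from collections import defaultdict

# A toy corpus standing in for crawled pages: URL -> page text.
# (These pages and URLs are invented for illustration.)
pages = {
    "http://example.org/gardening": "gardening in the steppes of central asia",
    "http://example.org/archaeology": "archaeology in the pacific northwest",
    "http://example.org/supernova": "the hydrostatics of a supernova explosion",
}

# Build the inverted index: each word maps to the set of URLs containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

def search(query: str) -> set[str]:
    """Return the URLs containing every word of the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

print(search("supernova hydrostatics"))  # -> {'http://example.org/supernova'}
```

AltaVista’s real index was vastly larger and cleverer, but the shape of the machinery – crawl, tokenise, look up – was essentially this.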

As the Web kept growing, it outgrew every attempt to make it apprehensible – lists, then taxonomies, then search. Each approach worked at a particular scale, and each eventually failed, because the Web refused to stop growing long enough to be made manageable.

Then, along came Google.

The key innovation powering Google – developed by co-founders Larry Page and Sergey Brin – leveraged the hyperlinks pointing into a web page. The greater the number of inbound hyperlinks, and the more highly ranked the pages those links came from, the more likely the page held content with high relevance to a search. They called the measure PageRank.

That, as it turns out, was the beginning of the end of the Information Age.
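In rough outline, the idea can be sketched in a few lines of Python. The link graph below is invented, and this bare power-iteration version of PageRank omits the many refinements Google actually shipped:

```python
# A toy link graph: page -> pages it links out to. (Invented for illustration.)
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],  # another inbound link for C, lifting C's score
}

def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank: each page's score flows to the pages it links to."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

print(pagerank(links))  # C, collecting the most inbound links, earns the top score
```

Run for enough iterations, the scores settle into a stable ranking – exactly the property that made the measure so useful, and so tempting to game.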

By 1998, when Google search launched, the Web was already well into its “1.0” era of startups and eye-watering IPOs from firms like Yahoo! (which did its best to parlay its fading popularity as a directory into success as a one-stop shop for all sorts of products and services), eBay, and Amazon. The Web clearly had a future as an enabler and amplifier of commerce – if its millions of daily users could find you out there. Web advertising took off, as cashed-up startups put huge marketing dollars behind their efforts to make their Websites well-known destinations. Marketers began to study the operation of Google’s page-ranking algorithm, leading to another innovation: ‘Search Engine Optimisation’, or SEO.

Since Google ranked sites based on inbound hyperlinks, SEO providers would generate a hundred – or a hundred thousand – inbound links, generally from ‘spam’ pages and websites, wait for Google’s crawler to ingest them, and watch their customers’ page rankings rise. Not wanting to go the way of Yahoo! and AltaVista, Google fought back, continuously refining its algorithm, playing a game of whack-a-mole with the SEO providers.
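The attack is easy to simulate with the toy pagerank function sketched above: bolt a link farm onto the graph, and the target’s score climbs even though the target page itself hasn’t changed at all. (Again, purely illustrative.)

```python
# Reuses `links` and `pagerank` from the sketch above.
spam_links = dict(links)
for i in range(100):
    spam_links[f"spam{i}"] = ["D"]  # throwaway pages whose only job is to link to D

print(pagerank(links)["D"])       # D's organic score
print(pagerank(spam_links)["D"])  # D's score after the link farm: several times higher
```

Google’s countermeasures – devaluing links from low-quality pages, penalising known farms – amount to guessing which edges of the graph are lies.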

Every move in that game has weakened search. SEO adds noise to the Web; Google does its best to filter out that noise, but no filter is ever perfect. Nor – in an environment where each side constantly seeks advantage over the other – will any solution work for very long. Each side keeps honing its skills, which means each keeps helping its adversary hone theirs.

What worked well for Google back in 1998 wouldn’t work at all a quarter of a century later. The algorithms developed through this commercial warfare have gradually degraded the quality of Google’s results, as the whole Web slowly fills with noise – driven by the incentive of billions of dollars in consumer clicks. Google knows this, and despite throwing thousands of PhDs at the problem, it cannot stop the signal from eventually vanishing into that rising noise. The firm has done an admirable job of slowing the decline – an existential threat has given a notoriously dilatory business a rare shaft of clarity. But it’s becoming increasingly clear that search as we know it – the way we find things within a Web grown vast beyond any possibility of knowing more than a tiny part of it – will stop working.

And with that, the Web will become unknowable. This brief moment when we could gather all human knowledge – and find it – will have passed. We will be left with a pool of data so polluted by the intersection of late capitalism and information theory that it will be impossible to detect any signal amidst all the noise.

That will mark the dawn of humanity’s ‘Post-Information Era’. And it’s nearly here.
