Yahoo Directory: When the Web Was Curated by Hand
In January 1994, two Stanford graduate students named Jerry Yang and David Filo started maintaining a list of websites they found interesting. They organised the list by category. They added new sites as they found them. They called it “Jerry and David’s Guide to the World Wide Web.” Within months they renamed it Yahoo, and the directory they were maintaining became, for a time, the front door to the web.
The Yahoo Directory survived in some form until 2014, when Yahoo finally retired it, twenty years after its creation. The arc of its rise and decline is one of the most instructive stories in internet history because it represents a road not taken: the version of the web where humans, rather than algorithms, decided what was worth finding.
The original problem
The web in 1994 had perhaps 2,500-3,000 sites. By the end of 1995 it had something like 23,000. By the end of 1996 it was over 100,000. That is more than a thirty-fold increase in about two years, and the discovery problem it created was severe. There was no Google. AltaVista didn’t exist until December 1995. The early search engines that did exist (WebCrawler, Lycos, InfoSeek) returned results that were technically relevant but often unhelpful because the sites themselves hadn’t yet developed the structures that made search useful.
Yahoo’s solution was elegant in its simplicity: pay humans to read websites and put them in categories. The directory tree was organised hierarchically — Arts, Business, Computers, Entertainment, Government, Health, News, Recreation, Reference, Regional, Science, Social Science, Society — with subcategories that drilled down into increasingly specific topics. Each entry in the directory was placed by a human editor who had presumably looked at the site and made a judgment about where it belonged.
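To make the structure concrete, here is a minimal sketch of a hierarchical directory in Python. It is an illustration only, not a description of how Yahoo actually stored its data: the `Category` class, the `place` and `browse` methods, the subcategory names and the example URLs are all assumptions made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a hypothetical directory tree."""
    name: str
    children: dict = field(default_factory=dict)   # subcategory name -> Category
    listings: list = field(default_factory=list)   # (url, description) pairs

    def place(self, path, url, description):
        """File a site under the category at `path`, creating nodes as needed.

        This stands in for the editor's decision about where a site belongs.
        """
        node = self
        for part in path:
            if part not in node.children:
                node.children[part] = Category(part)
            node = node.children[part]
        node.listings.append((url, description))

    def browse(self, path):
        """Return (subcategory names, listings) for the category at `path`."""
        node = self
        for part in path:
            node = node.children[part]
        return sorted(node.children), node.listings


# Illustrative data only: these subcategories and sites are invented.
root = Category("Directory")
root.place(["Entertainment", "Music", "Labels"],
           "http://label.example/", "An independent record label")
root.place(["Entertainment", "Music", "Magazines"],
           "http://zine.example/", "A music magazine")

subcategories, listings = root.browse(["Entertainment", "Music"])
print(subcategories)  # the subcategories a visitor could drill into next
```

The point of the sketch is that browsing returns the shape of a topic (its subcategories) as well as individual sites, which is the part that depended on a human deciding where each site went.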
For roughly four years, this worked. Yahoo was the most visited site on the web for substantial portions of that period. Getting listed in the Yahoo Directory was a meaningful business event for the early commercial web — sites would announce their inclusion in press releases.
The editorial economics
The editorial operation Yahoo built was substantial. At its peak Yahoo employed something on the order of 150-200 directory editors, called “surfers” internally, organised into teams that handled different parts of the category tree. Each editor was responsible for evaluating site submissions, placing accepted sites in appropriate categories, writing brief descriptions, and pruning broken or irrelevant listings.
The category tree had over 100,000 categories and subcategories at peak. The directory itself listed several million sites. The editorial workload was enormous and the labour cost was material — by the late 1990s, the directory editorial operation was reportedly costing Yahoo tens of millions of dollars annually.
The economic model was advertising-supported, with category-level sponsorships and banner advertising adjacent to listings. For a few years this worked because Yahoo’s traffic was so dominant that the advertising rates supported the editorial costs and produced significant operating margin. The model started to break down when search engines began capturing the navigation behaviour that had previously gone through directory browsing.
The submission process
Getting your site listed in the Yahoo Directory was a process that, in retrospect, feels charmingly archaic. You’d visit a submission form, select the category you thought your site belonged in, fill in the URL and a brief description, and submit. Then you’d wait. The waiting could take weeks or months depending on the category’s editorial backlog.
If your site was accepted, it would appear in the category. If it was rejected, you might or might not be told why. Sometimes the editor decided the site belonged in a different category and moved it there without notification. Some submissions were rejected because the editor decided the site wasn’t of sufficient quality, which was a judgment call with no formal appeals process.
The introduction of paid expedited submission in 1998 was a significant moment. For $199 (later increased), you could pay to have your site reviewed within seven business days rather than waiting in the free queue. Critics argued this turned the directory from a curated resource into a paid placement scheme. Defenders argued the paid expedited review still required editorial approval and that the fee just reflected the editorial cost of timely review.
The truth was somewhere in between. The directory’s editorial integrity gradually eroded as commercial pressure increased, and the credibility advantage that human curation had over algorithmic search dwindled as Google’s PageRank approach demonstrated that algorithms could surface useful results at scale.
What Google broke
Google launched in September 1998 and changed the discovery problem fundamentally. PageRank’s insight was that the link structure of the web itself contained information about which pages were valuable. By treating each link as a vote, Google could rank pages without requiring human editorial judgment. The rankings were better than anything algorithm-based that had come before, and they scaled in ways that human editorial operations could never match.
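The "link as a vote" idea can be sketched in a few lines of Python. This is a simplified textbook-style power iteration, not Google's production algorithm; the function name, the damping value of 0.85, the iteration count and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a small link graph.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A made-up four-page web: three pages link to the hub, the hub links to one.
toy_web = {
    "a.example": ["hub.example"],
    "b.example": ["hub.example"],
    "c.example": ["hub.example", "a.example"],
    "hub.example": ["a.example"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page that collects the most links ends up ranked highest, and no editor had to look at any of the pages, which is the scaling property the directory model could not match.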
Yahoo Directory’s value proposition was that human judgment produced better results than algorithms. Once that proposition stopped being true — and Google made it stop being true within a couple of years — the directory’s strategic position became untenable. Yahoo even partnered with Google for search results between 2000 and 2004, which was an odd interim period where the company was using its competitor’s technology to deliver the search experience while still maintaining its own directory.
The New York Times ran a number of pieces across the early 2000s tracking Yahoo’s strategic confusion during this period. The directory was kept alive long after it had stopped being commercially relevant because nobody at Yahoo was willing to pull the plug on the original product that had built the company.
The DMOZ alternative
The Open Directory Project, known as DMOZ, was launched in 1998 as a volunteer-edited alternative to commercial directories. It used the same kind of hierarchical category structure as Yahoo Directory but relied on volunteer editors rather than paid staff. At various points DMOZ was used as the directory backbone for Google itself, AOL, and dozens of other search portals.
DMOZ survived until 2017, longer than Yahoo Directory, and its data was preserved in mirror form by various archival projects after AOL closed it down. The DMOZ approach had its own problems — volunteer editor cliques, slow processing, accusations of bias and gatekeeping — but it represented an interesting parallel experiment in what human-curated web organisation could look like outside a commercial context.
What was actually lost
The closing of Yahoo Directory in 2014 felt at the time like an overdue acknowledgement of something that had stopped being relevant years earlier. Looking back, the loss was more subtle than it seemed.
Algorithmic search optimised for relevance to specific queries. Directory browsing optimised for serendipitous discovery within a topic area. These are different cognitive activities. The disappearance of the directory model means there isn’t really a way to browse the web by topic in 2026 the way you could in 1998. You can search for “Australian indie record labels” and get specific answers. You can’t easily browse the entire space of “music industry sites” the way you could when Yahoo Directory had a category for it.
The Internet Archive’s preservation work has captured snapshots of the directory structure across its history, which is valuable both as historical record and as a reminder of how the web’s organisation has shifted from intentional curation to algorithmic surfacing.
The legacy in unlikely places
Some of the directory model’s DNA persists in places you might not expect. Wikipedia’s category system inherits some of the hierarchical organisation logic. Reddit’s subreddit structure functions as a kind of distributed directory. Even the broad concept of “lists of resources” maintained by enthusiasts on GitHub repositories — the awesome-X repos that aggregate links to projects in a topic area — is a return to manual curation in a smaller, more focused form.
The Yahoo Directory project demonstrated that human curation can work at small to medium scales, struggles at large scales, and breaks down economically when the alternative is “free.” The web ended up with the free alternative. Whether that was the right outcome depends on what you weight in your evaluation of how the modern web works.
For the people who navigated the early web through Yahoo’s category tree, the experience left a certain residue. There was a sense that the web was a place you could understand the shape of, even if you couldn’t see all of it. That sense is gone now, and probably won’t return.