Despite its reputation for illegal activity, much of what goes on underneath the surface of the internet is legit.
Drugs, child porn, hitmen for hire: The dark web doesn’t have a particularly glowing reputation. When the leader of an illegal-drug marketplace called the Silk Road was arrested in 2013, many were introduced to the hidden side of the internet for the first time as the home of the notorious “eBay for drugs.”
But because of how the dark web is organized—or rather, how it isn’t—it’s nearly impossible to determine the kinds of sites that populate it. By definition, the dark web isn’t searchable by engines like Google, nor is it accessible by normal browsers like Chrome or Safari. Most of the coverage about the dark web—including some of mine—focuses on the strange and unusual: marketplaces for stolen Netflix credentials, personal data and even paywalled academic research.
But the dark web and Tor, the privacy-preserving browser used to access hidden sites—called onions—have myriad legitimate uses, too.
New research from Terbium Labs, a company that analyzes the dark web, took a small snapshot of onion sites—a random collection of 400 sites its web-crawling robots had found in the course of one day in August—and divided up the sites based on their purpose and content. The results suggest less than half of what goes on beyond the reach of search engines and traditional browsers is illegal.
The rest, it turns out, is legitimate, made up of dark-web mirrors of websites like Facebook and ProPublica, websites for companies and political parties, and forums for chatting about technology, games, privacy, or even erectile dysfunction. In addition, as on the “clear” internet, a good chunk of the webpages Terbium studied hosted legal pornography: photos, videos, and written material, available for free or for sale.
“Anonymity does not equate criminality, merely a desire for privacy,” wrote Clare Gollnick and Emily Wilson, the authors of the Terbium study.
That’s not to say the slice of the dark web the analysis looked at didn’t have its fair share of unsavory and illegal material. More than 15 percent of the sample was made up of sites selling drugs or pharmaceuticals; pages with content related to hacking, fraud, illegal porn, or terrorism turned up at least once. (Many of these same categories of websites can be found on the clear web, too, hosted on sketchy domains or protected by passwords.)
Terbium obtained the sample for the study with the same scanner that powers Matchlight, a service I’ve written about before that crawls shady forums and marketplaces on the dark web and the clear web in search of stolen sensitive or personal data.
“We’re trying to index what people don’t want indexed,” Danny Rogers, Terbium’s CEO, told me earlier this year. “There’s no desire to make things easy to find. Fundamentally, it’s a more hostile environment to crawl.”
That makes studying the dark web really difficult. Terbium’s researchers acknowledged that their study is limited by its scope and the decisions that human analysts made in categorizing the websites the crawler came across. To complicate matters further, the dark web is notoriously transient, with some sites disappearing after only hours. Just a week passed between the day Terbium took its dark-web snapshot and the day its analysts began categorizing the sites, but some of the 18 percent of URLs that the analysts found to be non-functioning may have gone down even in that short time.
The Terbium study only examined the makeup of a small chunk of the dark web—not the traffic patterns that show where users actually tend to visit. A 2014 study determined that more than 80 percent of visits to the dark web were to child-porn sites. But as always, it’s hard to know just who those visitors were: That traffic could have come from bots, law-enforcement agencies investigating illegal porn sites, or even a cyberattack directed at the sites.
What’s more, hidden sites like the ones Terbium studied probably make up only a tiny part of the internet as a whole. Estimates from Tor itself show the number of hidden sites has hovered between 5,000 and 6,000 over the last year, and that only about 2 percent of traffic on the Tor network goes to them.
Most people who use Tor use it to browse the clear web while preserving their privacy and anonymity. The network allows users to visit a normal website without revealing the types of information often used to track people as they make their way across the internet. Tor’s privacy-preserving features can be particularly vital to journalists, human-rights advocates, or political dissidents operating in countries hostile to their work.