Rooted in DARPA research, the tool shines light on the darkest corners of the internet.
Search the phrase “sex trafficking” on Google, and your first result is a link to the national human trafficking hotline. Scroll down and you’ll find stories about convicted traffickers, explainer pieces on modern-day slavery and loads of anti-trafficking advocacy groups.
What you won’t see are the sites where humans are being bought and sold. That’s because Google’s search engine isn’t optimized to show those platforms, and the vast majority reside on the “deep web,” the unindexed internet that search engines can’t access.
But using a machine-learning tool with roots in military research, investigators can quickly scour the internet for those unsavory marketplaces and work to shut them down.
Giant Oak Search Technology, or GOST, allows law enforcement agencies to build custom search engines that optimize results for different types of nefarious activity and pull information directly from those underground marketplaces. The tool also works for sites related to drug trafficking, money laundering, smuggling, terrorism and virtually any other type of illegal behavior.
Criminals often go to great lengths to hide their identities online, but it’s much tougher to fake specific patterns of behavior, said Gary Shiffman, founder and CEO of Giant Oak, the company that developed GOST. By looking for certain behavioral red flags, the tool can pinpoint individual sites and users engaged in illicit activities and highlight those findings in its search results.
“As long as there’s some sort of a data trail people leave behind … GOST can be used,” Shiffman said. “It brings whatever you’re interested in to the top, and all of our customers are interested in illicit activity.”
Today, those customers include commercial banks, the Homeland Security Department and a handful of other government agencies, Shiffman told Nextgov. Using GOST, investigators are uncovering online criminals who might’ve otherwise flown under the radar “every day of the week,” he said.
GOST is currently a fully commercial product, but its origins can be traced back to an initiative at the Defense Advanced Research Projects Agency.
DARPA contracted Giant Oak for a handful of initiatives over the years, but GOST is most directly linked to the Memex program, an effort to build internet search technologies specially tuned for different categories. The program concluded in 2017, but Shiffman said custom indexing and search technologies like GOST wouldn’t be where they are today without DARPA’s initial investment.
When you look something up on a traditional search engine like Google, the platform uses algorithms to sort through the sites it has indexed and bring the most relevant results to the top. The engine then takes note of the links you click and uses that information to optimize future searches. Because Google’s algorithm is trained by some 3.5 billion searches every day, you almost always find what you’re looking for within the first few pages of results.
Most people stick to the top 10 or 20 links, but there are millions of results they never see. Those lower-ranked or altogether unindexed sites are where most illegal activity takes place. And using a combination of behavioral science and machine learning, GOST can shed light on those hidden corners of the web.
“We have behavioral and computer science working together—that’s so important and so valuable,” said Shiffman. “Human traffickers are humans. Money launderers are humans. You need to have that behavioral science component in the technology build to really make sense of that data.”
Then, depending on the links an investigator clicks, GOST further refines its algorithms. That means the more it’s used, the better its results become. Different crimes correlate to different behaviors, so users can tell the engine to optimize for whatever specific activity they’re investigating, said Shiffman.
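For readers curious about what click-driven refinement can look like in practice, here is a minimal Python sketch of the general idea. It is not GOST's actual method, which the company has not described in detail. The class name FeedbackRanker, the seed terms, the learning rate and the sample listings are all hypothetical, chosen only for illustration.

```python
class FeedbackRanker:
    """Toy ranker (hypothetical): scores text by summed term weights and
    raises the weight of every term found in a result the user clicks."""

    def __init__(self, seed_terms, learning_rate=0.5):
        # Seed terms stand in for the activity an investigator cares about.
        self.weights = {term: 1.0 for term in seed_terms}
        self.learning_rate = learning_rate

    def score(self, text):
        # Sum the weights of the distinct terms in the text.
        return sum(self.weights.get(t, 0.0) for t in set(text.lower().split()))

    def rank(self, docs):
        # Highest score first; ties keep their original order.
        return sorted(docs, key=self.score, reverse=True)

    def record_click(self, text):
        # A click is treated as a signal that this result's terms matter.
        for t in set(text.lower().split()):
            self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate


# Made-up listings, purely for illustration.
docs = [
    "community bake sale this weekend",
    "wire transfer cash only no questions",
    "cash discount on garden tools",
]
ranker = FeedbackRanker(seed_terms=["cash"])
ranker.record_click(docs[1])  # the investigator opens the suspicious listing

# A new listing with no seed term now outranks the garden-tools ad,
# because it shares terms with the clicked result.
new_doc = "wire transfer accepted no questions"
for doc in ranker.rank(docs + [new_doc]):
    print(round(ranker.score(doc), 1), doc)
```

In this sketch, a single click is enough to pull a related listing above an unrelated one that merely contains the seed term. A production system would draw on far richer behavioral signals than word overlap.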
“We’re reorganizing the internet and we’re giving you the results you want,” he said. “We live in a world now where you should never settle for general purpose [internet]. You should always have a custom solution because you can do it instantly and cheaply.”
Editor's note: This article was updated to clarify how the tool works.