
Researchers: It Is Trivially Easy to Match Metadata to Real People

Genialbaron/Shutterstock.com

In defending the NSA's telephony metadata collection efforts, government officials have repeatedly resorted to one seemingly significant detail: This is just metadata—numbers dialed, lengths of calls. "There are no names, there’s no content in that database," President Barack Obama told Charlie Rose in June.

No names; just metadata.

New research from Stanford demonstrates the silliness of that distinction. Armed with very sparse metadata, Jonathan Mayer and Patrick Mutchler found it easy—trivially so—to figure out the identity of a caller.

Mayer and Mutchler are running an experiment with volunteers who agree to use an Android app, MetaPhone, that gives the researchers access to their metadata. Using that data, Mayer and Mutchler say, it was hardly any trouble to figure out whom the phone numbers belonged to, and they did it in just a few hours.

They write:

We randomly sampled 5,000 numbers from our crowdsourced MetaPhone data set and queried the Yelp, Google Places, and Facebook directories. With little marginal effort and just those three sources—all free and public—we matched 1,356 (27.1%) of the numbers. Specifically, there were 378 hits (7.6%) on Yelp, 684 (13.7%) on Google Places, and 618 (12.3%) on Facebook.

What about if an organization were willing to put in some manpower? To conservatively approximate human analysis, we randomly sampled 100 numbers from our dataset, then ran Google searches on each. In under an hour, we were able to associate an individual or a business with 60 of the 100 numbers. When we added in our three initial sources, we were up to 73.

How about if money were no object? We don’t have the budget or credentials to access a premium data aggregator, so we ran our 100 numbers with Intelius, a cheap consumer-oriented service. 74 matched. Between Intelius, Google search, and our three initial sources, we associated a name with 91 of the 100 numbers.
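The arithmetic in the passage above can be checked with a short sketch. The counts are the ones the researchers quote; the function names and the toy phone numbers in the usage note are hypothetical, and nothing here queries Yelp, Google Places, or Facebook. It only illustrates how per-source hits combine into a single "matched in any source" figure.

```python
# Figures quoted from the researchers' write-up.
SAMPLE_SIZE = 5000
HITS = {"Yelp": 378, "Google Places": 684, "Facebook": 618}
UNIQUE_MATCHES = 1356


def rate(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def duplicate_hits(hits, unique_matches):
    """Count hits beyond the first for numbers found in several sources."""
    return sum(hits.values()) - unique_matches


def match_any(numbers, directories):
    """Return the numbers that appear in at least one directory.

    `directories` maps a source name to a set of known numbers.
    This is an assumed, simplified stand-in for a real lookup.
    """
    known = set().union(*directories.values())
    return {n for n in numbers if n in known}


if __name__ == "__main__":
    for source, count in HITS.items():
        print(f"{source}: {rate(count, SAMPLE_SIZE):.2f}%")
    print(f"Any source: {rate(UNIQUE_MATCHES, SAMPLE_SIZE):.2f}%")
    print(f"Duplicate hits: {duplicate_hits(HITS, UNIQUE_MATCHES)}")
```

The three per-source counts sum to 1,680, which is more than the 1,356 unique matches, so some numbers turned up in more than one directory. With made-up data, `match_any({"555-0100", "555-0101"}, {"a": {"555-0100"}, "b": {"555-0100"}})` returns only `{"555-0100"}`: a number listed in two sources is still one match.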

Their results weren't perfect (and they note that the Intelius data was particularly spotty), but they didn't even try all that hard. "If a few academic researchers can get this far this quickly, it’s difficult to believe the NSA would have any trouble identifying the overwhelming majority of American phone numbers," they conclude.

It's also difficult to believe they wouldn't try. As federal district judge Richard Leon wrote in his decision last week, "There is also nothing stopping the Government from skipping the [National Security Letter] step altogether and using public databases or any of its other vast resources to match phone numbers with subscribers."

