Big data meets interested government

Intelligence agencies are increasingly looking beyond the satellite photos and secret reports they have traditionally relied on for insight into U.S. adversaries' actions and are turning to data-crunching algorithms that can sift through massive piles of disparate information, such as GPS reports, social media posts and online images, said Amr Awadallah, co-founder and chief technology officer of Cloudera, a vendor that maintains and manages big data systems.

While intelligence agencies are the first government entities to mine big data -- data sets too large to be analyzed by desktop analytical tools -- they're unlikely to be the last, Awadallah said.

Agencies managing Social Security, Medicare and Medicaid, for instance, could analyze big data to spot trends in fraud and abuse, and the Transportation Department could crunch through satellite images to get a better sense of traffic patterns on interstate highways.

Cloudera's federal customers include the CIA and the National Security Agency. "I can't talk about what those projects are, but you can imagine how much data they have and what type of things they could be doing with it," Awadallah said.

The CIA also invested indirectly in Cloudera through In-Q-Tel, an independent, nonprofit venture capital firm that was started at the spy agency's request and describes its mission as delivering useful technology to the agency.

Awadallah spoke with Nextgov on the sidelines of the Government Big Data Forum that vendor Carahsoft Technology sponsored on March 6.

At the root of most big data crunching systems is the open source software Apache Hadoop. Its first major innovation is the ability to link together multiple computers and servers, either in a proprietary data center or in a computing cloud, and make them work like one huge computer that can scale up for a major task.

The software's second major innovation is the ability to sort through unstructured data, such as all posts under a particular Twitter hashtag or emails containing a particular word or phrase, as well as through more structured data such as spreadsheets.
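To make those two ideas concrete, here is a minimal sketch in plain Python of the split-then-combine pattern that Hadoop is known for. It is not Hadoop's API and runs on a single machine: the sample posts, the function names and the number of chunks are all assumptions made up for illustration. Each chunk stands in for the share of data one computer would process, and the partial hashtag counts are merged at the end.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical pattern for pulling hashtags out of free-form text.
HASHTAG = re.compile(r"#(\w+)")

def map_chunk(posts):
    """Count hashtags in one chunk of posts (one machine's share of the work)."""
    counts = Counter()
    for post in posts:
        counts.update(tag.lower() for tag in HASHTAG.findall(post))
    return counts

def split(items, n_chunks):
    """Deal items into n_chunks lists of roughly equal size."""
    return [items[i::n_chunks] for i in range(n_chunks)]

def count_hashtags(posts, n_chunks=3):
    """Process each chunk independently, then merge the partial counts."""
    partials = [map_chunk(chunk) for chunk in split(posts, n_chunks)]
    return reduce(lambda a, b: a + b, partials, Counter())

# Invented sample data standing in for unstructured social media posts.
posts = [
    "Traffic is awful on I-95 #commute #traffic",
    "New satellite images released #BigData",
    "Crunching logs all night #bigdata #hadoop",
    "No tags here",
]
totals = count_hashtags(posts)
```

Because each chunk is counted without reference to the others, the result does not depend on how many chunks the data is split into, which is what lets this kind of job spread across more machines as the data grows.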

"The old way of collecting data was to only collect it . . . when a human generates it," Awadallah said, such as by making a purchase or filling out a survey.

"We called that an explicit transaction," he said. "Now we're collecting implicit information. We have all these sensors around humans in mobile devices and satellites taking images and there are Web services collecting information about you all the time nonstop."

The classic example of big data in the private sector is when Google, Facebook or another site mines a user's search history, network of contacts and profile information to micro-target the advertisements she's most likely to click on.

Big data can be used in other commercial ways, though, that have nothing to do with Web activity.

The company Skybox Imaging, for example, has made a business out of sorting through satellite data to deliver commercial intelligence on demand, according to Awadallah.

"So [for example] you can buy a little stream from them that gives you a measure of how many cars are parked at Home Depot in different locations across the country," he said. "If you're a competitor of Home Depot's or if you're a financial analyst who's trying to predict the quarterly earnings of Home Depot that's very valuable information."
