Big data meets interested government

Intelligence agencies increasingly are looking beyond the satellite photos and secret reports upon which they've traditionally relied for insight into U.S. adversaries' actions and are turning to data-crunching algorithms that can sift through massive piles of disparate information, such as GPS reports, social media posts and online images, said Amr Awadallah, co-founder and chief technology officer of Cloudera, a vendor that maintains and manages big data systems.

While intelligence agencies are the first government entities to mine big data -- data sets too large to be analyzed by desktop analytical tools -- they're unlikely to be the last, Awadallah said.

Agencies managing Social Security, Medicare and Medicaid, for instance, could analyze big data to spot trends in fraud and abuse, and the Transportation Department could crunch through satellite images to get a better sense of traffic patterns on interstate highways.

Cloudera's federal customers include the CIA and the National Security Agency. "I can't talk about what those projects are, but you can imagine how much data they have and what type of things they could be doing with it," he said.

The CIA also indirectly invested in Cloudera through In-Q-Tel, an independent, nonprofit venture capital firm that was started at the spy agency's request and describes its mission as delivering useful technology to the agency.

Awadallah spoke with Nextgov on the sidelines of the Government Big Data Forum that vendor Carahsoft Technology sponsored on March 6.

At the root of most big data crunching systems is the open source software Apache Hadoop. Its first major innovation is the ability to link together multiple computers and servers, either in a proprietary data center or in a computer cloud, and make them work like one huge computer that can scale up for a major task.

The software's second major innovation is the ability to sort through unstructured data, such as all posts under a particular Twitter hashtag or emails containing a particular word or phrase, as well as through more structured data such as spreadsheets.
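For readers curious how that kind of sorting works in practice, the sketch below illustrates the map-and-reduce pattern that Hadoop's processing model is built around, applied to counting posts per hashtag. It is a minimal single-machine illustration in plain Python, not Hadoop's own API; the sample posts and the function names (`map_post`, `shuffle`, `reduce_counts`, `count_hashtags`) are hypothetical and were chosen only for this example.

```python
from collections import defaultdict

# Hypothetical sample posts. On a real cluster the input would be split
# across many machines, each running the map step on its own share.
POSTS = [
    "Traffic is terrible on I-95 today #commute",
    "New satellite images released #bigdata",
    "Crunching logs all night #bigdata #hadoop",
    "Stuck again #commute",
]


def map_post(post):
    """Map step: emit a (hashtag, 1) pair for every hashtag in one post."""
    return [(word.lower(), 1) for word in post.split() if word.startswith("#")]


def shuffle(pairs):
    """Shuffle step: group the emitted values by hashtag."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: add up the values collected for each hashtag."""
    return {key: sum(values) for key, values in grouped.items()}


def count_hashtags(posts):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = [pair for post in posts for pair in map_post(post)]
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    print(count_hashtags(POSTS))
```

Because each post is mapped independently and each hashtag is reduced independently, both steps can in principle be spread across as many machines as the job requires, which is the scaling property described above.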

"The old way of collecting data was to only collect it . . . when a human generates it," Awadallah said, such as by making a purchase or filling out a survey.

"We called that an explicit transaction," he said. "Now we're collecting implicit information. We have all these sensors around humans in mobile devices and satellites taking images and there are Web services collecting information about you all the time nonstop."

The classic example of big data in the private sector is when Google, Facebook or another site mines through a user's search history, network of contacts and profile information to micro-target the advertisements she's most likely to click on.

Big data can be used in other commercial ways, though, that have nothing to do with Web activity.

The company Skybox Imaging, for example, has made a business out of sorting through satellite data to deliver commercial intelligence on demand, according to Awadallah.

"So [for example] you can buy a little stream from them that gives you a measure of how many cars are parked at Home Depot in different locations across the country," he said. "If you're a competitor of Home Depot's or if you're a financial analyst who's trying to predict the quarterly earnings of Home Depot, that's very valuable information."
