Does the Intelligence Community Really Get Hadoop?


Intelligence experts say the government needs to hire more data scientists to keep up with the big-data savvy private sector.

The intelligence community is collecting more data than ever. But does that mean the intelligence gleaned from these massive new stores of data is also getting better?

Intelligence officials are naturally a little tight-lipped about the types of capabilities at their disposal. But current and former officials agreed the intelligence community’s foray into big data – using new tools to collect, process and parse through data on a massive scale – remains a work in progress.

For one thing, the intelligence community needs more data scientists, according to John Custer, the retired Army major general and former director of intelligence for U.S. Central Command. He spoke at a panel discussion last week hosted by Nextgov and the Intelligence and National Security Alliance.

The Harvard Business Review proclaimed data scientist the “sexiest job of the 21st century,” but the government hasn’t exactly picked up on the trend.

“I know of no military service or agency” – outside of a few federally funded research and development centers – “who has any educational process or training program for data scientists," said Custer, who’s now the director of EMC’s federal division. "So we're going to buy them, plain and simple. And they're incredibly expensive. We should be training people as data scientists.”

Traditional Role of Analyst Changing

The kind of applications and tools necessary to sift through petabytes of data – one petabyte equals the storage capacity of tens of thousands of iPhones – will also require a different breed of intelligence analyst, he said.

"We're not talking about your ESPN app on your smartphone – a hundred lines of JavaScript,” he said. “We're talking about applications that have to be equal to your 'Star Trek' Universal Translator. They have to speak multiple languages, and they have to query a host of different kinds of databases. This is graduate-level work.”

One of those applications is Hadoop, open-source software designed to store and process massive datasets across clusters of commodity servers. It is used by tech giants such as Facebook and Twitter.
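
For readers unfamiliar with the idea, the processing model Hadoop popularized, MapReduce, can be sketched in a few lines. The example below is only a single-machine illustration in Python, not Hadoop's actual API (Hadoop itself is written in Java and spreads this work across many servers). The function names and the sample reports are hypothetical, invented here for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    On a real cluster this grouping happens across machines;
    here it is simulated with an in-memory dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on a list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data, made up for this sketch.
reports = [
    "convoy sighted near border",
    "border crossing closed",
    "convoy delayed",
]
counts = run_job(reports)
print(counts)
```

The point of the design is that each map call and each reduce call depends only on its own input, which is what lets a framework like Hadoop farm the calls out to many commodity servers at once.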

The National Security Agency has reportedly been an early adopter and robust user of Hadoop.

But Custer said the intelligence community as a whole has only scratched the surface of its capability – and traditional intelligence analysts are still largely left in the dark.

“Analysts have no concept of it,” he said. “None. I guarantee you can talk to 99 percent of analysts and they have no idea what Hadoop can do for them; they don't even know what Hadoop is."

NGA: We’re Ready for More Data

Ellen McCarthy, chief operating officer of the National Geospatial-Intelligence Agency, acknowledged the intelligence community “may not be keeping pace with the private sector” when it comes to big data, but suggested the percentage of analysts familiar with Hadoop and other big-data analysis tools was much higher than 1 percent.

"Not all of our analysts would be able to define Hadoop, but they certainly understand the expectation that they're going to have to operate in a different way,” she said. “And there are analysts that are absolutely pushing back. But I have to say for the most part, our analysts are very excited about the future, because most of them were hired post-9/11 and they understand that this is the world that we live in.”

NGA, which is responsible for creating and collecting geospatial intelligence, such as satellite imagery, has queried industry and academia on developing new analytic capabilities through its online “GEOINT Solutions Marketplace,” McCarthy said.

"We're an agency that wants more data, which I'll tell you maybe wasn't the case five years ago, where I think we were very afraid of the data,” she added, in part because the agency simply didn’t have the analytic capabilities to make sense of it all.

What Can the IC Learn … From Nike?

But for now, if you want to see some truly cutting-edge uses of big data in action, simply head to the mall, experts suggested.

"I look at what business does with big data and I think, although the intel community three years ago was on the cusp of understanding it, business has far surpassed it,” Custer said.

Look at Nike, he said.

"In the past … you bought your Nike gear because you thought Michael Jordan was a cool guy,” he said. “It was purely transactional. It was all about the purchase. Now, Nike offers online data platforms: pedometers, wrist-pods, music downloads to increase your performance. It's all about interaction.”

The same holds true with Amazon, which helpfully predicts customers’ next purchase by dissecting millions of data points about what previous customers bought. Or retail giant Target, which famously “knew” of a teen girl's pregnancy before her father did, based on a few telltale purchases.

"It's really important to understand that we live in a world where there is no longer just one haystack,” said John Jolly, a 20-year veteran of the defense industry who’s now a principal at NC3. “In the past, the intelligence community had a tremendous asymmetric advantage. They led in many technology areas – in imagery, in computing ... We're in a period now where that's not going to be able to happen anymore.”
