Federal agencies, thanks to their unique missions, have long been collectors of valuable, vital and, no doubt, arcane data. Under a nearly two-year-old executive order from President Barack Obama, agencies are releasing more of this data than ever before, in machine-readable formats, to the public and entrepreneurs.
But agencies still need a little help parsing this data for their own purposes. They are turning to industry, academia and outside researchers for cutting-edge analytics tools to derive insights from their data and use those insights to drive decision-making.
Take the U.S. Agency for International Development, for example. The agency administers U.S. foreign aid programs aimed at ending extreme poverty and helping support democratic societies around the globe.
Under the agency’s own recent open data policy, it’s started collecting reams of data from its overseas missions. Starting Oct. 1, organizations doing development work on the ground – including through grants and contracts – have been directed to also collect data generated by their work and submit it back to agency headquarters. Teams go through the data, scrub it to remove sensitive material and then publish it.
The data spans the gamut from information on land ownership in South Sudan to livestock demographics in Senegal and HIV prevention activities in Zambia.
"While some data sets might seem arcane to one person to the next person, it's the desperately sought-after answer to a researcher's key question,” Brandon Pustejovsky, USAID’s chief data officer, told Nextgov.
But one consequence of this new push to publish government data sets is that agencies are realizing just how much data they hold – and how valuable it is.
Pustejovsky added: "What we are finding is as we put more data into the public realm that we have missions that may look at that data and say, 'We'd like to visualize it in a certain way.' Or, 'We have questions about it.' Or, 'We'd like to conduct a specific kind of analysis on that data.'"
The agency took the first step in solving that problem with a Jan. 20 request for information from outside groups for cutting-edge data analytics tools.
“Operating units within USAID are sometimes constrained by existing capacity to transform data into insights that could inform development programming,” the RFI stated.
The RFI queries industry on its capabilities in data mining and social media analytics, as well as forecasting and systems modeling.
USAID is far from alone in its quest for data-driven decision-making.
A Jan. 26 RFI from the Transportation Department’s Federal Highway Administration also seeks innovative ideas from industry for “advanced analytical capabilities.”
But FHWA wants to kill two birds with one stone, so to speak. The agency is interested in first transferring its “vast data holdings” to the cloud, where they will then be more easily available for data analysis.
Currently, only a small percentage of data is available via public-facing websites, often in varying formats. The agency is considering gathering it all in one place – and in a cloud environment where storage and computing are more affordable – “thus removing government infrastructure as a bottleneck to the pace of American innovation and enabling new value-added services and unimaginable integration into our daily lives.”
The agency hosts a treasure trove of transportation-related data, including annual inventories of roadways, tunnels and bridges, aggregate data on licensed drivers by age and gender and monthly data on gasoline sales.