These Are the 158 Key Federal Science Data Sets Rogue Programmers Have Duplicated So Far


Since the weeks leading up to Donald Trump’s inauguration day, impromptu gatherings of programmers, scientists and archivists have popped up at universities across the country. They gather on weekends, laptops and thumb drives in hand, order pizza and then download and archive as much federal science data as they can get their hands on.

These “data rescue” events have managed to archive tens of thousands of government website pages they fear may be edited or removed under an administration that has expressed hostility toward climate and environmental science. Copies of those web pages now live within the Internet Archive, best known for its Wayback Machine platform.

But the Internet Archive can’t scrape more elaborate databases—so in addition to simple web pages, the groups pull down intricate and often large data sets from science agencies like NASA, the Environmental Protection Agency and the National Oceanic and Atmospheric Administration, all three of which have been singled out by the Trump administration for budget and staffing cuts to their Earth and climate science programs.

Since January, 158 complete data sets have been downloaded, labeled and re-uploaded to a growing repository of scraped government science.

And now, there’s a data visualization tool that lets you see exactly which data sets, from which agencies, the data rescue groups have duplicated so far. The list includes data sets from NOAA’s Earth-observing satellites and NASA’s polar-orbiting missions, as well as pollution discharge monitoring reports from EPA, among many others.

Sarah Kolbe, a data scientist for California State University, built the data visualization after volunteering with a group of programmers in Madison, Wisconsin, that held its first “data rescue” event on March 5.

“We’re planning to do another soon,” she says, so more data sets will likely be added.

DataRefuge Dashboard

A coalition of researchers called the Environmental Data and Governance Initiative, or EDGI, is monitoring government web pages for changes under the new administration by comparing the live pages to the scraped copies. (They also plan to track any data sets that are removed.)

And they’ve already found several notable changes: Climate change reports have disappeared off State Department websites, and as The New York Times points out, the science and technology office of EPA has changed its mission description from creating “scientific and technological foundations to achieve clean water” to creating “economically and technologically achievable performance standards.” A description of a federal fracking rule, and another about a methane emissions rule, have also gone missing from Interior Department web pages.
