These Are the 158 Key Federal Science Data Sets Rogue Programmers Have Duplicated So Far


Since the weeks leading up to Donald Trump’s inauguration day, impromptu gatherings of programmers, scientists and archivists have popped up at universities across the country. They gather on weekends, laptops and thumb drives in hand, order pizza and then download and archive as much federal science data as they can get their hands on.

These “data rescue” events have managed to archive tens of thousands of government web pages that participants fear may be edited or removed under an administration that has expressed hostility toward climate and environmental science. Copies of those web pages now live within the Internet Archive, best known for its Wayback Machine platform.

But the Internet Archive can’t scrape more elaborate databases—so in addition to simple web pages, the groups pull down intricate and often large data sets from science agencies like NASA, the Environmental Protection Agency and the National Oceanic and Atmospheric Administration, all three of which have been singled out by the Trump administration for budget and staffing cuts to their Earth and climate science programs.

Since January, 158 complete data sets have been downloaded, labeled and re-uploaded to a growing repository of scraped government science.

And now, there’s a data visualization tool that lets you see exactly which data sets, from which agencies, the data rescue groups have duplicated so far. The list includes data sets from NOAA’s Earth-observing satellites and NASA’s polar-orbiting missions, as well as pollution discharge monitoring reports from EPA, among many others.

Sarah Kolbe, a data scientist for California State University, built the data visualization after volunteering with a group of programmers in Madison, Wisconsin, that held its first “data rescue” event March 5.

“We’re planning to do another soon,” she says, so more data sets will likely be added.

DataRefuge Dashboard

A coalition of researchers called the Environmental Data and Governance Initiative, or EDGI, is monitoring government web pages for any changes under the new administration by comparing them to the scraped copies. (They also plan to track any data sets that are removed.)

And they’ve already found several notable changes: Climate change reports have disappeared off State Department websites, and as The New York Times points out, the science and technology office of EPA has changed its mission description from creating “scientific and technological foundations to achieve clean water” to creating “economically and technologically achievable performance standards.” A description of a federal fracking rule, and another about a methane emissions rule, have also gone missing from Interior Department web pages.

