Data warehouse gets a facelift for its one-year anniversary

Sets of statistics will receive popularity ratings and visitor information similar to that on the video-sharing site YouTube.

The White House on Friday will unveil an upgraded online depot of downloadable federal statistics that is the data analyst's equivalent of the video-sharing website YouTube, according to federal Chief Information Officer Vivek Kundra.

When Data.gov launched in May 2009, its offerings were scant and the formatting of its data sets was unpopular with developers who wanted more Web-friendly downloads for publication on their own sites. Today, open government advocates and programmers still quibble about a lack of accessible formats and a failure to provide context explaining the information available, but they acknowledge the project is a first-of-its-kind experiment for the United States and has opened more data than ever before.

On May 21, which is the one-year anniversary of the website, Data.gov will catalog more than 250,000 datasets on topics ranging from earthquakes to retail gasoline prices -- all retrievable through the site's new advanced search engine powered by Microsoft's Bing.

"This is the future, where you see databases being shared not just locally but globally," Kundra told reporters during a briefing to demonstrate the new functionalities coming to Data.gov. He noted that after the launch of Data.gov, the United Kingdom, Canada and Australia followed suit with similar sites.

Friday's iteration will label data sets with popularity ratings based on the number of times users have downloaded the information, list the top 10 states visiting Data.gov, and display other usage information similar to the social features found on YouTube.
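
A download-based popularity rating of the kind described above could be sketched roughly as follows. This is only an illustration, not Data.gov's actual method: the function name rank_by_downloads, the shape of the download log and the data set names are all assumptions made for the example.

```python
from collections import Counter

def rank_by_downloads(download_log, top_n=10):
    """Return (data set, download count) pairs, most-downloaded first.

    download_log is a hypothetical list with one data set name per download.
    """
    counts = Counter(download_log)
    # Sort by count, highest first; break ties alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]

# Made-up download events, for illustration only.
log = ["earthquakes", "gas-prices", "earthquakes",
       "earthquakes", "gas-prices", "patents"]
ranking = rank_by_downloads(log)
```

The same counting approach would apply to a "top 10 states" list, with a state code recorded per visit instead of a data set name.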

From the beginning, the White House has marketed Data.gov as a warehouse for agencies to expose their raw information so third-party websites can develop innovative analyses for the public. Such online renderings, often called mashups, compare data sets with outside statistics to uncover interesting or suspicious patterns, such as high unemployment rates in communities that have received a large injection of economic stimulus money.

During the past year, Data.gov's contents have helped developers create 237 applications. The day the site debuted, its 47 data sets received a collective 2.1 million hits; 98 million hits are expected on Friday.

"The birth of the community of innovators -- that's far exceeded my expectations," Kundra said. But the government has had to overcome some technical difficulties. "We wanted to release [U.S. Patent and Trademark Office] data, but its systems were decades old," he said.

In the future, the White House wants to post new statistics in as close to real time as possible, he said, adding that the Obama administration does not want too many people massaging the data before it is released.

The administration also is making an effort to reach beyond techies and advocacy groups and lure the general public to Data.gov. The new site will add more feedback mechanisms for people to tell the government what kinds of applications and other electronic information they want featured on Data.gov.

For assistance, the White House has hired a former NASA official, who is well-versed in citizen outreach, to drive interest in statistical analysis among grade school students. Jeanne Holm, previously the chief knowledge architect at NASA's Jet Propulsion Laboratory, helped oversee the space agency's website during one of the most-viewed Internet events ever -- the landing of the Mars Exploration Rover on the surface of Mars in 2004.

"That's the shift that's happening in the next 365 days," Kundra said.

Friday's update also will include a showcase of hundreds of applications designed by citizens, he said.

For example, students at Rensselaer Polytechnic Institute in Troy, N.Y., have devised an interactive map that shows the ratio of total debts to assets from all bankruptcy cases at the state level and describes each company that went bankrupt. And using a free online tool and community health data from Data.gov, a member of the nonprofit community of developers at Sunlight Labs, a government transparency group, created an obesity comparison application. It shows individuals how their neighborhood compares to their state and the nation as a whole across several measures related to obesity.
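
The state-level figure behind the bankruptcy map is a simple ratio: total debts divided by total assets across a state's cases. A minimal sketch of that arithmetic follows, assuming each case is a (state, debts, assets) record. The record layout, the function name and the dollar amounts are hypothetical and are not taken from the students' application.

```python
from collections import defaultdict

def debt_to_asset_ratio_by_state(cases):
    """Sum debts and assets per state, then divide total debts by total assets."""
    totals = defaultdict(lambda: [0.0, 0.0])  # state -> [total debts, total assets]
    for state, debts, assets in cases:
        totals[state][0] += debts
        totals[state][1] += assets
    # Skip states whose recorded assets sum to zero to avoid dividing by zero.
    return {state: debts / assets
            for state, (debts, assets) in totals.items() if assets > 0}

# Made-up cases, for illustration only.
cases = [("NY", 500.0, 200.0), ("NY", 300.0, 200.0), ("CA", 100.0, 400.0)]
ratios = debt_to_asset_ratio_by_state(cases)
```

A ratio above 1 means the bankrupt companies in that state owed more in total than they held in assets.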
