For Google, Public Not Always Public

Google this week piloted a new <a href="http://googleblog.blogspot.com/2009/04/adding-search-power-to-public-data.html">feature</a> that, according to the company, "makes it easy to find and compare public data," but, in reality, Google itself can't find a lot of the public data out there.

Many federal Web sites, or content on their pages, cannot be indexed by typical search engines, including Google. As a result, much of the data on these sites is invisible, hidden in the so-called "Deep Web."

Part of the reason is that government pages include databases, forms and other coding that search engines cannot crawl through. Many also lack site maps, or indexes of a Web site's pages, that help search engines capture all of a site's pages.

Google's new tool, Google Public Data, only works with public data that is already accessible to search engines. The company created the application because federal data "was complicated not because it was inaccessible," Google spokeswoman Aviva Gilbert said.

The tool takes easy-to-crawl -- but otherwise opaque -- information and makes it easy to understand. Google grabs unemployment rates and population data accessible through the sites of the U.S. Bureau of Labor Statistics and the U.S. Census Bureau's Population Division and displays the information in interactive graphs that the user can manipulate.

"When comparing Santa Clara county data to the national unemployment rate, it becomes clear not only that Santa Clara's peak during 2002-2003 was really dramatic, but also that the recent increase is a bit more drastic than the national rate," states a Google release, explaining what can be concluded based on the visualization.

A standard Google search for unemployment data in Santa Clara County would lead the user to a list of various public and private links -- "but it was difficult to navigate," Gilbert said.

The new tool does not make online federal data more accessible, but rather more meaningful.

Google researchers "haven't figured out how to site map everything," said Jerry Brito, who studies government transparency as a senior research fellow at George Mason University's Mercatus Center. "It's kind of impossible to do that without the government agencies' cooperation."

He and Google have advocated that Congress require agencies to make all of the information on their sites accessible to commercial search engines.

A bill introduced in the last Congress by Sen. Joseph I. Lieberman, I-Conn., chairman of the Homeland Security and Governmental Affairs Committee, would direct the government to "promulgate guidance and best practices to ensure that publicly available online federal government information and services are made more accessible to external search capabilities, including commercial and governmental search capabilities."