That may mean those sites have been shut down or that their content has been consolidated into larger sites in accordance with a White House plan to drastically cut the federal Web presence over the coming year, said Benjamin Balter, a new technology fellow at the Federal Communications Commission and graduate student at The George Washington University who designed the analysis tool as a personal project.
In other cases, it could mean the sites were temporarily down during the search or that the collection system made an error, Balter said. He stressed that his analysis tool, which also gathers information about sites' content management systems and the sophistication of their Internet protocol addresses, has not been rigorously tested and that the results are by no means 100 percent accurate.
"This [data set] is nothing to bet the house on," he said, "but, it should give a good general picture of where things are."
Agencies have acknowledged shuttering some sites as part of the website-cutting initiative, but so far there's been no central list of closed or consolidated government sites.
Balter said he developed the Web analysis tool out of "idle curiosity," and when he saw the list of top-level dot-gov domains it was a "no-brainer" to try it out.
Though far from an official endorsement of his findings, White House New Media Director Macon Phillips tweeted Balter's results approvingly.
About 70 percent of the dot-gov domains have no detectable content management system or may be using a custom-built system, according to Balter's analysis, and more than one-tenth of the sites are unreachable without typing the sites' "www" prefix into the address bar.
For simple, run-of-the-mill websites, requiring the "www" prefix typically means the sites are extremely old or unsophisticated. In the case of higher-traffic sites such as NASA.gov and FAA.gov, it may mean the sites are a complex mesh of public and private information, which makes modernizing the addresses more complicated, Balter said.
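The "www" check described above can be illustrated with a minimal Python sketch. This is not Balter's tool; the function names `http_reachable` and `www_status`, the plain-HTTP probe and the five-second timeout are all assumptions made for illustration. The idea is simply to request the bare domain and the "www" version and compare the results.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable


def http_reachable(host: str, timeout: float = 5.0) -> bool:
    """Return True if http://<host>/ produces any HTTP response (assumed probe)."""
    try:
        with urllib.request.urlopen(f"http://{host}/", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status such as 404 or 500.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout or malformed response.
        return False


def www_status(domain: str, probe: Callable[[str], bool] = http_reachable) -> str:
    """Classify a domain by whether it answers with and without the www prefix."""
    bare = probe(domain)
    with_www = probe("www." + domain)
    if bare and with_www:
        return "both"
    if with_www:
        return "www-only"  # the case the article describes
    if bare:
        return "bare-only"
    return "unreachable"
```

Calling `www_status("example.gov")` would probe the live network, so a single failed request could reflect a temporary outage rather than a missing configuration, the same caveat Balter attaches to his own results. The `probe` parameter exists so the classification logic can be exercised without network access.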
Balter's data show only a handful of the sites are hosted in a public cloud service, which allows website owners to save money on storage and to more easily handle surges in use.
About 200 of the sites are using some form of Web analytics to track which sections of the site are most popular, the data show.
Only nine of the sites appear to be fully compliant with Internet protocol version 6, or IPv6, standards for how the site transfers information, according to the data. The federal government has been working to transition to IPv6 for more than half a decade. Agencies are required to have fully transitioned public-facing content on their sites by the end of 2012 and to have moved internal content by 2014.
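One rough way to approximate an IPv6 check is to ask whether a hostname resolves to any IPv6 address at all. The sketch below does only that, using Python's standard `socket.getaddrinfo`; the function name `has_ipv6_address` is hypothetical, and the article does not say how Balter's tool measures compliance. Having an IPv6 address is a necessary first step, not proof of the full compliance the article refers to.

```python
import socket


def has_ipv6_address(host: str, resolve=socket.getaddrinfo) -> bool:
    """Return True if the host resolves to at least one IPv6 address.

    Assumption: an IPv6 (AAAA) lookup stands in for "IPv6 support" here.
    It does not confirm the site actually serves content over IPv6.
    """
    try:
        results = resolve(host, 80, socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when no
        # IPv6 address can be found for the host.
        return False
    return len(results) > 0
```

The `resolve` parameter defaults to the real resolver and can be swapped for a stub, which keeps the logic testable offline. Results also depend on the resolver and network of the machine running the check.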
The federal website-cutting initiative is aimed at saving money by consolidating sites into a few uniform architectures and content management systems, and at raising the overall standards of government websites, which are often disorganized, slow and clunky. The project is modeled, in part, on an ongoing British government effort that has cut or consolidated about 75 percent of the country's 2,000 websites over five years.