Government Printing Office adopts internal XML system


New format will make it easier to launch apps and e-books, CTO says.

The Government Printing Office is adopting a new system that will manage and publish congressional bills and other publications entirely in a pared-down, machine-readable XML format, the company providing the system announced Wednesday.

GPO plans to launch a “proof of concept” for the new system with congressional bills before expanding it to other publications such as the Federal Register and the Congressional Record, Chief Technology Officer Ric Davis told Nextgov.

The XML system provided by vendor SDL will replace a three-decade-old proprietary system known as Microcomp that simply ran out of capacity despite numerous additions and patches, said Lou Iuppa, SDL’s vice president for strategic business development.

GPO currently receives some of its raw materials for publication in XML, including House bills. It also publishes some materials in XML, such as the Federal Register.

The new system will allow GPO to manage all its internal processes in XML without first converting documents to the format used by its proprietary system, according to Matt Landgraf, lead program planner at GPO. That will save time and labor, he said.

Publications that reach the public, such as congressional bills accessed through GPO’s Federal Digital System or the Library of Congress’ Thomas system, should look the same after the transition, agency officials said, though more of them likely will be available in XML in addition to other forms such as PDFs.

The system also will make it easier to load GPO publications into new forms such as e-books and smartphone and tablet applications, according to Davis.

“We wanted an open and standard system,” he said. “What we’ll see is the ability of end users and customers to repurpose that information more readily than they can today. The tagging structure, the metadata and other features are much richer.”

XML is a preferred publishing form for Web developers because it is easily manipulated and can be read by machines as well as by people. Federal Chief Information Officer Steven VanRoekel has urged agencies to adopt an “XML first” approach to Web publishing as part of a broader drive to make government data more accessible to developers.

Agencies that publish documents online in XML also typically offer a PDF version, which looks cleaner to human eyes but is less intelligible to computers.
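
To illustrate what “read by machines” means in practice, here is a minimal sketch in Python that pulls structured fields out of a small XML fragment using the standard library. The fragment, its element and attribute names, and the `summarize_bill` helper are all invented for illustration; they are not GPO’s actual schema or any part of the SDL system.

```python
import xml.etree.ElementTree as ET

# Hypothetical bill fragment. Element and attribute names are invented
# for illustration and do not reflect GPO's actual XML schema.
SAMPLE_BILL = """
<bill number="H.R. 0000" congress="000">
  <title>Example Act</title>
  <section id="s1">
    <heading>Short title</heading>
    <text>This Act may be cited as the Example Act.</text>
  </section>
  <section id="s2">
    <heading>Definitions</heading>
    <text>In this Act, the term "agency" has an illustrative meaning.</text>
  </section>
</bill>
"""


def summarize_bill(xml_text):
    """Return the bill number, title and section headings as a dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "number": root.get("number"),
        "title": root.findtext("title"),
        # Each <section> carries its own <heading>, so a program can list
        # them without guessing at page layout, as it would with a PDF.
        "headings": [s.findtext("heading") for s in root.findall("section")],
    }


if __name__ == "__main__":
    print(summarize_bill(SAMPLE_BILL))
```

Because the tags label each piece of the document, the same few lines could feed a website, an e-book or a mobile app without retyping or scraping the text.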

The SDL contract covers installing the new system and training GPO employees, work that will begin in October, an SDL spokeswoman said.