Big data can Moneyball government services


By standardizing more of its data and sharing it more widely, the federal government could deliver service more efficiently and cut costs.

The initial version of this story misidentified the vendor of a 2011 Centers for Medicare and Medicaid Services system to track improper payments. It has been corrected.

The federal government could learn from the state of Michigan and the Oakland Athletics, Teradata Corp. executive Darryl McDonald told a House panel Thursday: Both have used complex data analytics to cut down on inefficiencies and save money.

The A's crunched through mountains of statistics to figure out how to build a better team for far less money than the biggest spenders in Major League Baseball. As portrayed in the 2011 film Moneyball, the team ended the 2002 season first in the American League West division.

Beginning in 1996, Michigan invested in an enterprise data warehouse that pulled together information from across the state into standardized formats so that its agencies could learn from each other, avoid duplicating tasks, and spot common patterns of fraud, waste and abuse. About 10,000 users across 20 state agencies now regularly access the database, saving thousands of dollars annually, McDonald told members of a House Ways and Means subcommittee on human resources.

Michigan is a Teradata customer.

By standardizing more of its data and sharing it more widely among agencies and with state and local governments, McDonald and other panelists told subcommittee members, the federal government could deliver service more efficiently and cut costs.

The Centers for Medicare and Medicaid Services launched a system in July 2011 aimed at reducing the agency's $50 million in annual improper payments by spotting common fraud indicators within the 30-day window in which most Medicare claims must be paid.

Medicare officials have long been aware of major indicators of fraudulent or erroneous claims, but so many files are flagged by those indicators -- including valid claims -- that officials didn't previously have time to investigate them all. Instead, officials processed fraudulent claims and tried to play catch-up later, chasing after improper payments. The new system aims to fine-tune officials' understanding of which claims might be fraudulent, guiding resources to the most suspect claims before any money is paid out.

Teradata built an Agriculture Department system that compares farmers' claims for crops ruined by floods and tornadoes with satellite information from the National Weather Service and mapping data to spot claims that might be fraudulent. In the past decade, the system has saved more than $1 billion, McDonald told Nextgov after the hearing.
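The kind of cross-check described above can be sketched in a few lines of Python. This is only an illustration, not the Agriculture Department system: the record layout (`id`, `county`, `loss_date`), the function name and the three-day matching window are all assumptions made for the example.

```python
from datetime import date, timedelta


def flag_unsupported_claims(claims, events, window_days=3):
    """Return IDs of claims with no recorded weather event in the same
    county within window_days of the claimed loss date.

    The field names and the default window are hypothetical."""
    window = timedelta(days=window_days)
    flagged = []
    for claim in claims:
        supported = any(
            event["county"] == claim["county"]
            and abs(event["date"] - claim["loss_date"]) <= window
            for event in events
        )
        if not supported:
            flagged.append(claim["id"])
    return flagged


# Made-up sample records, for illustration only.
events = [{"county": "Adams", "date": date(2011, 5, 22)}]
claims = [
    {"id": "C1", "county": "Adams", "loss_date": date(2011, 5, 23)},
    {"id": "C2", "county": "Baker", "loss_date": date(2011, 5, 23)},
]
suspect = flag_unsupported_claims(claims, events)
```

A flagged claim is not necessarily fraudulent; in this sketch it simply has no matching weather record and would be a candidate for closer review.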

Data standardization and analytics also can help programs disburse payments more efficiently.

Ginger Zielinskie is executive director of Benefits Data Trust, a nonprofit that partners with state agencies and private businesses to create information sharing platforms that make it easier for claimants to simultaneously apply for a range of assistance programs and also identify people who are eligible for benefits they're not receiving.

Many people forced out of work by the recession are unfamiliar with state and federal assistance programs, or are embarrassed to apply for them, Zielinskie told lawmakers. By letting them know they're eligible for food stamps or other assistance, she said, her organization has helped them to stay in their homes or stop borrowing from retirement plans they spent years building up.

The first step for most federal organizations, McDonald told Nextgov, typically is to standardize their data so that it can be shared between divisions.

Part of that process is about storing information in common readable forms such as XML files or XBRL files for financial data. Another part is settling on standard data fields between agencies, he said. This can be as simple as deciding whether to record a person's full middle name or only an initial, or establishing standard ranges for people's annual income.
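A minimal Python sketch of what settling on standard fields might look like in practice follows. It assumes an agency stores names as first name, middle initial and last name, and reports income in fixed ranges. The function names, the bracket boundaries and the sample name "Jane Q. Public" are hypothetical, chosen only for the example.

```python
import re

# Hypothetical income ranges (upper bounds in dollars).
INCOME_BRACKETS = [
    (25_000, "under 25k"),
    (50_000, "25k-50k"),
    (100_000, "50k-100k"),
]


def normalize_name(raw):
    """Return (first, middle_initial, last) in a uniform upper-case form."""
    raw = raw.strip()
    if "," in raw:
        # Turn "Last, First Middle" into "First Middle Last".
        last, rest = raw.split(",", 1)
        raw = f"{rest} {last}"
    parts = [p for p in re.split(r"[\s.]+", raw) if p]
    if len(parts) < 2:
        raise ValueError(f"need at least a first and last name: {raw!r}")
    first, last = parts[0].upper(), parts[-1].upper()
    middle = parts[1][0].upper() if len(parts) > 2 else ""
    return first, middle, last


def income_bracket(income):
    """Map a dollar amount to one of the standard ranges."""
    for upper, label in INCOME_BRACKETS:
        if income < upper:
            return label
    return "100k and over"


def same_person(a, b):
    """Match two spellings when first and last names agree and the
    middle initials, if both are present, do not conflict."""
    first_a, middle_a, last_a = normalize_name(a)
    first_b, middle_b, last_b = normalize_name(b)
    names_agree = (first_a, last_a) == (first_b, last_b)
    middles_agree = not middle_a or not middle_b or middle_a == middle_b
    return names_agree and middles_agree
```

With rules like these, "Public, Jane Quinn" and "jane q. public" reduce to the same record key, so a search for one spelling finds the other.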

"When you type Darryl McDonald into a system -- depending on how you type it, you could find zero or 12 Darryls," he said, illustrating the need for uniformity.

The next step, he said, is about breaking down information sharing barriers between divisions and agencies and applying analytics to learn how programs can operate more efficiently and at a lower cost.

"[Divisions] have been given the freedom to get data and store data in any way they want so you have to start standardizing at the lowest levels," he said. "Once you do that you find tremendous synergies within agencies."