BLS on CPI: I'm old fashioned, but I don't mind it

One of economics' most important numbers depends on data collected by hand – data that's largely already online, posted freely by merchants and their customers.

Government and business alike rely on the Consumer Price Index for major calculations, especially for determining cost-of-living adjustments to Social Security and other payments.

Yet the Labor Department’s Bureau of Labor Statistics has resisted change, sticking to its methods of telephone surveys and manual price collection as the rest of the world zooms past online.

Turns out, though, that using more sophisticated methods yields pretty much the same results that BLS gets with its old-fashioned surveys.

CPI basics

Two main data sets underpin the CPI: consumer expenditure survey results, collected for the BLS by the Census Bureau, and price information collected by BLS employees from individual stores.

The consumer expenditure surveys help the BLS determine the “market basket”: what percentage of income the average consumer spends on different items, from apples to ziti.

“The goal is to get an accurate representation of what people are actually buying,” said BLS economist Ken Stewart.

BLS collectors are then tasked with tracking the prices of very specific items from the basket – Pink Lady apples, for instance – at a particular store, month after month, to see how much the market basket is costing Americans over time.
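To make the basket-and-prices idea concrete, here is a minimal sketch in Python of a fixed-basket index: a weighted average of each item's price change, with the base period set to 100. It is an illustration only, not the BLS's actual formula, which is considerably more involved. The items, weights and prices are invented, and the function name `basket_index` is hypothetical.

```python
# Illustrative only: the weights (expenditure shares summing to 1) and
# all prices below are invented, not BLS data.
weights = {"apples": 0.2, "ziti": 0.3, "gasoline": 0.5}
base_prices = {"apples": 2.00, "ziti": 1.50, "gasoline": 3.00}
current_prices = {"apples": 2.20, "ziti": 1.50, "gasoline": 3.30}


def basket_index(weights, base_prices, current_prices):
    """Weighted average of price relatives, scaled so the base period is 100."""
    total = sum(
        weights[item] * current_prices[item] / base_prices[item]
        for item in weights
    )
    return 100 * total


index = basket_index(weights, base_prices, current_prices)
print(round(index, 1))  # a value above 100 means the basket costs more than in the base period
```

In this toy example apples and gasoline each rise 10 percent while ziti stays flat, so the index moves up by the combined weight of the two items that changed.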

It’s a time-consuming process, as economists like Stewart readily admit.

The calculations the BLS runs on the data to produce the CPI are sophisticated – BLS Commissioner Erica Groshen’s House testimony gives an overview – but the collection methods are far from state-of-the-art.

A better way online?

In a big way, the data the BLS seeks is already online.

On the “What are people buying?” side, cash may not be dead, but between smartphone apps and credit cards, more and more of consumers’ purchasing can be easily tracked.

Services such as Mint sync with users’ bank accounts to monitor spending, creating the sorts of data sets the BLS needs to determine a market basket’s composition.

For the other half of the equation, prices are also posted on retailers’ websites, ripe for webscraping.

PriceStats does just that, scraping prices from hundreds of retailers to generate daily inflation series across 70 countries and powering MIT’s Billion Prices Project.

Other companies have gone beyond scraping.

Premise, a firm that provides economic data to the World Bank and Bloomberg, combines social platforms with big data analytics.

A global network of paid collectors snaps pictures of store shelves, and Premise’s software pulls price information from the uploaded photographs to add to its price index-generating data sets.

“Our numbers highly correlate to the official CPI,” said Premise’s Sara Blask. “The difference is that our numbers are collected and analyzed on a significantly higher frequency basis. Whereas the BLS publishes the CPI monthly, we’re analyzing our data in real-time.”

Blask noted that Premise’s big data-meets-social approach allows for the tracking of much more than just inflation; Premise can also monitor food shortages, political campaign activity and other critical phenomena.

Official caution

As industry and academia are racing ahead, the BLS is staying put.

BLS’s Stewart said the bureau sees alternate approaches to data collection as an “opportunity,” but no firm plans exist for the implementation of webscraping or other methods.

“There have been discussions,” Stewart said, but concerns weigh heavily at the BLS.

Consistency is one – the CPI relies on strict continuity when it comes to tracking items’ prices, and Stewart wondered whether online retailers’ data sets would phase different brands and varieties in and out.

“When you webscrape, can you get [the price of] that same shoe month after month after month?” Stewart asked. “I don’t know.”
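Stewart's question can be framed as a simple matching check. The sketch below, in Python, assumes scraped prices arrive as monthly snapshots keyed by some stable product identifier; that is an assumption, since real retailer pages may not expose one. The data, the identifiers and the function name `continuously_priced` are all invented for illustration.

```python
# Hypothetical scraped snapshots: month -> {product_id: price}.
# Every identifier and price here is invented.
snapshots = {
    "month-1": {"shoe-123": 59.99, "shoe-456": 74.50, "apple-pl": 1.99},
    "month-2": {"shoe-123": 59.99, "apple-pl": 2.09, "shoe-789": 80.00},
    "month-3": {"shoe-123": 62.00, "apple-pl": 2.09, "shoe-789": 80.00},
}


def continuously_priced(snapshots):
    """Return the product IDs that appear in every monthly snapshot."""
    months = sorted(snapshots)
    matched = set(snapshots[months[0]])
    for month in months[1:]:
        matched &= set(snapshots[month])  # keep only items still listed
    return matched


matched = continuously_priced(snapshots)
first_month = snapshots[min(snapshots)]
print(sorted(matched))
print(f"{len(matched)} of {len(first_month)} first-month items were priced every month")
```

Here one shoe drops out after the first month and a new one appears, so only two of the three original items can be tracked throughout. That is the kind of churn Stewart is worried about.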

Getting permission to use data – from consumers’ anonymized purchasing information to retailers’ prices – would also be key, and potentially hairy.

Besides, Stewart noted, online spending information would leave out cash transactions, potentially skewing market basket compositions, while the prices of services such as haircuts (included in the CPI) are generally not posted online, meaning collectors would still need to go into the field.

Change would run deep, Stewart added: “You’d have to rethink not just your data collection strategy, but your whole sample design: outlets, items, all of it.”

And, he noted, the current methods aren’t quite as labor-intensive as they seem.

Collectors often call stores for price checks rather than trekking out in person, and some of the low-hanging fruit would be low-hanging whether prices were collected manually or electronically.

“Gasoline’s a good example” of a uniform good with regular price information that could be picked up online, “but it’s also one of the cheapest things to collect,” Stewart said. “You can just drive by the station and read the sign.”

No change on the horizon

“Outmoded and monstrously expensive manual surveys are not sufficient to meet the policy, trading, relief or business strategy challenges posed by an era of unprecedented economic and social volatility,” wrote Premise founder David Soloff back in 2013. “The fundamental nature of human economic activity has changed. It’s time for the means of economic analysis to catch up.”

But there’s no telling when the BLS might take the leap.

“We believe that our current methods are the best, given the whole range of things we need to be concerned about,” Stewart said.

He quickly added that the BLS is “absolutely open” to other methods – though, after years of pressure, the bureau is still “just starting research” into webscraping and other techniques.

Even a hybrid method, in which some data comes from the Internet and some is collected manually, is still in the “discussions” phase, Stewart said.

In the interest of accuracy, the BLS will keep doing things by hand for the foreseeable future, Stewart said: “[Electronic data collection for the CPI] is not around the corner.”