
Wikipedia Is Better Than Google at Tracking Flu Trends


Wikipedia traffic could be used to provide real-time tracking of flu cases, according to a study published today. John Brownstein, a professor of pediatrics at Harvard Medical School and director of Boston Children’s Hospital’s computational epidemiology group, along with fellow researcher David McIver, has developed an algorithm for pulling daily flu metrics from data on which flu-related terms are viewed in the online open-source encyclopedia.

Brownstein previously developed Flu Near You, which relies on users to self-report flu-like symptoms in themselves, family, and friends. But by analyzing page views for terms such as “fever,” “influenza,” and “Tamiflu,” Brownstein and McIver created a more reliable method of estimating flu spikes.
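To make the idea concrete, here is a minimal sketch of how daily page views for flu-related entries might be totaled and then related to reported illness rates. It is not the researchers' algorithm, which this article does not describe in detail: the watch list, the function names, and the use of a plain least-squares line are all assumptions made for illustration.

```python
# Illustrative sketch only. The watch list and function names are assumptions,
# and the straight-line fit stands in for whatever model the study really used.
FLU_ARTICLES = {"Fever", "Influenza", "Tamiflu"}  # assumed set of watched pages


def daily_flu_views(page_views):
    """Sum views of the watched pages for each day.

    page_views maps a date string to a dict of {page title: view count}.
    Pages outside the watch list are ignored.
    """
    return {
        day: sum(n for title, n in counts.items() if title in FLU_ARTICLES)
        for day, counts in page_views.items()
    }


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, the daily totals would be the x values and the officially reported illness rates the y values; the fitted line could then turn a new day's page views into an estimate before the official figures arrive.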

Using online activity to monitor flu trends isn’t a new idea. Google Flu Trends has used flu-related search engine queries to estimate the number of daily cases since 2008. But the algorithm failed in 2009, overestimating the peak number of cases during the H1N1 swine flu pandemic. The 2012-2013 flu season saw a similar miscalculation.

When compared to data from the Centers for Disease Control and Prevention on the prevalence of flu-like illnesses in the US (which is released to the public with a two-week lag), the Wikipedia model was found to be more accurate than Google’s. As the charts below show, that’s because of its ability to stay on track even during sudden spikes in infection (and the accompanying panic):

[Charts: flu estimates from the Wikipedia model and from Google Flu Trends, compared with CDC data]
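One simple way to put a number on "more accurate" is to measure how far each model's estimates fall from the CDC figures on average. The sketch below uses mean absolute error; this article does not say which accuracy measure the study used, so the metric and the function name are assumptions.

```python
# Illustrative sketch only. Mean absolute error is an assumed choice of metric,
# not necessarily the one used in the study.
def mean_absolute_error(estimates, reference):
    """Average absolute gap between a model's estimates and reference values."""
    if len(estimates) != len(reference):
        raise ValueError("estimates and reference must have the same length")
    if not reference:
        raise ValueError("at least one data point is required")
    return sum(abs(e - r) for e, r in zip(estimates, reference)) / len(reference)
```

Computing this once for each model against the same CDC series gives two directly comparable scores, with the lower one indicating the closer fit. A model that overshoots during a spike is penalized in proportion to the size of the overshoot.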

Perhaps, the authors suggest, hyped pandemics and particularly unpleasant flu strains cause increased Googling, including by those who are not ill but are looking for news stories. The researchers didn’t investigate exactly why those who click through to Wikipedia are more likely to be suffering from the flu, or to be near someone who is. But it stands to reason that the site can give researchers a nuanced read on how we’re feeling: Wikipedia is likely to be among the top results in web searches, and because it is the No. 1 source of health information on the internet, those who click through to the site may be more likely to be seeking information about symptoms or medications.

In the paper, Brownstein and McIver point out that the CDC’s data isn’t perfect, either: It’s reported by physicians, who may be more likely to log flu-like symptoms when they have heard media buzz about a possible pandemic. Indeed, it’s not impossible that web-driven metrics may one day overtake the official data in both speed and accuracy.

(Image via Flickr user peterthoeny)
