
How Computer Clouds Could Help Cure Cancer


Computer clouds have been credited with making the workplace more efficient and giving consumers anytime-anywhere access to email, photos, documents and music, as well as helping companies crunch through masses of data to gain business intelligence.

Now it looks like the cloud might help cure cancer too.

The National Cancer Institute plans to sponsor three pilot computer clouds filled with genomic cancer information that researchers across the country will be able to access remotely and mine for information.

The program is based on a simple revelation, George Komatsoulis, interim director and chief information officer of the National Cancer Institute’s Center for Biomedical Informatics and Information Technology, told Nextgov. It turns out the gross physiological characteristics we typically use to describe cancer -- a tumor’s size and its location in the body -- often say less about the disease’s true character and the best course of treatment than genomic data buried deep in the cancer’s DNA.

That’s sort of like saying you’re probably more similar to your cousin than to your neighbor, even though you live in New York and your cousin lives in New Delhi. It means treatments designed for one cancer site might be useful for certain tumors at a different site, but, in most cases, we don’t know enough about those tumors’ genetic similarities yet to make that call.

The largest barrier to gaining that information isn’t medical but technical, said Komatsoulis, who is leading the cancer institute’s cloud initiative. The National Cancer Institute is part of the National Institutes of Health.

The largest source of data about cancer genetics, the cancer institute’s Cancer Genome Atlas, contains half a petabyte of information now, he said, or the equivalent of about 5 billion pages of text. Only a handful of research institutions can afford to store that amount of information on their servers let alone manipulate and analyze it.

By 2014, officials expect the atlas to contain 2.5 petabytes of genomic data drawn from 11,000 patients. Just storing and securing that information would cost an institution $2 million per year, presuming the researchers already had enough storage space to fit it in, Komatsoulis told a meeting of the institute’s board of advisers in June.

To download all that data at 10 gigabits per second would take 23 days, he said. If five or 10 institutions wanted to share the data, download speeds would be even slower, and sharing all the information could take longer than six months.
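The 23-day figure follows directly from the numbers in the article. A quick back-of-envelope sketch (the 2.5-petabyte size and 10-gigabit-per-second link speed come from the article; the rest is unit arithmetic):

```python
# Transfer-time estimate for downloading the full 2.5-petabyte atlas
# over a 10-gigabit-per-second connection.

ATLAS_BITS = 2.5 * 10**15 * 8    # 2.5 petabytes expressed in bits
LINK_BPS = 10 * 10**9            # 10 gigabits per second

seconds = ATLAS_BITS / LINK_BPS
days = seconds / 86_400          # 86,400 seconds in a day

print(f"{days:.1f} days")       # roughly 23 days, matching the article
```

Note that the arithmetic only works out to 23 days if the link speed is read as gigabits, not gigabytes, per second.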

That’s where computer clouds -- the massive banks of computer servers that can pack information more tightly than most conventional data centers and make it available remotely over the Internet -- come in. If the genomic information contained inside the atlas could be stored inside a cloud, he said, researchers across the world would be able to access and study it from the comfort of their offices. That would provide significant cost savings for researchers. More importantly, he said, it would democratize cancer genomics.

“As one reviewer from our board of scientific advisers put it, this means a smart graduate student someplace will be able to develop some new, interesting analytic software to mine this information and they’ll be able to do it in a reasonable time frame,” Komatsoulis said, “and without requiring millions of dollars of investment in commodity information technology.”

It’s not clear where all this genomic information will ultimately end up. If one or more of the pilots proves successful, a private sector cloud vendor may be interested in storing the information and making it available to researchers on a fee-for-service basis, Komatsoulis said. This is essentially what Amazon has done for basic genetic information captured by the international Thousand Genomes Project.

A private sector cloud provider will have to be convinced that there’s a substantial enough market for genomic cancer information to make storing the data worth its while, Komatsoulis said. The vendor will also have to adhere to rigorous privacy standards, he said, because all the genomic data was donated by patients who were promised confidentiality.  

One or more genomic cancer clouds may also be managed by university consortiums, he said, and it’s possible the government may have an ongoing role.

The cancer institute is seeking public input on the cloud through the crowdsourcing website IdeaScale. The University of Chicago has already launched a cancer cloud to store some of that information. It’s not clear yet whether the university will apply to be one of the institute’s pilot clouds.

Because the types of data and the tools used to mine it differ so greatly, it’s likely there will have to be at least two cancer clouds after the pilot phase is complete, Komatsoulis said. As genomic research into other diseases progresses, it’s possible that information could be integrated into the cancer clouds as well, he said.

“Cancer research is on the bleeding edge of really large-scale data generation,” he said. “So, as a practical matter, cancer researchers happen to be the first group to hit the point where we need to change the paradigm by which we do computational analysis on this data . . . But much of the data that I think we’re going to incorporate will be the same or similar as in other diseases.”

As scientists’ ability to sequence and understand genes improves, genome sequencing may one day become part of standard care for patients diagnosed with cancer, heart problems and other diseases with a genetic component, Komatsoulis said.

“As we learn more about the molecular basis of diseases, there’s every reason to believe that in the future if you present with a cancer, the tumor will be sequenced and compared against known mutations and that will drive your physician’s treatment decisions,” he explained. “This is a very forward looking model but, at some level, the purpose of things like The Cancer Genome Atlas is to develop a knowledge base so that kind of a future is possible.”

