Intelligence Community Wants to Use DNA to Store Exabytes of Data


The IC is exploring whether polymers could be the future of data storage.

The U.S. intelligence community wants to unlock more efficient ways to store the trove of data humans generate every day, and it believes our DNA could hold the key.

The Intelligence Advanced Research Projects Activity last month issued a broad agency announcement seeking research teams for the agency’s Molecular Information Storage program, which aims to create a system for storing vast quantities of data on sequence-controlled polymers, like human DNA.

Selected teams would have two primary tasks over the four-year initiative: build a table-top device that writes data onto polymers and another that reads the information once it’s stored. Teams must also develop an operating system to index, access and search data within the network.

By the program’s end, the system must be able to write one terabyte and read 10 terabytes per day, and “present a clear and commercially viable path to future deployment at the exabyte scale” within 10 years, according to IARPA.
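For a rough sense of the distance between those end-of-program targets and the exabyte goal, the short Python sketch below works out how long a single device running at the stated rates would take to move a full exabyte. It assumes decimal units (1 terabyte = 10^12 bytes, 1 exabyte = 10^18 bytes), which the announcement as quoted here does not specify, and the function name is purely illustrative.

```python
# Assumption: decimal (SI) units; the article does not say which convention IARPA uses.
TERABYTE = 10**12  # bytes
EXABYTE = 10**18   # bytes


def days_to_transfer(total_bytes, bytes_per_day):
    """Days needed to move total_bytes at a constant daily rate."""
    return total_bytes / bytes_per_day


# End-of-program targets cited by IARPA: write 1 TB/day, read 10 TB/day.
write_days = days_to_transfer(EXABYTE, 1 * TERABYTE)
read_days = days_to_transfer(EXABYTE, 10 * TERABYTE)

print(f"Write 1 EB at 1 TB/day: {write_days:,.0f} days (~{write_days / 365:,.0f} years)")
print(f"Read 1 EB at 10 TB/day: {read_days:,.0f} days (~{read_days / 365:,.0f} years)")
```

Under those assumptions, writing one exabyte at the target rate would take about a million days, or roughly 2,700 years, on a single device. That gap is why the program asks for a path to exabyte scale rather than an exabyte-scale system.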

As a comparison, one exabyte is about 4 million times larger than the storage capacity of the top iPhone X model.
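That comparison can be checked with a few lines of Python. The sketch assumes the top iPhone X configuration is the 256-gigabyte model and that both figures use decimal units; neither detail is spelled out above, and the variable names are illustrative.

```python
# Assumptions: decimal (SI) units, and a 256 GB top-end iPhone X.
EXABYTE = 10**18                  # bytes
IPHONE_X_TOP_BYTES = 256 * 10**9  # bytes

ratio = EXABYTE / IPHONE_X_TOP_BYTES
print(f"One exabyte is about {ratio:,.0f} times the phone's capacity")
```

The result is roughly 3.9 million, which rounds to the 4 million figure.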

Today, exabyte-scale data centers take up huge tracts of land and can cost billions of dollars to build and operate over the long run, a model IARPA argues will no longer be feasible in the years to come. The tech firm Domo estimates that by 2020, more than 140 gigabytes of data will be generated daily for each human on Earth, and that number is only expected to grow as the internet of things expands.

“This resource intensive model does not offer a tractable path to scaling beyond the exabyte regime in the future,” IARPA wrote. “Faced with exponential data growth, large data consumers may soon face a choice between investing exponentially more resources in storage or discarding an exponentially increasing fraction of data.”

During a proposers day presentation in February, the agency outlined its vision for an exabyte-scale storage unit that could be housed in a single room and cost less than $1 million to run per year. Though scientists have yet to build a system anywhere close to that scale, multiple studies have shown sequence-controlled polymers are capable of virtually error-free data storage, according to IARPA.

Researchers estimate DNA and similar polymers can store information more than 100,000 times more efficiently than traditional data storage technology, and polymers’ stable molecular structure allows them to last hundreds of years without losing or corrupting information. More efficient data storage technology could also help researchers gain increased insights from today’s state-of-the-art supercomputers.

Groups hoping to join the program must submit proposals by July 16.

Editor's note: This article was updated to correct exascale data center operating costs.