This month, the Library of Congress will finish archiving every Twitter post from the social media company’s first four years online, according to a blog post published on Friday.
The Library is also developing a system to archive new tweets as soon as they are posted to the site, the post said.
“The Library’s focus now is on addressing the significant technology challenges to making the archive accessible to researchers in a comprehensive, useful way,” the Library’s director of communications, Gayle Osterberg, wrote.
The Library began the process of archiving hundreds of terabytes of data from the billions of Twitter posts in 2010. The database reached 133.2 terabytes of information in December. The Library was collecting nearly half a billion tweets each day as of October 2012, according to a white paper.
Osterberg emphasized the importance of the Twitter archive as part of the Library’s broader mission. Researchers have already contacted the Library asking if they can access the archive to study citizen journalism, public health trends and stock market activities, she said.
“As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications and other sources routinely collected by research libraries,” Osterberg wrote.
