The agency is currently transferring about 532 million Obama administration files into its archives.
President Donald Trump may be the most prolific Twitter user the Oval Office has seen, but the National Archives and Records Administration is currently grappling with how best to store the vast number of social media postings, emails and digital documents produced by his predecessor's White House.
At noon on Inauguration Day, NARA officially began the task of rapidly transferring records from the Obama administration into its archives. The transfer, which is still underway, is the largest in the agency’s history, encompassing about 532 million files.
The system NARA used to store records from George W. Bush’s administration, which generated about 228 million emails, is undergoing a revamp so it can store the Obama White House’s 308 million emails, as well as the first-ever transfer to NARA of social media postings, including records from Salesforce, SharePoint, Twitter and Snapchat.
NARA plans to unveil the new version of its overall Electronic Records Archives system—and its improved tagging and search—in the spring, program manager Ken Hawkins told Nextgov.
Hawkins, based in NARA’s Office of the Chief Information Officer, handles storage for records generated by the Executive Office of the President. Under the Presidential Records Act, those documents must be preserved, and NARA works directly with the White House to arrange the transfer of that data into NARA’s custody. Under Bush, those might have been faxes and letters. Today, they include snaps, tweets, YouTube clips and Salesforce documents.
Since the end of the Bush administration, NARA has been using a content management platform provided by Hitachi to index presidential records, Hawkins explained. In addition to improvements in searching and annotating, it would also store millions of uncompressed raw images, captions and albums from the White House photographer, Pete Souza.
Part of NARA’s technological challenge is to find platforms that can index large amounts of data and also preserve its quality, which can be more difficult for high-resolution images or high-definition video files, Hawkins said.
Social media postings also create interesting storage challenges, Hawkins said. NARA already oversees the archived Twitter, Instagram and YouTube accounts of the White House, including Barack and Michelle Obama’s official accounts; that’s because NARA contacted Twitter to arrange that transfer, Hawkins explained.
“Archiving requirements are not always built into a [social media] platform,” he said. The solution, Hawkins said, is to engage directly with the White House and with social media companies “to develop a two-track approach where the platforms themselves, such as Facebook and Twitter, preserve and sort of freeze a version of the individual accounts.”
Simultaneously, NARA, the White House and those companies collaborated to export the digital content, Hawkins said, so “essentially the government will have its own copy of the content in its highest resolution.”
Hawkins said he was actually expecting more emails from the Obama White House. Presidents Ronald Reagan and George H.W. Bush combined created 2.4 million emails, and Bill Clinton had 42 million. Because the Obama White House had been significantly more digitally engaged than the Bush administration, which had 228 million emails, NARA had prepared to store an exponentially greater number of emails from Obama’s staff.
“We ended up deciding that the reason it didn’t double or go up more than that … [is] for a finite number of people that work in the organization, there’s only so many hours of a day they can email,” Hawkins said.