The National Archives and Records Administration plans to launch in December an online Citizen Archivist Dashboard through which volunteers can tag, transcribe and write articles about scanned NARA documents, said Pamela Wright, the agency's chief digital access strategist.
NARA initially will put up about 300 documents for transcription, Wright said Friday before a panel discussion on social media in government. Those documents will be coded green, yellow and red based on their length and how difficult it is to decipher the handwriting, she said.
The transcription page will include a simple interface with the original document on one screen and the transcribed version on another, she said, plus a magnifying glass feature for a closer look at obscured words.
The dashboard also will include a link for volunteers to tag words and images in the Archives' massive collection of scanned images, she said, making the documents more easily searchable.
She described tagging as a lower-impact form of participation that could draw a wider audience.
Wright hopes to draw some specialists in historical fields to the dashboard, especially for certain major projects, but expects the majority of participants will be regular citizens with an interest in archival research. To that end, she's kept the registration process relatively simple, she said.
There will be another function inside the dashboard for the volunteers to write articles about NARA photos and documents, Wright said. Those articles and transcribed documents can be reposted to Wikipedia.
The Archives in May took on a part-time Wikipedian in residence, Dominic McDevitt-Parks, who has spent much of his tenure drumming up volunteers to transcribe Archives documents into searchable text on Wikisource, the Wikipedia repository for primary documents.
He has transferred about 90,000 Archives documents to Wikimedia Commons, the online encyclopedia's image repository, Wright said. The Archives' long-term goal is to get all its billions-strong holdings online, she said.
McDevitt-Parks told Nextgov in July that one of the greatest barriers to transcribing NARA documents is that many of the most historically important ones were scanned with early low-resolution technology, making them difficult to decipher. It's tough to make a case for rescanning those documents now, he said, because many of the Archives' holdings haven't yet been cataloged, let alone scanned and posted online.