How the Library of Congress is using both AI and volunteers to unlock public broadcasting history

Main reading room of the Library of Congress. Doug Armand/Getty Images
The FixIt+ platform uses AI-generated transcripts as a starting point, then relies on volunteers to refine them so historic public media becomes easier to search, study and understand.
Public broadcasting has a long history of capturing important moments in American life. It preserved voices from the civil rights movement, debates over war and foreign policy, regional arts coverage and local public affairs programs that reflected the people and places shaping the nation. But many of those moments have been hard to find, buried in tape vaults, archives and library collections that few people could search or even knew existed.
That is part of what makes the American Archive of Public Broadcasting (AAPB) so interesting. A collaboration between GBH, the Boston public media organization formerly known as WGBH, and the Library of Congress, the archive is working to make historic public media more searchable and accessible, in part by using AI-generated transcripts as a starting point.
The public-facing correction layer for that effort is called FixIt+, a volunteer platform where people can review and refine machine-generated transcripts from older radio and television programs. As AAPB Archives Outreach Manager Meghan Sorensen explained in an interview published by the Library of Congress, “FixIt+ is a volunteer transcript correction platform and open-source project maintained by our team at GBH for the American Archive of Public Broadcasting. Its mission is to make historic public media more accessible by inviting the public to help update and correct computer-generated transcripts in a way that feels easy and engaging.”
That approach makes a lot of sense. I work with transcripts fairly often myself. When I am covering an important speech, a major announcement or a policy presentation, I will often check the transcript as I listen so I can catch words or details that went by too quickly. For modern events, AI-generated transcripts are usually pretty good. Even so, they still stumble in predictable ways. Laughter, coughing and side comments can confuse them, and they sometimes force nonverbal sounds into words that were never spoken.
That problem becomes much more obvious when the recordings are older. While trying out FixIt+, I spent time with broadcasts from the 1960s and 1970s, and the limitations were easy to hear. The audio may have been broadcast quality for its time, but by modern standards it can sound thin, noisy or compressed. Regional accents can trip up the software, and background sounds only make the job harder. If the archive simply accepted the AI-generated text as final, there would almost certainly be mistakes left behind. I found several obvious ones within the first few minutes of using the platform. They were not huge errors, but for important historical moments, the transcripts should be as accurate as possible.
FixIt+ handles that problem in a practical way. As I listened to the audio or watched an old television program, I could type my suggested correction directly into the matching line of the transcript. Once I saved the change, it became part of the archive workflow so that other volunteers could review what I had done. They could then approve my change or suggest an alternative. Only after a transcript reaches volunteer consensus is it treated as complete. The project describes this as a “human-in-the-loop” process, meaning people improve transcripts generated by computers instead of relying on the software alone. Sorensen put the larger point plainly: “Technology gives us a great jumping-off point, but it is our volunteers who make the real difference.”
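For readers curious how a consensus step like this might work under the hood, here is a minimal Python sketch. It is not FixIt+'s actual code. The class name, the function names and the two-vote threshold are all assumptions made for illustration, since the project's real rules are not described here. The idea is simply that each line keeps the machine text, counts matching volunteer suggestions, and is only considered settled once enough volunteers agree on the same wording.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical threshold: how many matching suggestions count as agreement.
# The real platform's rule may be different.
CONSENSUS_VOTES = 2


@dataclass
class TranscriptLine:
    machine_text: str  # the AI-generated starting point
    suggestions: Counter = field(default_factory=Counter)

    def suggest(self, text: str) -> None:
        """Record one volunteer's version of the line.

        Approving an earlier correction is modeled as submitting the same text.
        """
        self.suggestions[text.strip()] += 1

    def consensus(self) -> Optional[str]:
        """Return the agreed text once one version has enough votes, else None."""
        if not self.suggestions:
            return None
        text, votes = self.suggestions.most_common(1)[0]
        return text if votes >= CONSENSUS_VOTES else None


def transcript_complete(lines: list) -> bool:
    """A transcript is complete only when every line has reached consensus."""
    return all(line.consensus() is not None for line in lines)


# Example: one volunteer corrects a line, a second volunteer approves it.
line = TranscriptLine("we shall over come")
line.suggest("We shall overcome.")
first_check = line.consensus()   # still None: only one vote so far
line.suggest("We shall overcome.")
second_check = line.consensus()  # now the corrected text
```

In this sketch the machine text is never overwritten, so the original AI output stays available for comparison. A real system would also need a rule for what happens when two competing versions attract the same number of votes.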
What makes the project especially compelling is the material itself. The recordings available for correction are not filler. They include voices tied to civil rights history, national security, foreign policy and regional cultural life. A volunteer might spend time with a May 28, 1961 program involving Freedom Rider Mary Jean Smith, an April 10, 1975 Bill Moyers Journal conversation with former Defense Secretary Clark Clifford about Vietnam and its aftermath, a February 24, 2012 talk by Donald Rumsfeld at Fort Leavenworth, or a January 23, 2004 episode of Black Horizons that includes a discussion of a Buffalo Soldier stage production. Many of the programs deal with serious topics of lasting historical value. And the sheer range of subjects makes it clear that correcting these transcripts is not just a technical chore. It’s a way to help preserve and open up pieces of our nation’s historical record.
The scope of the archive helps drive that point home. The American Archive of Public Broadcasting draws from more than 100 contributing collections, including radio and television stations and other organizations such as WGBH, WNET, Maryland Public Television, Pacifica Radio Archives and the Library of Congress itself. That breadth means volunteers are not working on one narrow slice of programming. They are helping improve access to a wide cross-section of American public media history. The archive notes that transcripts make programs more searchable and usable, while Sorensen explained why that matters in the clearest possible terms: “Without transcripts, much of our catalog remains hidden. With them, the archive becomes a living, interactive resource which can be discovered, shared and explored by anyone.”
That is what gives FixIt+ its real value. It’s not simply a better way to clean up transcripts. It’s a way to bring more people into the work of preserving and opening up public broadcasting history. For volunteers, the task may begin with correcting a few lines of text. But the larger result is that important voices and moments from the past become easier to find, study and understand, escaping their vaults and becoming discoverable for future generations.
John Breeden II is an award-winning journalist and reviewer with over 20 years of experience covering technology. He is the CEO of the Tech Writers Bureau, a group that creates technological thought leadership content for organizations of all sizes. Twitter: @LabGuys




