Teaching AI to filter out banned content isn’t the solution advocates hoped for—or the one Silicon Valley promised.
The past six days have been an all-out war between social-media giants and the people who hope to use their platforms to share grisly footage of the Christchurch, New Zealand, mosque shootings. It hasn’t always been clear who’s winning. YouTube described an “unprecedented” rush to upload video of the attack over the weekend, peaking at one attempted upload per second. In a blog post Thursday, Facebook said it removed 1.5 million videos in the first 24 hours after the attack, 1.2 million of which were blocked before being uploaded to the site, which means 300,000 videos were able to slip past its filtering system. These companies have blocked uploads, deleted videos, and banned users—but people are still outsmarting the technology intended to block the footage from spreading on social media.
A Facebook spokesperson told Wired the company was adding each video it found “to an internal database which enables us to detect and automatically remove copies of the videos when uploaded again.” But in the company’s blog post, Facebook’s vice president of product management, Guy Rosen, noted that the “proliferation of many different variants of the video” frustrated attempts to filter out uploads. Those variants, he wrote, were driven by “a core community of bad actors working together to continually re-upload edited versions of this video in ways designed to defeat our detection.” (Representatives for Facebook and YouTube didn’t immediately respond to requests for comment.)
These companies are using hashing, which they have long touted as a technological solution for removing extremist, disturbing, or pornographic content. Here’s how it works: When moderators remove a video or an image, hashing technology allows them to extract a unique digital signature derived from the content of the photo or video. New uploads are matched against a hashing database, so that something that’s been banned before can be blocked when someone tries to re-upload it later.
Over time, the database grows and, hopefully, blocks uploads before the snowball effect of multiple users searching, sharing, and re-uploading banned content allows disturbing content to trend. Facebook, for example, also uses hashing to help combat revenge porn and copyright violations, in addition to extremist content.
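The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual system: the `HashBlocklist` class and its method names are hypothetical, and an exact SHA-256 digest stands in for whatever signature a real platform computes.

```python
import hashlib


class HashBlocklist:
    """Hypothetical in-memory database of signatures for removed uploads."""

    def __init__(self):
        self._banned = set()

    @staticmethod
    def signature(content: bytes) -> str:
        # Assumption: an exact SHA-256 digest of the file bytes is the signature.
        return hashlib.sha256(content).hexdigest()

    def ban(self, content: bytes) -> None:
        # Called when a moderator removes a video or image.
        self._banned.add(self.signature(content))

    def is_banned(self, content: bytes) -> bool:
        # Called on every new upload before it is published.
        return self.signature(content) in self._banned


blocklist = HashBlocklist()
removed_video = b"bytes of a video a moderator removed"
blocklist.ban(removed_video)

print(blocklist.is_banned(removed_video))            # True: identical copy is caught
print(blocklist.is_banned(removed_video + b"\x00"))  # False: one extra byte slips past
```

The second check hints at the weakness of exact matching: changing even a single byte produces a completely different digest.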
But when the technology went up against the video from Christchurch, a number of issues arose. First, platforms rely on users to flag a video before it can be hashed, and only from there can they attempt to block new uploads. Although 200 users watched the initial live-stream in its entirety, none of them flagged it. That’s probably because the alleged killer posted links to the live-stream in extremist corners of the internet, which means the first people to watch it were likely already radicalized. The first user flag came 30 minutes after the live-stream ended, Reuters reports.
Facebook’s scale is such that its crisis-response model relies on everyone, not just its employees, to report content. In the company’s Thursday update, Rosen responded to suggestions that Facebook add a broadcast delay to Facebook Live video. Rosen came out against the idea, writing “given the importance of user reports, adding a delay would only further slow down videos getting reported, reviewed, and first responders being alerted to provide help on the ground.” Facebook’s first line of defense when videos slip past automated filters is reports from viewers. Whether viewers actually file those reports depends on the audience.
Second, users sometimes bypass hashing attempts by uploading a version of the video that is watermarked or cropped, or a recording of a screen that is playing the video. Facebook’s blog post said it had 800 slightly altered duplicates in its database the day after the shooting. Each video that passes through hash filtering is another opportunity for a user to rip the video and upload it again himself.
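To see why a watermark or a small edit can defeat exact matching while a more tolerant signature survives, consider the sketch below. It is an illustration under stated assumptions, not Facebook's method: a tiny 4x4 grayscale grid stands in for a video frame, a simple "average hash" stands in for a production perceptual hash, and `MAX_DISTANCE` is a hypothetical threshold.

```python
import hashlib

MAX_DISTANCE = 2  # hypothetical tolerance, in differing bits


def exact_hash(pixels):
    # Cryptographic digest of the raw pixel bytes: any change alters it.
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


def average_hash(pixels):
    # One bit per pixel: 1 if the pixel is brighter than the frame's mean.
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    # Number of bit positions where the two hashes disagree.
    return sum(x != y for x, y in zip(a, b))


original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [220, 230, 30, 40],
    [225, 235, 35, 45],
]

# A brightened copy with a one-pixel "watermark" in the corner.
altered = [[min(255, p + 12) for p in row] for row in original]
altered[0][0] = 255

distance = hamming(average_hash(original), average_hash(altered))
print(exact_hash(original) == exact_hash(altered))  # False: exact match fails
print(distance, distance <= MAX_DISTANCE)           # 1 True: near-duplicate detected
```

The trade-off is the threshold: set it too low and edited copies slip through, too high and unrelated footage starts to match.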
Third, hashing only works on material that’s already been flagged or is very similar to something in the database. As Rosen noted in his blog post, hashing “has worked very well for areas such as nudity, terrorist propaganda and also graphic violence where there is a large number of examples we can use to train our systems.” But the company’s system simply didn’t have large volumes of data on filmed mass shootings, he wrote.
But Hany Farid, a computer-science professor at Dartmouth who specializes in robust hashing and photo forensics, told me he thinks there’s a fourth reason hashing technology fails to block some content.
“[Tech companies] don’t have the infrastructure they need to scale to this size,” he said. He thinks these companies are simply underinvesting in moderation resources. In 2009, Farid worked with Microsoft to develop PhotoDNA, a hashing technique that detects and blocks the upload of photos and videos that depict child sexual abuse. “This type of hashing technology has been around for decades,” Farid said. “So the fact that they have not refined it and improved it to the point that it should be nearly perfect is, I think, inexcusable.” (In its blog post, Facebook emphasized its efforts to “identify the most effective policy and technical steps” to moderate this type of video, including its matching technology and combatting hate speech.)
For years, Silicon Valley has promised artificial intelligence as a long-term solution for both blocking extremist content and improving working conditions for moderators, who have had to view and block video of the shooting manually. AI, Rosen said at Facebook’s developer conference last year, could one day detect and hash banned material preemptively without human workers having to stare at hours of gruesome material. In congressional testimony last year, Mark Zuckerberg said he expects AI hashing to take over content moderation soon.
But the ability to train algorithms to accurately recognize extreme violence without anyone having to flag it in the first place is still very far off—even the most advanced AI can’t distinguish between a real shooting and a movie scene, Farid told me.
Ultimately, the use case for purely AI-driven content moderation is fairly narrow, says Daphne Keller, the director of intermediary liability at the Stanford Center for Internet and Society, because nuanced decisions are too complex to outsource to machines.
“If context does not matter at all, you can give it to a machine,” she told me. “But, if context does matter, which is the case for most things that are about newsworthy events, nobody has a piece of software that can replace humans.”
A preemptive AI filter would be best suited for, say, beheadings or child porn, where there’s never any legitimate use case and thus no need for human input. But the type of violence seen in New Zealand is different. Banning all footage would interrupt journalistic coverage and legitimate scholarship. In 2016, Facebook apologized after removing a Facebook Live video of the shooting death of Philando Castile. Though the graphic video shows the bloody aftermath of a shooting death, activists argued the video was powerful evidence of the need for police reform, and should therefore remain on the site.
“Ultimately, you need human judgment,” Keller said. “Or else, you need to make a different kind of decision and say, ‘Getting rid of this is so important, and sparing humans from trauma is so important, that we’re going to accept the error of having legal and important speech disappear.’”
But involving humans once again raises the question of scale. “They’ve got 2 billion users,” Farid said of Facebook. “So when you’re talking to me about thousands of moderators, you’re living in a fantasy land.”
Whether humans, automated systems, or some combination of the two decide, there will always be real concerns that platforms are purposely boosting some voices and stifling others. YouTube employed numerous quick fixes over the weekend to stop misinformation and the spread of the video. Instead of ranking search results for words related to the shooting by popularity or users’ likelihood of engaging, it forced news sources to the top of search results based on trustworthiness. On the one hand, it funneled people away from conspiracy theories about crisis actors or mirrored duplicates that bypassed the filter. On the other, that’s exactly what some users wanted. Which is more dysfunctional: giving users what they’ve been searching for, or denying it to them?
“One critique is to say, ‘That’s not what people want. You shouldn’t reward the urge to stare at an accident,’” Keller said. “Another response is to say, ‘I don’t care if people want that. Don’t give it to them, because it’s bad for them.’”
Social-media platforms can white-list certain sources, Farid said, allowing, for example, The New York Times and The Wall Street Journal to upload clips of hashed footage, but the same can’t be done for regular users with individual profiles. That’s certain to spark accusations of preferential treatment. But it also works against the core principles of social media: optimizing for engagement, attention, and content that’s most likely to produce insightful data for advertisers when it’s shared.
Hate and misinformation are antisocial and dangerous, but they’re incredibly well suited to what the platforms are designed for: maximum growth and engagement in a world where millions of people want to share footage of a racist massacre. Farid believes that the fact that the platform had to make significant algorithmic changes to stop enabling the spread of the Christchurch video speaks to something deeply troubling in how it operates normally. “They’re trying to retrofit a system that was never designed to have safeguards in place.”