A Bot That Identifies 'Toxic' Comments Online


A Google-funded algorithm flags messages that are likely to drive others away from a conversation.

Civil conversation in the comment sections of news sites can be hard to come by these days. Whatever intelligent observations do lurk there are often drowned out by obscenities, ad-hominem attacks, and off-topic rants. Some sites, like the one you’re reading, hide the comments section behind a link at the bottom of each article; many others have abolished theirs completely.

One of the few beacons of hope in the morass of bad comments shines at The New York Times, where some articles are accompanied by a stream of respectful, largely thoughtful ideas from readers. But the Times powers its comment section with an engine few other news organizations can afford: a team of human moderators that checks nearly every single comment before it gets published.

For outlets that can’t hire 14 full-time moderators to comb through roughly 11,000 comments a day, help is on the way. Jigsaw, the Google-owned technology incubator, released a tool Thursday that uses machine-learning algorithms to separate out the worst comments that people leave online.

The tool, called Perspective, learned from the best: It analyzed the Times moderators’ decisions as they triaged reader comments, and used that data to train itself to identify harmful speech. The training materials also included hundreds of thousands of comments on Wikipedia, evaluated by thousands of different moderators.

Perspective’s current focus is on “toxicity,” defined as the likelihood that a comment will drive other participants to leave a conversation, most likely because it’s rude or disrespectful. Developers who adopt the platform can use it as they choose: It can automatically suppress toxic comments outright, or group them to help human moderators decide what to do with them. It could even show commenters the toxicity rating of a comment as they write it, to encourage them to tone down the language. (That could work a little like Nextdoor’s prompts aimed at tamping down racist posts.)
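To make those options concrete, here is a minimal sketch of how a site might sort comments once each one has a toxicity score. It does not call Perspective itself: the `Comment` class, the `triage` and `group_for_moderators` functions, and both cutoffs are hypothetical, and the score is assumed to arrive as a fraction between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical cutoffs; the article does not say which thresholds any site uses.
SUPPRESS_AT = 0.90
REVIEW_AT = 0.50


@dataclass
class Comment:
    text: str
    toxicity: float  # assumed to be a score in [0, 1] from some scoring service


def triage(comment, suppress_at=SUPPRESS_AT, review_at=REVIEW_AT):
    """Return 'suppress', 'review', or 'publish' for a scored comment."""
    if not 0.0 <= comment.toxicity <= 1.0:
        raise ValueError("toxicity must be between 0 and 1")
    if comment.toxicity >= suppress_at:
        return "suppress"
    if comment.toxicity >= review_at:
        return "review"
    return "publish"


def group_for_moderators(comments):
    """Bucket comments by triage outcome so humans can work through each queue."""
    queues = {"suppress": [], "review": [], "publish": []}
    for comment in comments:
        queues[triage(comment)].append(comment)
    return queues
```

Where the cutoffs sit is an editorial decision rather than a technical one. A cautious site might never suppress automatically and send everything above the lower cutoff to a human instead.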

Perspective’s website lets you test the system by typing in your own phrase. The system then spits out a toxicity rating on a 100-point scale. For example, “You’re tacky and I hate you” is rated 90 percent toxic. Fair enough. But there are discrepancies: “You’re a butt” is apparently 84 percent toxic, while “You’re a butthead” is only at 36 percent. (When I tried more aggressive insults and abuse, your usual angry comments-section fodder, each scored over 90 percent.)

The Times has been using the system since September, and now runs every single incoming comment through Perspective before putting it in front of a human moderator. Perspective will help the newspaper expand the number of articles that include comments—currently, only about one in ten have comments enabled.

Future versions of Perspective will approach other aspects of online commenting. It may one day be able to tell when a comment is off-topic, for example, by comparing it to the themes contained in the news story it’s referring to.
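How such a comparison would work has not been described, so the following is only a toy illustration of the general idea, not Perspective’s method. It assumes a simple word-count comparison (cosine similarity over bags of words); the stopword list and the function names are invented for the example.

```python
import math
import re
from collections import Counter

# A tiny, hand-picked stopword list, purely for illustration.
STOPWORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "was", "be", "by", "at",
})


def word_counts(text):
    """Lowercase the text and count the words that are not stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topical_overlap(comment, article):
    """Higher values mean the comment shares more vocabulary with the article."""
    return cosine_similarity(word_counts(comment), word_counts(article))
```

A comment that shares no vocabulary with the story scores zero, which a site could treat as a hint that it is off-topic. Shared words are a crude proxy for shared themes, so a real system would presumably need something more robust.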

The platform could help make more comment sections enjoyable and informative—and it might help draw out voices that are often silenced by harassment. A study published in November found that nearly half of Americans have been harassed or abused online, and that women, racial minorities, and LGBT people are more likely to be attacked than others.

The abuse drove people to change their contact information or retreat from family and friends. Worryingly, it also led one in four people to censor themselves in order to avoid further harassment. Of course, most harmful abuse happens on social networks rather than in news-site comment sections (Twitter is often a loud crossfire of vitriol), but barbs exist on every social platform. Tamping down abuse on news sites can help make them a safer space for commenters.

Perspective’s developers hope that opening the tool to every publisher will bring comment moderation within reach for more outlets, and perhaps stave off the demise of comment sections. As more news organizations adopt the system, it will continue to learn and improve its accuracy. And if automated moderation proves useful for news sites, it may have a future on larger social media networks, which are most in need of a gatekeeper to stop abuse.