Why People End Up Mad When AI Flags Toxic Speech

New research sheds light on why artificial intelligence identification of toxic speech on the internet often frustrates people, despite getting high scores on technical tests.

The main problem: There is a huge difference between evaluating more traditional AI tasks, like recognizing spoken language, and the much messier task of identifying hate speech, harassment, or misinformation, especially in today’s polarized environment.

“It appears as if the models are getting almost perfect scores, so some people think they can use them as a sort of black box to test for toxicity,” says Mitchell Gordon, a PhD candidate in computer science at Stanford University who worked on the project. “But that’s not the case. They’re evaluating these models with approaches that work well when the answers are fairly clear, like recognizing whether ‘java’ means coffee or the computer language, but these are tasks where the answers are not clear.”

Facebook says its artificial intelligence models identified and pulled down 27 million pieces of hate speech in the final three months of 2020. In 97% of the cases, the systems took action before humans had even flagged the posts.

That’s a huge advance, and all the other major social media platforms are using AI-powered systems in similar ways. Given that people post hundreds of millions of items every day, from comments and memes to articles, there’s no real alternative. No army of human moderators could keep up on its own.

The team hopes their study will illuminate the gulf between what developers think they’re achieving and the reality—and perhaps help them develop systems that grapple more thoughtfully with the inherent disagreements around toxic speech.

Even People Can't Agree

There are no simple solutions, because there will never be unanimous agreement on highly contested issues. Making matters more complicated, people are often ambivalent and inconsistent about how they react to a particular piece of content.

In one study, for example, human annotators rarely reached agreement when they were asked to label tweets that contained words from a lexicon of hate speech. Only 5% of the tweets were acknowledged by a majority as hate speech, while only 1.3% received unanimous verdicts. In a study on recognizing misinformation, in which people were given statements about purportedly true events, only 70% agreed on whether most of the events had or had not occurred.

Despite this challenge for human moderators, conventional AI models achieve high scores on recognizing toxic speech: 0.95 ROC AUC (area under the receiver operating characteristic curve), a popular metric for evaluating AI models in which 0.5 means pure guessing and 1.0 means perfect performance. But the Stanford team found that the real score is much lower, at most 0.73, if you factor in the disagreement among human annotators.
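To make the metric concrete, here is a minimal Python sketch of ROC AUC computed as the chance that a randomly chosen toxic item gets a higher score than a randomly chosen non-toxic one. The votes, post names, and model scores are invented for illustration, and scoring against individual judgments is only a rough stand-in for the idea of accounting for disagreement; it is not the Stanford team's adjustment procedure.

```python
# Toy illustration only: the votes and scores below are invented, and this is
# not the Stanford team's disagreement-adjustment procedure.

def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical data: three annotators vote on each of four posts (1 = toxic).
votes = {
    "post_a": [1, 1, 1],
    "post_b": [1, 1, 0],
    "post_c": [0, 0, 1],
    "post_d": [0, 0, 0],
}
model_scores = {"post_a": 0.9, "post_b": 0.7, "post_c": 0.3, "post_d": 0.1}
posts = list(votes)

# Conventional evaluation: collapse each post to its majority label.
majority_labels = [1 if sum(votes[p]) * 2 > len(votes[p]) else 0 for p in posts]
auc_majority = roc_auc(majority_labels, [model_scores[p] for p in posts])

# Alternative: score the model against every individual judgment.
individual_labels = [v for p in posts for v in votes[p]]
individual_scores = [model_scores[p] for p in posts for _ in votes[p]]
auc_individual = roc_auc(individual_labels, individual_scores)

# A model that gives every post the same score is at chance level.
auc_constant = roc_auc(majority_labels, [0.5] * len(posts))

print(f"majority labels:  {auc_majority:.3f}")
print(f"individual votes: {auc_individual:.3f}")
print(f"constant scores:  {auc_constant:.3f}")
```

With these invented numbers the model ranks the majority labels perfectly, yet it scores lower against the individual judgments, because no single score per post can match annotators who split on post_b and post_c.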

Spotting Toxic Speech

In a new study, the team reassesses the performance of today’s AI models by getting a more accurate measure of what people truly believe and how much they disagree among themselves.

Michael Bernstein and Tatsunori Hashimoto, associate and assistant professors of computer science and faculty members of the Stanford Institute for Human-Centered Artificial Intelligence (HAI), oversaw the study.

To get a better measure of real-world views, the researchers developed an algorithm to filter out the “noise” (ambivalence, inconsistency, and misunderstanding) from how people label things like toxicity, leaving an estimate of the amount of true disagreement. They focused on how consistently each annotator labeled the same kind of language in the same way. The most consistent or dominant responses became what the researchers call “primary labels,” which the researchers then used as a more precise dataset that captures more of the true range of opinions about potentially toxic content.
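The article does not spell out the algorithm, so the following Python sketch is only an approximation of the idea: it assumes each annotator has labeled the same post several times and takes that annotator's most frequent answer as the primary label. The annotator names, post names, and labels are hypothetical, and a simple mode is a stand-in for whatever estimator the researchers actually used.

```python
from collections import Counter, defaultdict

# Hypothetical repeated judgments: (annotator, post) -> labels given on
# separate occasions (1 = toxic, 0 = not toxic). All values are invented.
repeated_labels = {
    ("ann_1", "post_a"): [1, 1, 0],
    ("ann_1", "post_b"): [0, 0, 0],
    ("ann_2", "post_a"): [0, 0, 1],
    ("ann_2", "post_b"): [0, 1, 0],
    ("ann_3", "post_a"): [1, 1, 1],
    ("ann_3", "post_b"): [0, 0, 0],
}

def primary_label(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def primary_labels(repeated):
    """Reduce each annotator's repeated answers on a post to one dominant answer."""
    return {key: primary_label(labels) for key, labels in repeated.items()}

def toxic_share_by_post(primary):
    """Fraction of annotators whose primary label for each post is 'toxic'."""
    per_post = defaultdict(list)
    for (_annotator, post), label in primary.items():
        per_post[post].append(label)
    return {post: sum(labels) / len(labels) for post, labels in per_post.items()}

primary = primary_labels(repeated_labels)
shares = toxic_share_by_post(primary)

for post, share in shares.items():
    print(f"{post}: {share:.2f} of annotators lean toxic")
```

In this toy data, the one-off toxic label that ann_2 gave post_b is treated as noise, while the split on post_a remains because two annotators consistently lean one way and one leans the other.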

The team then used that approach to refine datasets that are widely used to train AI models in spotting toxicity, misinformation, and pornography. By applying existing AI metrics to these new “disagreement-adjusted” datasets, the researchers revealed dramatically less confidence about decisions in each category. Instead of getting nearly perfect scores on all fronts, the AI models achieved only 0.73 ROC AUC in classifying toxicity and 62% accuracy in labeling misinformation. Even for pornography (as in, “I know it when I see it”), the accuracy was only 0.79.

Controversy Is Inevitable

Gordon says AI models, which must ultimately make a single decision, will never assess hate speech or cyberbullying to everybody’s satisfaction. There will always be vehement disagreement. Giving human annotators more precise definitions of hate speech may not solve the problem either, because people end up suppressing their real views in order to provide the “right” answer.

But if social media platforms have a more accurate picture of what people really believe, as well as which groups hold particular views, they can design systems that make more informed and intentional decisions.

In the end, Gordon suggests, annotators as well as social media executives will have to make value judgments with the knowledge that many decisions will always be controversial.

“Is this going to resolve disagreements in society? No,” says Gordon. “The question is what can you do to make people less unhappy. Given that you will have to make some people unhappy, is there a better way to think about whom you are making unhappy?”

The paper’s additional coauthors include investigators from Stanford and Apple Inc.

This article was originally published in Futurity. It has been republished under the Attribution 4.0 International license.
