AI still struggles to moderate hate speech
The results point to one of the most challenging aspects of AI-based hate speech detection today: moderate too little and you fail to solve the problem; moderate too much and you can censor the very language that marginalized groups use to empower and defend themselves. "Suddenly, you would be penalizing those very communities that are most often targeted by hate," says Paul Röttger, a PhD candidate at the Oxford Internet Institute and co-author of the paper.
According to Lucy Vasserman, Jigsaw's chief software engineer, Perspective navigates these trade-offs by relying on human moderators to make the final decision. But that process does not scale to larger platforms. Jigsaw is now developing a feature that would triage posts and comments based on Perspective's uncertainty: automatically removing content the model is confident is hateful, and flagging borderline content for human review.
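A minimal sketch of what such confidence-based triage could look like, assuming a toxicity score in [0, 1] from some classifier; the thresholds, labels, and score source here are illustrative assumptions, not details of Jigsaw's actual feature:

```python
# Hypothetical sketch of confidence-based triage for moderation.
# The thresholds and routing labels are illustrative assumptions,
# not Jigsaw's real implementation or the Perspective API.

REMOVE_THRESHOLD = 0.95   # model is highly confident the text is hateful
REVIEW_THRESHOLD = 0.50   # borderline: send to a human moderator

def triage(toxicity_score: float) -> str:
    """Route a comment based on a model's toxicity score in [0, 1]."""
    if toxicity_score >= REMOVE_THRESHOLD:
        return "remove"          # auto-remove high-confidence hate
    if toxicity_score >= REVIEW_THRESHOLD:
        return "human_review"    # flag borderline content for moderators
    return "keep"                # leave clearly benign content alone

# Example usage with made-up scores:
for text, score in [("clearly hateful example", 0.98),
                    ("ambiguous reclaimed slur", 0.62),
                    ("ordinary comment", 0.05)]:
    print(text, "->", triage(score))
```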
What the new research offers, she says, is a fine-grained way to evaluate the state of the art. "A lot of the things highlighted in this paper, such as reclaimed words being a challenge for these models, are things that have been known in the industry but are really hard to quantify," she says. Jigsaw is now using HateCheck to better understand the differences between its models and where they need to improve.
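As a rough illustration of how a suite of functional tests like HateCheck can surface where a model falls short, here is a sketch that aggregates accuracy per functionality; the classify() stand-in and the example cases are invented for illustration and are not the real dataset or any production model:

```python
# Hypothetical sketch of scoring a classifier against HateCheck-style
# functional tests. classify() and the cases below are placeholders.
from collections import defaultdict

def classify(text: str) -> str:
    """Stand-in for a real hate speech classifier."""
    return "hateful" if "[SLUR]" in text else "non-hateful"

# Each test case: (functionality being probed, text, gold label)
test_cases = [
    ("slur_homonym_nh",   "I had pasta with a [neutral homonym].",      "non-hateful"),
    ("slur_reclaimed_nh", "As a member of the group, I reclaim [SLUR].", "non-hateful"),
    ("derog_neg_emote_h", "I hate [protected group].",                   "hateful"),
]

# Aggregate accuracy per functionality to see where the model fails.
totals, correct = defaultdict(int), defaultdict(int)
for functionality, text, gold in test_cases:
    totals[functionality] += 1
    if classify(text) == gold:
        correct[functionality] += 1

for functionality in totals:
    print(f"{functionality}: {correct[functionality]}/{totals[functionality]} correct")
```

Breaking results out per functionality, rather than reporting a single accuracy number, is what makes trade-offs such as over-flagging reclaimed slurs visible.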
Academics are also enthusiastic about the research. "This paper gives us a nice, clean resource for evaluating industrial systems," says Maarten Sap, an AI researcher at the University of Washington, one that "allows companies and users to ask for improvement."
Thomas Davidson, an assistant professor of sociology at Rutgers University, agrees. The limitations of language models and the messiness of language mean there will always be trade-offs between under- and over-identifying hate speech, he says. "The HateCheck dataset helps make these trade-offs visible," he adds.