Detecting Hate Speech with GPT-3 (arXiv)

“We use GPT-3 to identify sexist and racist text passages with zero-, one-, and few-shot learning. We find that with zero- and one-shot learning, GPT-3 can identify sexist or racist text with an accuracy between 48 per cent and 69 per cent. With few-shot learning and an instruction included in the prompt, the model’s accuracy can be as high as 78 per cent. We conclude that large language models have a role to play in hate speech detection, and that with further development language models could be used to counter hate speech and even self-police.”

https://arxiv.org/pdf/2103.12407.pdf
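The zero-, one-, and few-shot settings mentioned in the abstract differ only in how many labeled examples are placed in the prompt ahead of the text to be classified. The sketch below is a rough illustration of that idea, not the paper's actual prompt format: the function name `build_prompt`, the `Text:`/`Label:` layout, the label strings, and the placeholder examples are all assumptions made here for illustration. It only assembles the prompt string and does not call any model.

```python
from typing import Sequence, Tuple


def build_prompt(
    text: str,
    examples: Sequence[Tuple[str, str]] = (),
    instruction: str = "",
) -> str:
    """Assemble a classification prompt (hypothetical layout).

    zero-shot: no examples; one-shot: one example; few-shot: several.
    An optional instruction is placed at the very top of the prompt.
    """
    parts = []
    if instruction:
        parts.append(instruction)
    # Each labeled example is shown in the same layout as the final query.
    for example_text, example_label in examples:
        parts.append(f"Text: {example_text}\nLabel: {example_label}")
    # The final block leaves the label empty for the model to complete.
    parts.append(f"Text: {text}\nLabel:")
    return "\n\n".join(parts)


# Placeholder examples; real use would need actual labeled passages.
few_shot_examples = [
    ("<example passage A>", "hateful"),
    ("<example passage B>", "not hateful"),
]

zero_shot = build_prompt("<passage to classify>")
few_shot = build_prompt(
    "<passage to classify>",
    examples=few_shot_examples,
    instruction="Classify each text as 'hateful' or 'not hateful'.",
)
```

The resulting string would then be sent to a language model, and the text it generates after the final `Label:` would be read as the prediction.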

Related Post

Using Transfer-based Language Models to Detect Hateful and Offensive Language Online (Proceedings of the Fourth Workshop on Online Abuse and Harms)

“The results indicate that the attention-based models profoundly confuse hate speech with offensive and normal language. However, the pre-trained models outperform state-of-the-art results in terms of accurately predicting the hateful
