“We use GPT-3 to identify sexist and racist text passages with zero-, one-, and few-shot learning. We find that with zero- and one-shot learning, GPT-3 can identify sexist or racist text with an accuracy between 48 per cent and 69 per cent. With few-shot learning and an instruction included in the prompt, the model’s accuracy can be as high as 78 per cent. We conclude that large language models have a role to play in hate speech detection, and that with further development language models could be used to counter hate speech and even self-police.”

https://arxiv.org/pdf/2103.12407.pdf
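
To make the zero-, one-, and few-shot setups in the abstract concrete, here is a minimal sketch of how such classification prompts might be assembled and how accuracy would be scored. The prompt layout, the label strings, the instruction wording, and the function names `build_prompt` and `accuracy` are all assumptions made for illustration; they are not taken from the paper, and no model call is shown.

```python
from typing import Optional, Sequence, Tuple

# (passage, label) pair used as an in-context example. Hypothetical format.
Example = Tuple[str, str]


def build_prompt(
    text: str,
    examples: Sequence[Example] = (),
    instruction: Optional[str] = None,
) -> str:
    """Assemble a classification prompt.

    No examples gives a zero-shot prompt, one example a one-shot prompt,
    and several examples a few-shot prompt. The optional instruction is
    placed first, mirroring the "instruction included in the prompt" setup.
    """
    parts = []
    if instruction:
        parts.append(instruction)
    for example_text, example_label in examples:
        parts.append(f"Text: {example_text}\nLabel: {example_label}")
    # The passage to classify goes last, with the label left open
    # for the model to complete.
    parts.append(f"Text: {text}\nLabel:")
    return "\n\n".join(parts)


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Placeholder passages stand in for real dataset items.
zero_shot = build_prompt("<passage to classify>")
few_shot = build_prompt(
    "<passage to classify>",
    examples=[
        ("<example passage 1>", "hateful"),
        ("<example passage 2>", "not hateful"),
    ],
    instruction="Classify each text as 'hateful' or 'not hateful'.",
)
```

The resulting string would be sent to a language model, and the completion after the final `Label:` read as the predicted class; comparing those predictions with human labels via `accuracy` yields figures of the kind the abstract reports.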