Investigating Annotator Bias in Large Language Models for Hate Speech Detection (arXiv)

Jun 28, 2024 #Algorithms, #Assorted

While previous studies have examined in detail how effective large language models (LLMs) are as annotators, this work explores the biases present in LLMs, specifically GPT 3.5 and GPT 4o, when they annotate hate speech data. The study covers bias in four key categories: gender, race, religion, and disability. The researchers evaluate annotator biases, focusing on particularly vulnerable groups within these categories. They also analyze the annotated data to investigate possible causes of these biases. To carry out this research, the authors introduce HateSpeechCorpus, their own hate speech detection dataset. For comparative analysis, they run the same experiments on the ETHOS dataset.

https://arxiv.org/abs/2406.11109

preventhate.org | Policyinstitute.net
