While previous studies have assessed the effectiveness of large language models (LLMs) as annotators in detail, this work explores the biases that LLMs, specifically GPT-3.5 and GPT-4o, exhibit when annotating hate speech data. The study examines bias in four categories: gender, race, religion, and disability. The researchers evaluate annotator bias with a focus on particularly vulnerable groups within these categories, and they analyze the annotated data to investigate possible causes of these biases. To carry out this research, the authors introduce HateSpeechCorpus, their own hate speech detection dataset. For comparison, they run the same experiments on the ETHOS dataset.

https://arxiv.org/abs/2406.11109