The growth of online platforms has amplified the spread of hate speech, creating a need for scalable detection techniques. However, detection systems rely on human-labeled data, which is often biased. Previous studies have noted this problem, but the interplay between the characteristics of annotators and those of the targets of hate has not been thoroughly explored. The current study fills that gap: using a dataset with rich sociodemographic detail on both annotators and targets, it shows how annotator biases relate to target attributes. The study finds that these biases differ in both prevalence and intensity. A comparison with persona-based LLMs shows that, although the LLMs also display bias, their tendencies differ greatly from those of human annotators. These results contribute to a better understanding of annotation bias and can guide the creation of more equitable AI-powered hate speech detection systems.

https://arxiv.org/abs/2410.07991

