This research tackles bias in automatic hate speech detection on social media, where machine learning models trained on datasets labeled by general annotators frequently overlook linguistic heterogeneity within speaker groups. To address this, the authors propose a weakly supervised framework that combines contrastive and prompt-based learning strategies built on large language models with a small number of expert annotations. The proposed architecture incorporates a group estimator, a pair generator, and knowledge injection to improve the model's sensitivity to sociolinguistic nuances.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5487166
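To make the contrastive component more concrete, here is a minimal sketch of how a small set of expert-labeled posts could drive pair generation and a margin-based contrastive loss over text embeddings. The function names, the loss form, and the toy data are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a weakly supervised contrastive setup, assuming an
# encoder (e.g. a sentence embedding model) has already produced one
# embedding per post and a handful of expert labels are available.
# pair_generator and contrastive_loss are hypothetical names.
import itertools
import torch
import torch.nn.functional as F

def pair_generator(labels):
    """Yield (i, j, is_same_group) index pairs from the expert labels."""
    for i, j in itertools.combinations(range(len(labels)), 2):
        yield i, j, int(labels[i] == labels[j])

def contrastive_loss(embeddings, labels, margin=1.0):
    """Pull same-label posts together, push different-label posts
    at least `margin` apart in embedding space."""
    losses = []
    for i, j, same in pair_generator(labels):
        d = F.pairwise_distance(embeddings[i:i+1], embeddings[j:j+1])
        if same:
            losses.append(d.pow(2))
        else:
            losses.append(F.relu(margin - d).pow(2))
    return torch.cat(losses).mean()

# Toy usage: 4 expert-annotated posts with random 16-dim embeddings.
emb = torch.randn(4, 16, requires_grad=True)
lbl = torch.tensor([0, 0, 1, 1])  # 0 = hateful, 1 = non-hateful (expert labels)
loss = contrastive_loss(emb, lbl)
loss.backward()
print(loss.item())
```

In a full pipeline the expert-labeled pairs would supervise the embedding space while weak labels and prompt-based signals cover the remaining unlabeled posts; this snippet only shows the pairwise objective.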

