Compositional Generalisation for Explainable Hate Speech Detection (arXiv)

Jun 5, 2025 #Algorithms

Content moderation relies heavily on hate speech detection, but existing models often fail to generalize, both because of dataset biases and because sentence-level labels do not capture the structure of hate speech. Even when finer span-level annotations are included (e.g., labeling “artists” as a target and “are parasites” as dehumanizing), models struggle to separate the meaning of those labels from the surrounding context. Novel combinations of expressions therefore remain difficult to detect. The authors investigate whether generalization improves when models are trained on data in which expressions are uniformly distributed across contexts. To this end, they present U-PLEAD, a dataset of around 364,000 synthetic posts, together with a benchmark of approximately 8,000 hand-verified posts. Training on U-PLEAD combined with real data improves compositional generalization and achieves state-of-the-art results on PLEAD.
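To make the idea of expressions being uniformly distributed across contexts more concrete, here is a minimal Python sketch. It is not the authors' generation pipeline: the span inventories, the label names, and the helper functions `build_balanced_posts` and `cooccurrence_counts` are all assumptions made for illustration, and only the “artists” / “are parasites” pair comes from the summary above. The sketch pairs every target span with every predicate span exactly once, so no target is seen more often with one predicate than with another.

```python
from collections import Counter
from itertools import product

# Hypothetical span inventories. Only the first target and the first
# predicate come from the summary; the rest are illustrative placeholders.
TARGETS = ["artists", "bankers", "cyclists"]
PREDICATES = [
    ("are parasites", "dehumanising"),      # label name is an assumption
    ("are wonderful", "not_hateful"),       # label name is an assumption
    ("are my neighbours", "not_hateful"),
]


def build_balanced_posts(targets, predicates):
    """Pair every target with every predicate exactly once.

    The Cartesian product guarantees that each (target, predicate)
    combination occurs with the same frequency.
    """
    posts = []
    for target, (predicate, label) in product(targets, predicates):
        posts.append(
            {
                "text": f"{target} {predicate}",
                "target": target,
                "predicate": predicate,
                "label": label,
            }
        )
    return posts


def cooccurrence_counts(posts):
    """Count how often each (target, predicate) pair appears."""
    return Counter((p["target"], p["predicate"]) for p in posts)


posts = build_balanced_posts(TARGETS, PREDICATES)
counts = cooccurrence_counts(posts)
```

In this toy setting every pair in `counts` has the same count, which is the balance property the summary describes. A real dataset would need far larger inventories and natural-sounding sentence templates, which this sketch does not attempt.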

https://arxiv.org/abs/2506.03916
