One major flaw in hate speech identification is that most hate speech datasets ignore cultural variation within a single language. To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset. We build CREHate in two steps: 1) cultural post collection and 2) cross-cultural annotation. We sample posts from the SBIC dataset, which primarily reflects North America, and gather additional posts from four geographically diverse English-speaking countries (Australia, the United Kingdom, Singapore, and South Africa) using culturally offensive keywords obtained from our survey. Annotations are then collected from these four countries and the United States to establish representative labels for each country. Our data reveals statistically significant differences in hate speech annotations across countries. Only 56.2% of the posts in CREHate reach consensus among all countries, and the highest pairwise label difference rate is 26%. Qualitative analysis shows that annotators' personal biases on contentious issues and differing interpretations of sarcasm are the main causes of label disagreement. Finally, we evaluate large language models (LLMs) in a zero-shot setting and show that current state-of-the-art LLMs tend to achieve higher accuracy on Anglosphere country labels in CREHate.
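To make the reported statistics concrete, here is a minimal sketch (not from the paper) of how a cross-country consensus rate and pairwise label difference rates could be computed from per-country binary labels. The country codes and toy labels below are placeholders for illustration only.

```python
from itertools import combinations

# Hypothetical per-country binary hate labels (1 = hateful, 0 = not hateful)
# for each post; the countries and values are illustrative placeholders.
posts = [
    {"US": 1, "AU": 1, "GB": 1, "SG": 1, "ZA": 1},
    {"US": 1, "AU": 0, "GB": 1, "SG": 0, "ZA": 1},
    {"US": 0, "AU": 0, "GB": 0, "SG": 0, "ZA": 0},
]

countries = list(posts[0])

# Consensus rate: fraction of posts that receive the same label from all countries.
consensus = sum(len(set(p.values())) == 1 for p in posts) / len(posts)
print(f"consensus rate: {consensus:.1%}")

# Pairwise label difference rate: fraction of posts on which two countries disagree.
for a, b in combinations(countries, 2):
    diff = sum(p[a] != p[b] for p in posts) / len(posts)
    print(f"{a}-{b} difference rate: {diff:.1%}")
```

Under this reading, the paper's figures would correspond to a 56.2% consensus rate and a maximum pairwise difference rate of 26% over the actual CREHate labels.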

https://aclanthology.org/2024.naacl-long.236

