Large-Scale Hate Speech Detection with Cross-Domain Transfer (Papers with Code)

The performance of hate speech detection models depends on the datasets they are trained on. Most existing datasets contain only a limited number of instances or cover only a few hate domains (the topics that hate speech targets), which hinders large-scale analysis and transfer learning across hate domains. In this work, we construct large-scale Twitter datasets for hate speech detection in English and in Turkish, a low-resource language, each consisting of 100k human-labeled tweets. Each dataset is designed to contain an equal number of tweets in each of five domains. Experimental results supported by statistical tests show that, for large-scale hate speech detection, Transformer-based language models outperform conventional bag-of-words and neural models by at least 5% in English and 10% in Turkish. Performance also scales well across training sizes: with only 20% of the training instances, 98% of the full performance is recovered in English and 97% in Turkish. We further examine how well cross-domain transfer generalizes between hate domains. On average, models trained on other domains recover 96% of a target domain's performance for English and 92% for Turkish. Gender and religion generalize best to other domains, while sports generalizes worst.
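As a rough illustration of what "recovering" a target domain's performance could mean, the sketch below divides the score of a model trained on a different source domain by the score of a model trained on the target domain itself, then averages over source domains. This definition is an assumption made for illustration and may differ from the paper's exact formulation. The function name `recovery_ratios` and every number in `scores` are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def recovery_ratios(scores):
    """Average cross-domain recovery ratio for each target domain.

    scores[source][target] is the evaluation score of a model trained on
    `source` and tested on `target`. The ratio for one (source, target)
    pair is the cross-domain score divided by the in-domain score
    scores[target][target]. This definition is an assumption.
    """
    ratios = {}
    for target in scores:
        in_domain = scores[target][target]
        cross = [
            scores[source][target] / in_domain
            for source in scores
            if source != target
        ]
        ratios[target] = mean(cross)
    return ratios


# Placeholder scores for three of the domains named in the abstract.
# These values are invented for the example and are NOT from the paper.
scores = {
    "gender":   {"gender": 0.80, "religion": 0.72, "sports": 0.63},
    "religion": {"gender": 0.76, "religion": 0.75, "sports": 0.56},
    "sports":   {"gender": 0.68, "religion": 0.60, "sports": 0.70},
}

ratios = recovery_ratios(scores)
for domain, ratio in ratios.items():
    print(f"{domain}: {ratio:.0%} of in-domain performance recovered")
```

Under this reading, a value near 100% for a target domain means that models trained on the other domains perform almost as well on it as a model trained on that domain directly.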

https://paperswithcode.com/paper/large-scale-hate-speech-detection-with-cross-1/review
