A Survey of Machine Learning Models and Datasets for the Multi-label Classification of Textual Hate Speech in English (arXiv)

Apr 21, 2025 #Algorithms

Researchers have developed datasets and machine learning algorithms to address the multi-label task of classifying hate speech in text. This work presents the first thorough and systematic review of the scientific literature on this emerging field of study for English (N=46). The authors offer a concise summary of 28 datasets suitable for training multi-label classification models, highlighting notable variations in label set, size, meta-concept, annotation method, and inter-annotator agreement. Their examination of 24 articles that propose suitable classification models further establishes that evaluation is inconsistent and that designs based on Recurrent Neural Networks (RNNs) and Bidirectional Encoder Representations from Transformers (BERT) are preferred. After identifying limited and sparse datasets, imbalanced training data, dependence on crowdsourcing platforms, and insufficient methodological alignment as the important open issues, the authors develop ten proposals for further study.

https://arxiv.org/abs/2504.08609
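To illustrate why inconsistent evaluation makes multi-label results hard to compare, the following minimal sketch scores one set of predictions with four common multi-label metrics. The label names, the toy data, and the helper functions are hypothetical and are not taken from the survey; the point is only that the same predictions can look strong or weak depending on the metric reported.

```python
# Hypothetical toy example: labels and data are illustrative, not from the survey.
LABELS = ["racism", "sexism", "threat"]

# Each text may carry zero, one, or several labels (multi-label setting).
Y_TRUE = [{"racism"}, {"racism", "sexism"}, {"threat"}, set()]
Y_PRED = [{"racism"}, {"racism"}, set(), set()]


def label_counts(y_true, y_pred, label):
    """Return (tp, fp, fn) for a single label across all texts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels):
    """Average of per-label F1: every label counts equally, even rare ones."""
    return sum(f1(*label_counts(y_true, y_pred, l)) for l in labels) / len(labels)


def micro_f1(y_true, y_pred, labels):
    """F1 over pooled counts: frequent labels dominate the score."""
    totals = [label_counts(y_true, y_pred, l) for l in labels]
    tp, fp, fn = (sum(col) for col in zip(*totals))
    return f1(tp, fp, fn)


def subset_accuracy(y_true, y_pred):
    """Share of texts whose predicted label set matches exactly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def hamming_loss(y_true, y_pred, labels):
    """Share of individual (text, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * len(labels))


if __name__ == "__main__":
    print("macro-F1:       ", round(macro_f1(Y_TRUE, Y_PRED, LABELS), 3))
    print("micro-F1:       ", round(micro_f1(Y_TRUE, Y_PRED, LABELS), 3))
    print("subset accuracy:", round(subset_accuracy(Y_TRUE, Y_PRED), 3))
    print("Hamming loss:   ", round(hamming_loss(Y_TRUE, Y_PRED, LABELS), 3))
```

On this toy data the model only ever predicts the most frequent label, which macro-F1 penalizes heavily while micro-F1 and Hamming loss remain comparatively favorable. Two papers reporting different metrics from this list would therefore not be directly comparable.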
