The rise of hate speech on social media calls for automated detection that balances efficiency and accuracy. This study evaluates 38 models, including transformers (e.g., BERT, RoBERTa), deep neural networks (e.g., CNN, LSTM, HAN), and conventional techniques (e.g., SVM, CatBoost), on datasets ranging from 6.5K to 451K samples. RoBERTa performs best, with F1-scores above 90%. Among the remaining deep learning models, Hierarchical Attention Networks (HAN) come out ahead, while CatBoost and SVM remain competitive with F1-scores above 88% at a lower computational cost. Balanced, medium-sized raw datasets yield better results than larger, preprocessed ones. The results inform the design of an effective hate speech detection system.

https://dl.acm.org/doi/10.1145/3696673.3723061
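
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how an F1-score is computed for a binary hate speech classifier. The labels and predictions are made-up toy values, not data from the study, and the function name `f1_score` is chosen here for illustration. The abstract does not say whether the paper reports binary, macro, or weighted F1, so this sketch assumes the plain binary (positive-class) variant.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the positive class, from two parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)  # share of flagged posts that are truly hateful
    recall = tp / (tp + fn)     # share of hateful posts that were flagged
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed labels): 1 = hateful, 0 = not hateful.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a model cannot score well by flagging everything or by flagging almost nothing, which is why it is a common choice for hate speech datasets where the classes are often imbalanced.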

