Research on hate speech detection has focused mostly on monolingual settings; multilingual and code-switched text, which poses distinct linguistic challenges, has received far less attention. This work investigates hate speech detection in code-switched Spanglish social media content by comparing transformer-based models (XLM-RoBERTa, DistilBERT, Multilingual BERT, and mT5) against conventional machine learning techniques (logistic regression, support vector machines, and multinomial naïve Bayes over TF-IDF features). XLM-RoBERTa performs best, identifying code-switched hate speech with 96.14 percent accuracy, 96.16 percent precision, 96.14 percent recall, and a 96.12 percent F1-score. This represents a clear advantage for transformer-based approaches over even the strongest conventional baseline, SVM (94.03 percent accuracy), in this multilingual setting.
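To make the classical baseline concrete, the sketch below computes smoothed TF-IDF vectors in plain Python, the kind of features the paper feeds to logistic regression, SVM, and naïve Bayes. The tokenized Spanglish examples and the function itself are illustrative assumptions, not the paper's actual pipeline or data; the smoothing formula follows the common scikit-learn-style convention ln((1+n)/(1+df)) + 1.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute smoothed TF-IDF weights for a list of tokenized documents.

    Illustrative sketch only: the paper's exact preprocessing, tokenizer,
    and vectorizer settings are not specified in the abstract.
    """
    n = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Smoothed inverse document frequency: ln((1+n)/(1+df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    # Each document becomes a sparse map: term -> tf * idf.
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors

# Toy code-switched (Spanglish) tokens, hypothetical data:
docs = [["odio", "this", "gente"], ["love", "mi", "gente"]]
vecs = tfidf(docs)
```

Terms appearing in every document ("gente" here) get the minimum smoothed IDF of 1.0, while rarer terms like "odio" are weighted higher; the real pipeline would pass such vectors to an SVM or logistic regression classifier.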

https://www.techrxiv.org/doi/full/10.36227/techrxiv.174362958.86158403/v1
