Hate speech on social media threatens both free expression and online safety, and the problem is harder to address for multilingual users. This work presents a multilingual hate speech detection approach with three main contributions: a trilingual dataset of 10,193 annotated tweets in English, Spanish, and Urdu; joint multilingual and translation-based methods; and tailored annotation guidelines for consistent labeling. The authors ran 41 experiments using machine learning, deep learning, and transformer-based models. GPT-3.5-turbo outperformed the baselines, with an average improvement of 4% across languages and an 8% gain over XLM-R on Urdu. The study offers a high-quality dataset and a scalable approach for detecting hate speech in underrepresented languages.

https://www.mdpi.com/2073-431X/14/7/279

