Sociocultural context is necessary to correctly identify and moderate hate speech and abusive language. In many regions of the Global South, moderation has been either lacking or too dependent on out-of-context keyword detection, which has led to censorship while targeted hate campaigns against minorities have been overlooked. These problems stem from a lack of high-quality local-language data and a failure to involve local communities. To address this, AfriHate provides annotated datasets in 15 African languages, each labeled by native speakers familiar with the local culture. The paper describes the challenges of constructing the datasets and reports baseline classification results, showing that performance varies by language and that multilingual models can improve performance in low-resource settings.

https://aclanthology.org/2025.naacl-long.92

