Hate speech moderation on global platforms poses unique challenges because content is multimodal and multilingual and because cultural perceptions of it vary. How well do current vision-language models (VLMs) handle these nuances? To investigate this, we create Multi3Hate, the first multimodal and multilingual parallel hate speech dataset, annotated by a multicultural set of annotators. It contains 300 parallel meme samples in five languages: Mandarin, Hindi, Spanish, German, and English. We show that cultural background has a considerable impact on multimodal hate speech annotation in this dataset. The average pairwise agreement among countries is just 74%, lower than that of randomly chosen annotator groups. Our qualitative analysis indicates that the lowest pairwise label agreement, only 67% between the USA and India, can be attributed to cultural factors …
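To make the agreement figures above concrete, here is a minimal sketch of how an average pairwise label agreement between country groups could be computed. It assumes each country contributes one aggregated binary label per meme (for example, a majority vote); that aggregation step, the function names, and the toy labels are illustrative assumptions and are not taken from the paper or its dataset.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Return the fraction of items on which two label lists agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def average_pairwise_agreement(labels_by_group):
    """Average the agreement over every unordered pair of groups.

    labels_by_group maps a group name to its per-item labels.
    Returns (mean agreement, per-pair agreement dict).
    """
    pairs = list(combinations(sorted(labels_by_group), 2))
    if not pairs:
        raise ValueError("need at least two groups")
    scores = {
        (g1, g2): pairwise_agreement(labels_by_group[g1], labels_by_group[g2])
        for g1, g2 in pairs
    }
    return sum(scores.values()) / len(scores), scores


# Hypothetical toy labels (1 = hateful, 0 = not hateful); NOT real data.
toy_labels = {
    "US": [1, 0, 1, 1],
    "IN": [1, 1, 0, 1],
    "DE": [1, 0, 1, 0],
}

mean_agreement, per_pair = average_pairwise_agreement(toy_labels)
lowest_pair = min(per_pair, key=per_pair.get)
print(f"average pairwise agreement: {mean_agreement:.2f}")
print(f"lowest agreement: {lowest_pair} at {per_pair[lowest_pair]:.2f}")
```

The same per-pair dictionary is what one would inspect to find the least-agreeing pair of countries, as in the USA and India comparison mentioned above.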

https://arxiv.org/abs/2411.03888

