The researchers curate two multimodal hate speech datasets, MHS and MHS-Con, which capture fine-grained hateful abstractions in everyday and confusing situations, respectively. Both datasets are benchmarked against a number of competing baselines. To enable robust hate speech detection in memes, the authors propose SAFE-MEME (Structured reAsoning Framework), a novel multimodal Chain-of-Thought-based framework with two variants: hierarchical categorization (SAFE-MEME-H) and Q&A-style reasoning (SAFE-MEME-QA). SAFE-MEME-QA outperforms existing baselines, with an average improvement of 5% on MHS and 4% on MHS-Con. SAFE-MEME-H, by contrast, achieves an average improvement of 6% on MHS but outperforms only the multimodal baselines on MHS-Con. The authors also show that, for typical fine-grained hateful meme detection, fine-tuning a single-layer adapter within SAFE-MEME-H works better than fully fine-tuned models. For confusing cases, however, full fine-tuning with a Q&A setup yields better results.
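To make the idea of hierarchical categorization more concrete, here is a minimal Python sketch of a two-stage decision: a coarse check first, then a fine-grained one only when the coarse stage flags the meme. It is not the authors' implementation. The label names, the `classify_hierarchically` function, and the toy keyword scorers are all assumptions made for illustration; in a real system the scorers would be model calls.

```python
from typing import Callable, Sequence

# Hypothetical label sets, chosen for illustration only;
# the summary above does not list the datasets' actual labels.
COARSE_LABELS = ("benign", "hateful")
FINE_LABELS = ("explicit", "implicit")

# A scorer maps a text description of a meme to one score per label.
Scorer = Callable[[str], Sequence[float]]


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def classify_hierarchically(
    meme_description: str, coarse_scorer: Scorer, fine_scorer: Scorer
) -> str:
    """Stage 1 decides benign vs. hateful; stage 2 runs only for hateful memes."""
    coarse = COARSE_LABELS[argmax(coarse_scorer(meme_description))]
    if coarse == "benign":
        return coarse
    return FINE_LABELS[argmax(fine_scorer(meme_description))]


# Toy stand-ins for real models (assumptions, keyword-based on purpose).
def toy_coarse(text: str) -> Sequence[float]:
    flagged = "slur" in text or "dog whistle" in text
    return (0.1, 0.9) if flagged else (0.8, 0.2)


def toy_fine(text: str) -> Sequence[float]:
    return (0.9, 0.1) if "slur" in text else (0.3, 0.7)


if __name__ == "__main__":
    for description in ("a cat picture", "caption with a slur", "a dog whistle"):
        print(description, "->", classify_hierarchically(description, toy_coarse, toy_fine))
```

The point of the split is that the fine-grained stage never sees memes the first stage considers benign, so each stage can focus on a narrower decision.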

https://arxiv.org/html/2412.20541v1

By author
