Because memes are multimodal, combining text, images, and cultural cues, detecting and explaining hate speech in them is difficult. To address this, we present MemHateCaptioning, a framework that generates interpretable, human-like explanations of the hateful content in memes. It combines Chain-of-Thought prompting with vision-language and large language models (ClipCap, BLIP, and T5) to improve interpretability. Evaluated on the HatReD dataset, MemHateCaptioning reduces hallucinations and contextual errors while outperforming existing models on BLEU and ROUGE-L scores.

https://dl.acm.org/doi/10.1145/3701716.3718385
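The pipeline described above, roughly, captions the meme with a vision-language model and then prompts a language model with a Chain-of-Thought-style instruction to explain the hatefulness. A minimal sketch of that idea is below, assuming Hugging Face's BLIP captioning checkpoint and an instruction-tuned T5 variant (Flan-T5); the model names, prompt wording, and helper functions are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: BLIP caption -> Chain-of-Thought prompt -> T5 explanation.
# Not the paper's implementation; model choices and prompts are assumptions.
from PIL import Image
from transformers import (
    BlipProcessor, BlipForConditionalGeneration,
    AutoTokenizer, AutoModelForSeq2SeqLM,
)

# Step 1: caption the meme image with a vision-language model (BLIP).
blip_processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip_model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption_image(image_path: str) -> str:
    image = Image.open(image_path).convert("RGB")
    inputs = blip_processor(images=image, return_tensors="pt")
    out = blip_model.generate(**inputs, max_new_tokens=40)
    return blip_processor.decode(out[0], skip_special_tokens=True)

# Step 2: combine the caption with the meme's overlaid text and ask an
# instruction-tuned seq2seq model for a step-by-step explanation.
t5_tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
t5_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def explain_meme(image_path: str, meme_text: str) -> str:
    caption = caption_image(image_path)
    prompt = (
        f"Image caption: {caption}\n"
        f"Meme text: {meme_text}\n"
        "Let's think step by step about who is targeted and how the image and "
        "text combine to convey hate, then write a short explanation."
    )
    inputs = t5_tokenizer(prompt, return_tensors="pt")
    out = t5_model.generate(**inputs, max_new_tokens=120)
    return t5_tokenizer.decode(out[0], skip_special_tokens=True)

# Example usage with a hypothetical file and OCR'd overlay text:
# print(explain_meme("meme.jpg", "text extracted from the meme"))
```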