Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate (arXiv)

Jun 5, 2025 #Algorithms

Automated counter-narratives (CNs) are a promising way to combat hate speech online, although concerns remain about their emotive tone, accessibility, and ethical implications. The researchers propose a framework for assessing CNs generated by Large Language Models (LLMs) along four dimensions: persona framing, verbosity and readability, emotive tone, and ethical soundness. Using GPT-4o-Mini, Cohere’s CommandR-7B, and Meta’s LLaMA 3.1-70B, they evaluate three prompting strategies on the MT-Conan and HatEval datasets. They find that LLM-generated CNs are often verbose and pitched at readers with college-level literacy, which limits their accessibility. Emotionally guided prompts yield more empathetic and readable responses, but questions about their effectiveness and safety remain.

https://arxiv.org/abs/2506.04043
