This work investigates the application of Large Language Models (LLMs) to combat hate speech. The researchers carried out the first real-world A/B test evaluating the efficacy of LLM-generated counter-speech. During the trial, they posted 753 artificially generated comments in order to reduce user engagement beneath tweets containing hate speech that targeted Ukrainian immigrants in Poland. The results show that interventions with LLM-generated replies significantly reduced user engagement, particularly for original tweets with at least ten views, where engagement fell by 20%. The article describes the architecture of the authors' automated moderation system, a straightforward metric for gauging user engagement, and the procedures for carrying out such an experiment. The authors also discuss the challenges and ethical issues associated with using generative AI for conversation moderation.

https://aclanthology.org/2024.findings-emnlp.931
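As an illustration of what such an engagement metric might look like, here is a minimal Python sketch. The summary does not spell out the paper's exact formula, so the per-tweet engagement definition (replies + retweets + likes), the view threshold, and the function names below are assumptions for illustration only, not the authors' actual method.

```python
# Hypothetical sketch of an A/B engagement comparison; the metric actually
# used in the paper may differ. "Engagement" is assumed here to be the
# count of user interactions recorded beneath a tweet.

from statistics import mean


def engagement(tweet: dict) -> int:
    """Total interactions under a tweet (assumed definition)."""
    return tweet["replies"] + tweet["retweets"] + tweet["likes"]


def relative_reduction(treated: list[dict], control: list[dict],
                       min_views: int = 10) -> float:
    """Mean engagement drop for treated tweets relative to control,
    restricted to tweets with at least `min_views` views."""
    t = [engagement(x) for x in treated if x["views"] >= min_views]
    c = [engagement(x) for x in control if x["views"] >= min_views]
    return 1.0 - mean(t) / mean(c)


# A 20% decline, as reported for tweets with at least ten views, would
# correspond to relative_reduction(treated, control) ≈ 0.20.
```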