A major problem on social media is online hate, which harms individuals, degrades platforms, and fuels intolerance in society. To combat it, content moderation typically relies on binary classifiers that indicate only how likely a message is to be hateful. These categorical classifiers, however, have limited social research into the factors that shape degrees of hatefulness. Using ChatGPT (GPT-4), this study pursued two goals: (1) developing a continuous, six-interval scale for rating hate messages that scales to large datasets, and (2) comparing GPT-4 against reliable human raters on a small but diverse set of hate messages. Although GPT-4's ratings were substantially harsher than the human raters', the results show parallelism between the two sets of ratings, as well as convergent and discriminant validity. Further analyses comparing the GPT-administered measure and the human ratings against several well-known hate classifiers found that GPT-4's scores correlated more strongly with the human ratings than the other classifiers' scores did. The paper concludes by discussing limitations and implications for future research.
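The measure described above amounts to prompting the model for a numeric hatefulness rating, parsing the reply, and correlating the scores with human ratings. A minimal sketch of that pipeline follows; the prompt wording, the 0–5 anchors, the sample replies, and the Pearson comparison are illustrative assumptions, not the authors' exact protocol:

```python
import re

# Hypothetical rating prompt (the paper's actual instructions may differ).
RATING_PROMPT = (
    "Rate the following message's hatefulness on a 0-5 scale, "
    "where 0 = not hateful at all and 5 = extremely hateful. "
    "Reply with a single integer.\n\nMessage: {msg}"
)

def parse_rating(reply: str) -> int:
    """Extract the first integer 0-5 from a model reply; raise if absent."""
    m = re.search(r"\b([0-5])\b", reply)
    if m is None:
        raise ValueError(f"no 0-5 rating found in: {reply!r}")
    return int(m.group(1))

def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Made-up example: model replies vs. mean human ratings for five messages.
model_scores = [parse_rating(r) for r in ["Rating: 4", "0", "I'd say 2.", "5", "3"]]
human_scores = [3.5, 0.2, 2.1, 4.8, 2.9]
print(round(pearson(model_scores, human_scores), 3))
```

In practice each reply would come from a GPT-4 API call with `RATING_PROMPT`; parsing a single constrained integer keeps the measure continuous-in-aggregate while remaining easy to validate against human raters.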

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5215167
