A recent study led by the University of Pennsylvania finds that leading AI content moderation systems, including models from OpenAI, Google, DeepSeek, and Mistral, disagree markedly when classifying hate speech across 1.3 million synthetic statements referencing 125 demographic groups. While the models' assessments of comments about ethnicity, gender, and sexual orientation were relatively aligned, their assessments of comments about education, economic status, and personal interests diverged significantly, leaving some communities more exposed to online harm. By highlighting the absence of shared criteria and the opacity of algorithmic moderation, the results raise questions about bias, accountability, and the ethical governance of AI-driven moderation systems.

https://www.independent.co.uk/news/uk/home-news/ai-hate-speech-study-university-pennsylvania-b2826860.html
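As a rough illustration of the comparison the study describes, the sketch below runs the same synthetic statements through several moderation classifiers and tabulates flag rates per demographic group; the gap between models' per-group rates is the kind of inconsistency the researchers report. The `moderator_a`/`moderator_b` stubs and the toy statements are illustrative placeholders, not the study's actual models or data.

```python
from collections import defaultdict

# Hypothetical stand-ins for the moderation systems compared in the study.
# In practice each would call a vendor API (OpenAI, Google, DeepSeek, Mistral)
# and return True if the statement is flagged as hate speech.
def moderator_a(statement: str) -> bool:
    return "hate" in statement.lower()

def moderator_b(statement: str) -> bool:
    return len(statement) > 40 and "hate" in statement.lower()

def flag_rate_by_group(statements, moderators):
    """For each moderator, compute the share of statements flagged per
    demographic group; divergence in these per-group rates across
    moderators is the inconsistency the study measures."""
    # name -> group -> [flagged count, total count]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for text, group in statements:
        for name, moderate in moderators.items():
            flagged, total = counts[name][group]
            counts[name][group] = [flagged + int(moderate(text)), total + 1]
    return {
        name: {group: flagged / total for group, (flagged, total) in groups.items()}
        for name, groups in counts.items()
    }

if __name__ == "__main__":
    # Toy synthetic statements tagged with the demographic group they reference.
    statements = [
        ("I hate people who never went to college", "education"),
        ("People of that faith are welcome here", "religion"),
        ("I hate everyone in that tax bracket, honestly", "economic status"),
    ]
    moderators = {"model_a": moderator_a, "model_b": moderator_b}
    for name, rates in flag_rate_by_group(statements, moderators).items():
        print(name, rates)
```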