Exploring Intensities of Hate Speech on Social Media: A Case Study on Explaining Multilingual Models with XAI (JKU Visual Data Science Lab)

To present a more accurate spectrum of severity and overcome the limitations of treating hate speech detection as a binary task (as is typical in sentiment analysis), we classify hate speech into four intensities: no hate, intimidation, offense or discrimination, and promotion of violence. To this end, we first involve 31 users in annotating a dataset in English and German. To promote interpretability and transparency, we integrate our ML system into a dashboard equipped with explainable AI (XAI). In a case study with 40 non-expert moderators, we evaluate the efficacy of the proposed XAI dashboard in supporting content moderation. Our results suggest that assessing hate intensities is important for content moderators, as these can be related to specific penalties. Similarly, XAI seems to be a promising method for improving ML trustworthiness, thereby facilitating moderators’ well-informed decision-making.

https://jku-vds-lab.at/publications/2023_ditox_hate_speech_xai/


