To capture a more accurate spectrum of severity and overcome the limitations of treating hate speech detection as a binary task (as is typical in sentiment analysis), we classify hate speech into four intensities: no hate, intimidation, offense or discrimination, and promotion of violence. To this end, we first involved 31 users in annotating a dataset in English and German. To promote interpretability and transparency, we integrate our machine learning (ML) system into a dashboard equipped with explainable AI (XAI). In a case study with 40 non-expert moderators, we evaluated the efficacy of the proposed XAI dashboard in supporting content moderation. Our results suggest that assessing hate intensities is important for content moderators, as these can be tied to specific penalties. Likewise, XAI appears to be a promising means of improving ML trustworthiness, thereby facilitating moderators' well-informed decision-making. https://jku-vds-lab.at/publications/2023_ditox_hate_speech_xai/
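As a minimal sketch, the four-level intensity scale could be represented in code as an ordered enumeration mapped to moderation actions. Note that the class and penalty names below are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch of the four hate-speech intensity levels described above,
# mapped to example moderation penalties. The penalty mapping is an assumption
# for illustration only; the paper does not prescribe specific actions.
from enum import IntEnum


class HateIntensity(IntEnum):
    """Ordered intensity labels: higher values indicate more severe hate."""
    NO_HATE = 0
    INTIMIDATION = 1
    OFFENSE_OR_DISCRIMINATION = 2
    PROMOTION_OF_VIOLENCE = 3


# Example penalty mapping a moderation dashboard might use (assumed).
PENALTIES = {
    HateIntensity.NO_HATE: "no action",
    HateIntensity.INTIMIDATION: "warning",
    HateIntensity.OFFENSE_OR_DISCRIMINATION: "temporary suspension",
    HateIntensity.PROMOTION_OF_VIOLENCE: "permanent ban",
}


def recommend_penalty(intensity: HateIntensity) -> str:
    """Look up the recommended moderation action for a predicted intensity."""
    return PENALTIES[intensity]
```

Because the labels are ordered rather than binary, a moderator can compare predicted intensities directly (e.g., `HateIntensity.INTIMIDATION < HateIntensity.PROMOTION_OF_VIOLENCE`) and apply graduated penalties accordingly.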