Detecting hate speech on social media is complicated by linguistic diversity and informal expression, along with code-mixing, transliteration, and cultural nuance. Fine-tuned models like BERT are the standard approach, but recent large language models (LLMs) outperform them and may redefine the field. To demonstrate this, the IndoHateMix dataset was introduced; it captures Hindi-English code-mixed content for testing robustness in multilingual settings. Experiments show that LLMs like LLaMA-3.1 consistently beat BERT-based models, even with less data. Their adaptability and generalization suggest a promising shift in combating online hate, and raise the debate over whether to prioritize model development or the expansion of diverse datasets. https://arxiv.org/abs/2506.12744