The study examines how well LLMs identify hate speech across multilingual datasets and different geographic contexts. The researchers propose a new evaluation framework with three dimensions: binary classification of hate speech, geography-aware contextual detection, and robustness to adversarial text. Using 1,000 comments from five regions, the authors evaluate Llama2 (13b), Codellama (7b), and DeepSeekCoder (6.7b). Codellama achieved the highest binary classification recall at 70.6%, with an F1-score of 52.18%, while DeepSeekCoder performed best on geographic awareness, correctly identifying 63 out of 265 locations. Llama2 misclassified 62.5% of manipulated samples, which illustrates the trade-offs between accuracy, contextual understanding, and robustness. By highlighting these strengths and weaknesses, the study lays the groundwork for multilingual hate speech detection systems and offers suggestions for further research and deployment.

https://www.arxiv.org/abs/2502.19612
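To put the reported figures in perspective, here is a minimal sketch of the arithmetic behind them. It assumes the 70.6% recall and 52.18% F1-score refer to the same Codellama binary classification run and that F1 is the usual harmonic mean of precision and recall. The implied precision is derived here for illustration and is not a figure quoted in the summary; the function names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denominator


# Figures quoted in the summary (as fractions).
recall = 0.706
f1 = 0.5218

# Derived value, an assumption-based estimate rather than a reported result.
precision = implied_precision(f1, recall)

# Share of locations correctly identified: 63 out of 265.
location_rate = 63 / 265

print(f"implied precision: {precision:.1%}")
print(f"location hit rate: {location_rate:.1%}")
```

Under these assumptions the implied precision comes out at roughly 41%, meaning the high recall was paid for with many false positives, and 63 of 265 locations is a hit rate of about 24%.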