Research on hate speech detection remains largely focused on English, which limits progress for low-resource languages, many of which have multiple linguistic varieties. This study addresses these gaps by compiling and standardizing a meta-collection of hate speech datasets for European Spanish, then translating it into European Portuguese and two Galician varieties to produce aligned multilingual corpora. These corpora enable new benchmarks for hate speech detection in Iberian languages. Evaluations of large language models in zero-shot, few-shot, and fine-tuning settings demonstrate the value of multilingual and variety-aware approaches and provide a strong basis for further research in underrepresented European contexts.

https://arxiv.org/abs/2510.11167
