This study presents a thorough analysis of multimodal hate speech detection, using both audio and text, in the Dravidian languages Tamil, Telugu, and Malayalam. Code-mixing, limited linguistic resources, and diverse cultural contexts make hate speech harder to detect in these languages. To build a robust multimodal framework, the method combines audio feature extraction with XLM-RoBERTa text representations, followed by feature alignment and fusion. The dataset is divided into labeled classes: a non-hate category plus hate speech based on gender, politics, religion, and personal defamation. In experiments, the proposed model achieves an accuracy of about 85% and a macro F1-score of 0.76.

https://aclanthology.org/2025.dravidianlangtech-1.119
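
The abstract reports accuracy and macro F1 side by side, and the two can diverge when classes are imbalanced. The sketch below is a minimal illustration of that gap, not the authors' evaluation code: the class names follow the abstract, while the label lists and the helper names `accuracy` and `macro_f1` are invented for this example.

```python
from collections import Counter

# Class names follow the abstract; the label lists below are toy data,
# not the paper's dataset or results.
CLASSES = ["non-hate", "gender", "political", "religious", "personal-defamation"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, so every class counts equally."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced set: 16 non-hate samples and one sample per hate class.
y_true = ["non-hate"] * 16 + ["gender", "political", "religious", "personal-defamation"]
# The hypothetical classifier misses the last two minority samples.
y_pred = ["non-hate"] * 16 + ["gender", "political", "non-hate", "non-hate"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred, CLASSES):.3f}")
```

Because macro F1 averages per-class scores without weighting by class size, two missed minority samples pull it well below accuracy in this toy case. That is one plausible reason a model can show high accuracy alongside a lower macro F1.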