This paper presents a computationally efficient approach to hate speech detection, designed for real-time deployment in resource-constrained settings. The proposed three-layer framework integrates a LoRA-tuned BERTweet model, rule-based pre-filtering, and continuous learning. It achieves a macro F1 score of 0.85, which is 94% of the performance of state-of-the-art large language models such as SafePhi, while using a base model roughly 100× smaller (134M vs. 14B parameters). With only 1.87M trainable parameters (1.37% of full fine-tuning) and about two hours of training on a single T4 GPU, the framework outperforms conventional BERT-based methods in accuracy through efficient fine-tuning and dataset unification. The work demonstrates that robust hate speech detection can be deployed under tight resource budgets while keeping performance competitive.
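To give a feel for why LoRA keeps the trainable parameter count so small, here is a minimal sketch of the arithmetic. The encoder shape (12 layers, hidden size 768), the rank of 8, and the choice to adapt only the query and value projections are assumptions made for illustration; the abstract does not state the configuration used, so the adapter count below is not expected to reproduce the reported 1.87M. The function name `lora_trainable_params` is hypothetical.

```python
def lora_trainable_params(d_in, d_out, rank, n_matrices):
    """Parameters added by LoRA.

    Each adapted weight W (d_out x d_in) stays frozen and gains two
    low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return n_matrices * rank * (d_in + d_out)


# Assumed for illustration only (not taken from the paper):
# a 12-layer encoder with hidden size 768, adapting the query and
# value projections in every layer at rank 8.
hidden, layers, rank = 768, 12, 8
adapters = lora_trainable_params(hidden, hidden, rank, n_matrices=2 * layers)
print(adapters)  # 294912

# Size ratio implied by the figures quoted in the abstract.
base_params, llm_params = 134e6, 14e9
size_ratio = llm_params / base_params
print(round(size_ratio, 1))  # 104.5, i.e. roughly 100x
```

The adapter count grows linearly with the rank and with the number of adapted matrices, so a higher rank, more target modules, or a trainable classification head would all raise the total.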

https://arxiv.org/html/2511.06051v1
