The growth of social media interaction in the current digital era exacerbates security risks such as hate speech. Existing research frequently relies on binary classification techniques that demand substantial processing resources, and it ignores multiclass imbalance in datasets. To address class imbalance, this study presents a Contextualized Word Insertion (CWI) data augmentation technique based on XLM-RoBERTa. It also uses the lightweight TinyLLaMA model to classify hate speech from both text and emojis on low-resource devices. During preprocessing, the data is cleaned and emojis are converted into strings. During training, LoRA is used to reduce the number of trainable parameters without sacrificing performance. Experiments show that the proposed approach outperforms existing techniques.
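To make the preprocessing step concrete, here is a minimal sketch of how emojis might be turned into strings while the text is cleaned. The abstract does not say which library or rules the authors use, so everything here is an assumption: the function names `emoji_to_text` and `clean` are hypothetical, the Unicode character names from Python's standard `unicodedata` module stand in for whatever emoji mapping the paper applies, and the URL and mention stripping is only one plausible cleaning policy.

```python
import re
import unicodedata


def emoji_to_text(text: str) -> str:
    """Replace symbol characters (most emoji) with a :name: token.

    Assumption: the Unicode category "So" (Symbol, other) is used as a
    rough emoji test. It also matches some non-emoji symbols and misses
    skin-tone modifiers, so this is an approximation only.
    """
    pieces = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            if name:
                token = name.lower().replace(" ", "_")
                pieces.append(f" :{token}: ")
                continue
        pieces.append(ch)
    return "".join(pieces)


def clean(text: str) -> str:
    """Hypothetical cleaning pipeline: strip URLs and mentions, convert
    emojis to strings, normalize whitespace, and lowercase."""
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop user mentions
    text = emoji_to_text(text)
    # Remove the variation selector and split zero-width-joiner sequences.
    text = text.replace("\ufe0f", "").replace("\u200d", " ")
    return re.sub(r"\s+", " ", text).strip().lower()


if __name__ == "__main__":
    print(clean("I hate this \U0001F621 @user https://example.com/x"))
```

Converting each emoji into a word-like token lets a text-only model such as TinyLLaMA see the emoji's meaning through its ordinary tokenizer, which is presumably the motivation for this step; the exact token format used in the paper may differ.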

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5400860

