This article introduces a 3-class dataset named Pars-HAO, consisting of 8013 tweets, to fill the gap in existing research. We collected the dataset by combining comments from pages that are more exposed to hate speech and using a keyword-based approach. Three annotators then labeled the tweets. In this study, we employed a combination of the Convolutional Neural Network (CNN) model and two widely recognized machine learning models, namely Support Vector Machine (SVM) and Logistic Regression (LR), as a baseline. To improve the classification performance, we employed the Hard Voting ensemble learning technique. Experimental results on the Pars-HAO dataset demonstrated that the Hard Voting ensemble learning technique yielded the best outcome, achieving a macro F1-score of 68.76%.
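
To make the two technical ideas in the abstract concrete, here is a minimal Python sketch of hard voting over three classifiers and of the macro F1-score. It is an illustration under stated assumptions, not the authors' implementation: the class names (`hate`, `offensive`, `neither`), the tie-breaking rule, the function names, and the toy predictions are all hypothetical and do not come from the Pars-HAO paper.

```python
from collections import Counter

# Assumed class names for a 3-class setup; the paper's actual labels may differ.
LABELS = ("hate", "offensive", "neither")


def hard_vote(per_model_preds):
    """Return the majority label per sample across models.

    per_model_preds is a list of prediction lists, one per model.
    Ties are broken in favour of the earliest model in the list
    (an assumption; the source does not state a tie-breaking rule).
    """
    voted = []
    for votes in zip(*per_model_preds):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote, in model order, that reaches the top count.
        voted.append(next(v for v in votes if counts[v] == top))
    return voted


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up predictions standing in for the CNN, SVM, and LR outputs.
y_true = ["hate", "offensive", "neither", "neither", "offensive"]
cnn_preds = ["hate", "offensive", "neither", "offensive", "neither"]
svm_preds = ["hate", "neither", "neither", "neither", "offensive"]
lr_preds = ["offensive", "offensive", "hate", "neither", "offensive"]

ensemble_preds = hard_vote([cnn_preds, svm_preds, lr_preds])
print("ensemble:", ensemble_preds)
print("macro F1 (CNN only):", macro_f1(y_true, cnn_preds))
print("macro F1 (ensemble):", macro_f1(y_true, ensemble_preds))
```

In this toy data each model errs on different samples, so the majority vote recovers labels that any single model gets wrong. Macro averaging weights every class equally regardless of its frequency, which is why it is a common choice for imbalanced hate-speech datasets.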

https://www.techrxiv.org/articles/preprint/Pars-HAO_Hate_Speech_and_Offensive_Language_Detection_on_Persian_Social_Media_Using_Ensemble_Learning/24106617

