This study combines text embedding techniques such as TF-IDF, CBOW, and GloVe with text pre-processing and post-processing to train two machine learning classifiers: Random Forest and Support Vector Machine (SVM). The main objective is to classify the tweets in a Twitter dataset as hate speech or not, and then to categorize them further along two dimensions: aggressive versus non-aggressive and targeted versus non-targeted. Model effectiveness is evaluated for each pairing of text embedding and classification task. The Random Forest classifier performs particularly well at classifying hate speech when paired with TF-IDF embeddings. Combining GloVe embeddings with the SVM algorithm yields high accuracy in differentiating between targeted, non-targeted, aggressive, and non-aggressive categories. CBOW embeddings are also effective at classifying hate speech more broadly.
https://jart.icat.unam.mx/index.php/jart/article/view/2466
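
To make the TF-IDF representation mentioned in the abstract concrete, the following is a minimal sketch of how tokenized tweets can be turned into weighted vectors that a classifier such as Random Forest could consume. The function name `tfidf_vectors`, the toy tweets, and the specific weighting variant (length-normalized term frequency with a smoothed inverse document frequency) are assumptions for illustration; the abstract does not state which variant or tooling the study used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) of TF-IDF weights for tokenized docs.

    Assumed variant: term frequency normalized by document length, and
    smoothed IDF = log((1 + N) / (1 + df)) + 1. This is one common choice,
    not necessarily the one used in the study.
    """
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Toy, made-up tweets (hypothetical; not taken from the study's dataset).
tweets = [
    "you people are awful".split(),
    "what a lovely day".split(),
    "awful awful weather today".split(),
]
vocab, vectors = tfidf_vectors(tweets)
```

Each row of `vectors` has one weight per vocabulary term. Terms that appear in fewer tweets receive a larger weight than equally frequent terms that appear in many tweets, which is the property that makes TF-IDF useful as classifier input.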