The researchers used a crowdsourced hate speech lexicon to collect tweets containing hate speech terms. These tweets were divided into three categories: hate speech, offensive language, and neither. A multi-class classifier was trained to distinguish between these categories. Analyzing its predictions and errors showed when hate speech can be reliably separated from offensive language and when the distinction is harder to draw. According to the research, sexist tweets are generally classified as offensive, while racist and homophobic tweets are more often classified as hate speech. Tweets without overtly hateful language are especially difficult to categorize.
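To make the three-way setup concrete, here is a minimal sketch of a multi-class text classifier. It is not the researchers' model: it assumes a simple multinomial Naive Bayes with add-one smoothing, written with the Python standard library only. The label names, the helper functions (`tokenize`, `train`, `predict`), and the toy examples are all hypothetical, and the placeholder tokens INSULT, SLUR, and GROUP stand in for lexicon terms.

```python
import math
from collections import Counter, defaultdict

# Hypothetical label names for the three categories described above.
LABELS = ("neither", "offensive", "hate")

# Hypothetical toy data; INSULT, SLUR and GROUP are placeholders, not real terms.
TOY_EXAMPLES = [
    ("lovely weather for a walk today", "neither"),
    ("watching the game with friends tonight", "neither"),
    ("that referee is a total INSULT", "offensive"),
    ("shut up you INSULT nobody asked", "offensive"),
    ("all GROUP people are SLUR and should leave", "hate"),
    ("i cannot stand those SLUR GROUP people", "hate"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count documents and tokens per label for multinomial Naive Bayes."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return {
        "doc_counts": doc_counts,
        "word_counts": word_counts,
        "vocab": vocab,
        "n_docs": len(examples),
    }


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label in LABELS:
        if model["doc_counts"][label] == 0:
            continue  # label never seen in training
        # Log prior: share of training documents carrying this label.
        score = math.log(model["doc_counts"][label] / model["n_docs"])
        total_tokens = sum(model["word_counts"][label].values())
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore tokens never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = model["word_counts"][label][token]
            score += math.log((count + 1) / (total_tokens + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_EXAMPLES)
print(predict(model, "you are a INSULT"))
```

A bag-of-words model like this leans entirely on the terms it has seen, which gives some intuition for the last finding: a tweet with no overtly hateful or offensive vocabulary offers the classifier very little to separate the categories.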

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5148370
