Building on a rigorous definition of hate speech, we present a new method for labeling hate speech data using an LLM with a Chain-of-Thought prompt. We apply this approach to relabel 5 widely used training datasets and evaluate them on 4 test sets. Of the 20 resulting train/test combinations, 17 show improved performance, for an overall improvement of 18%. We also relabel the test sets with the LLM, followed by human validation. Evaluated on these relabeled test sets, performance improves in 19 out of 20 cases, for an overall improvement of 25%.
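
The relabeling step described above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `HATE`/`NOT_HATE` label strings, and the function names `build_cot_prompt`, `parse_label`, and `relabel` are all assumptions, and the LLM is passed in as a plain callable so that no particular API is implied. The hate speech definition is left as a parameter because the abstract does not state it.

```python
import re
from typing import Callable, List, Optional


def build_cot_prompt(definition: str, text: str) -> str:
    """Assemble a chain-of-thought labeling prompt (wording is hypothetical)."""
    return (
        f"Definition of hate speech:\n{definition}\n\n"
        f"Text:\n{text}\n\n"
        "Reason step by step about whether the text meets the definition, "
        "then finish with a line of the form 'Label: HATE' or 'Label: NOT_HATE'."
    )


# NOT_HATE is listed first so it is not partially matched as HATE.
_LABEL_RE = re.compile(r"Label:\s*(NOT_HATE|HATE)\b", re.IGNORECASE)


def parse_label(response: str) -> Optional[str]:
    """Return the last label stated in the response, or None if absent."""
    matches = _LABEL_RE.findall(response)
    return matches[-1].upper() if matches else None


def relabel(
    texts: List[str], definition: str, llm: Callable[[str], str]
) -> List[Optional[str]]:
    """Relabel each text by prompting the supplied LLM callable."""
    return [parse_label(llm(build_cot_prompt(definition, t))) for t in texts]
```

Taking the last `Label:` line lets the model reason freely before committing to an answer. Responses without a parseable label come back as `None`, so they can be routed to human review rather than silently assigned a class.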

https://openreview.net/forum?id=D5OyxbCeiSZp

