Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean (LREC-COLING 2024)

Authors

  • Seungyoon Lee, Chanjun Park, DaHyun Jung, Hyeonseok Moon, Jaehyung Seo, Sugyeong Eo and Heuiseok Lim

Abstract

Counter-narrative generation, i.e., the generation of fact-based responses to hate speech with the aim of correcting discriminatory beliefs, has been demonstrated to be an effective method to combat hate speech. However, its effectiveness is limited by the resource-intensive nature of dataset construction, and existing work focuses only on the primary language. To alleviate this problem, we propose Korean Hate Speech Counter Punch (KHSCP), a cost-effective counter-narrative generation method for the Korean language. To this end, we construct the first hate speech counter-narrative dataset in Korean and pose two research questions. Under these questions, we propose an effective augmentation method and investigate how reasonable it is to use a large-scale language model to overcome data scarcity in low-resource environments by utilizing existing resources. We conduct several experiments to verify the effectiveness of the proposed method. The results reveal that performance can be improved by applying pre-existing resources. Through in-depth analysis of these experiments, this work suggests that the challenges of generating counter-narratives in low-resource environments can be overcome.

Check out this link for more information on our paper.