Training Data from: Political Hate Speech Detection and Lexicon Building: A Study in Taiwan (IEEE Xplore 2022)
Source: R/data.R
hatespeech_zh_tw.Rd
This dataset is a sample derived from the development set of "Political Hate Speech Detection and Lexicon Building: A Study in Taiwan." It contains 1,000 annotated entries, of which 926 are labeled '0' (not hate speech) and 74 are labeled '1' (hate speech).
The paper can be accessed at https://ieeexplore.ieee.org/document/9738642.
Usage
data("hatespeech_zh_tw")
Format
A data frame with 1,000 rows and 2 variables:
- text
Content of the text.
- label
Label indicating whether the text is hate speech: '1' for hate speech and '0' for non-hate speech.
Source
Data provided by the authors Chih-Chien Wang, Min-Yuh Day, and Chun-Lian Wu. Available at https://ieeexplore.ieee.org/document/9738642.
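Examples
The label counts described above (926 versus 74) mean that only 7.4% of entries are hate speech, which is worth checking before training a classifier. The following is a minimal sketch of such a check. `summarize_labels()` is a hypothetical helper, not a function of this package, and `toy` is a placeholder data frame that only mimics the documented structure and label counts; it contains no real text from the dataset. The storage type of `label` is not documented here, so the helper converts it to character first.

```r
# Hypothetical helper (not part of this package): count each label and
# compute its share of all rows in a data frame with `text` and `label`.
summarize_labels <- function(df) {
  stopifnot(is.data.frame(df), all(c("text", "label") %in% names(df)))
  # as.character() makes this work whether labels are numeric, character, or factor
  counts <- table(as.character(df$label))
  data.frame(
    label = names(counts),
    n = as.integer(counts),
    share = as.numeric(counts) / nrow(df),
    stringsAsFactors = FALSE
  )
}

# Placeholder data with the documented shape and label counts.
# The strings are dummies, not rows from the real dataset.
toy <- data.frame(
  text = sprintf("placeholder %d", 1:1000),
  label = rep(c(0, 1), times = c(926, 74)),
  stringsAsFactors = FALSE
)

summarize_labels(toy)

# With the package attached, the same helper could be applied to the real data:
# data("hatespeech_zh_tw")
# summarize_labels(hatespeech_zh_tw)
```

Because the positive class is so small, a stratified train/test split or class weighting may be a sensible follow-up, depending on the model used.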