Training Data from: Political Hate Speech Detection and Lexicon Building: A Study in Taiwan (IEEE Xplore, 2022)

This dataset is a sample derived from the development set of "Political Hate Speech Detection and Lexicon Building: A Study in Taiwan." It contains 1,000 annotated entries: 926 labeled '0' (not hate speech) and 74 labeled '1' (hate speech).

The paper can be accessed at https://ieeexplore.ieee.org/document/9738642.

Usage

data("hatespeech_zh_tw")

Format

A data frame with 2 variables:

text: The text of the entry.
label: Whether the text is hate speech: '1' for hate speech, '0' for not hate speech.
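
Because only 74 of the 1,000 entries (7.4%) carry the label '1', it can help to check the label distribution before modeling. The following is a minimal sketch, not part of the package: `summarise_labels` is a hypothetical helper, and `standin` is a placeholder data frame that only mimics the documented columns and class counts (its text values are not real entries, and the stored type of `label` in the real data is an assumption).

```r
# Stand-in data frame with the documented columns (text, label) and the
# documented class counts (926 zeros, 74 ones). The text values are
# placeholders, not real entries from the dataset.
standin <- data.frame(
  text = paste("placeholder text", seq_len(1000)),
  label = c(rep(0, 926), rep(1, 74)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count and share of each label value.
# table() works whether label is stored as numeric, character, or factor.
summarise_labels <- function(df) {
  counts <- table(df$label)
  data.frame(
    label = names(counts),
    n = as.integer(counts),
    share = as.numeric(counts) / sum(counts),
    stringsAsFactors = FALSE
  )
}

label_summary <- summarise_labels(standin)
print(label_summary)
```

With the package attached and the data loaded, the same helper could be applied as `summarise_labels(hatespeech_zh_tw)`.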

Source

Data provided by the authors Chih-Chien Wang, Min-Yuh Day, and Chun-Lian Wu. Available at https://ieeexplore.ieee.org/document/9738642.

Examples

if (FALSE) { # \dontrun{
data(hatespeech_zh_tw)
head(hatespeech_zh_tw)
} # }