This dataset is a sample derived from the development set of "Political Hate Speech Detection and Lexicon Building: A Study in Taiwan." It contains 1,000 annotated entries, of which 926 are labeled '0' (not hate speech) and 74 are labeled '1' (hate speech).

The paper can be accessed at https://ieeexplore.ieee.org/document/9738642.

Usage

data("hatespeech_zh_tw")

Format

A data frame with 2 variables:

text

Content of the text.

label

Label indicating whether the text is hate speech: '1' for hate speech and '0' for non-hate speech.

Source

Data provided by the authors Chih-Chien Wang, Min-Yuh Day, and Chun-Lian Wu. Available at https://ieeexplore.ieee.org/document/9738642.

Examples

# Requires the package that ships this dataset to be attached.
# Load the dataset and inspect the first rows.
data("hatespeech_zh_tw")
head(hatespeech_zh_tw)
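
Because the labels are heavily imbalanced, it can help to check the class distribution before modeling. The following is a minimal sketch, not part of the package: `label_summary()` is a hypothetical helper, and `toy` is stand-in data with placeholder text that only mirrors the documented shape (columns `text` and `label`) and label counts. It assumes `label` holds the values 0 and 1, whether stored as numbers, characters, or a factor.

```r
# Hypothetical helper (not provided by the package): count and share per label
# for a data frame shaped like hatespeech_zh_tw (columns `text` and `label`).
label_summary <- function(df) {
  stopifnot(is.data.frame(df), all(c("text", "label") %in% names(df)))
  # Fix the levels so both classes appear even if one is absent
  counts <- table(factor(df$label, levels = c(0, 1)))
  data.frame(
    label = names(counts),
    n = as.integer(counts),
    share = as.numeric(counts) / nrow(df)
  )
}

# Stand-in data (placeholder text, NOT the real dataset) with the documented
# label counts: 926 entries labeled 0 and 74 labeled 1.
toy <- data.frame(
  text = paste("placeholder text", seq_len(1000)),
  label = c(rep(0, 926), rep(1, 74))
)

label_summary(toy)
```

With the real data attached, the same call would be `label_summary(hatespeech_zh_tw)`. Given the counts stated above, roughly 7.4% of entries carry the hate speech label, so plain accuracy is likely to be a misleading metric for classifiers trained on this sample.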