Ice and Fire: Dataset on Sentiment, Emotions, Toxicity, Sarcasm, Hate speech, Sympathy and More in Icelandic Blog Comments

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This study introduces "Ice and Fire," a Multi-Task Learning (MTL) dataset tailored for sentiment analysis in the Icelandic language. It encompasses a wide range of linguistic tasks, including sentiment and emotion detection, as well as the identification of toxicity, hate speech, encouragement, sympathy, sarcasm/irony, and trolling. With 261 fully annotated blog comments and 1,045 comments annotated in at least one task, this contribution marks a significant step forward in the field of Icelandic natural language processing. The dataset provides a comprehensive resource for understanding the nuances of online communication in Icelandic and an interface to expand the annotation effort. Despite the challenges inherent in subjective interpretation of text, our findings highlight the positive potential of this dataset to improve text analysis techniques and encourage more inclusive online discourse in Icelandic communities. With promising baseline performances, "Ice and Fire" sets the stage for future research to enhance automated text analysis and develop sophisticated language technologies, contributing to healthier online environments and advancing Icelandic language resources.

Original language: English
Title of host publication: TRAC 2024
Subtitle of host publication: 4th Workshop on Threat, Aggression and Cyberbullying at LREC-COLING 2024 - Workshop Proceedings
Editors: Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, Bharathi Raja Chakravarthi, Bornini Lahiri, Siddharth Singh, Shyam Ratan
Publisher: European Language Resources Association (ELRA)
Pages: 73-84
Number of pages: 12
ISBN (Electronic): 9782493814470
Publication status: Published - 2024
Event: 4th Workshop on Threat, Aggression and Cyberbullying, TRAC 2024 - Torino, Italy
Duration: 20 May 2024 → …

Publication series

Name: TRAC 2024: 4th Workshop on Threat, Aggression and Cyberbullying at LREC-COLING 2024 - Workshop Proceedings

Conference

Conference: 4th Workshop on Threat, Aggression and Cyberbullying, TRAC 2024
Country/Territory: Italy
City: Torino
Period: 20/05/24 → …

Bibliographical note

Publisher Copyright: © 2024 ELRA Language Resource Association.

Other keywords

  • Icelandic Language Resources
  • Multi-Task Learning
  • Sentiment Analysis
