Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accuracy of 92.82%. Furthermore, by adding data from a morphological database and word embeddings induced from an unannotated corpus, we increase the accuracy to 93.84%. This is equivalent to an error reduction of 5.5% relative to the previous best tagger for Icelandic, which combines linguistic rules with a Hidden Markov Model.
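The 5.5% figure is a relative error reduction measured against the earlier tagger, not against the 92.82% configuration reported in the same abstract. The sketch below spells out that arithmetic. The function names are illustrative and not taken from the paper. The abstract does not state the earlier tagger's accuracy, so the value computed here is only what the rounded 5.5% figure implies.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Relative error reduction in percent, given two accuracies in percent."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc, reduction):
    """Baseline accuracy in percent implied by a new accuracy and a
    relative error reduction, both given in percent."""
    new_err = 100.0 - new_acc
    baseline_err = new_err / (1.0 - reduction / 100.0)
    return 100.0 - baseline_err


# Gain within this paper: 92.82% -> 93.84% (both figures from the abstract).
internal_reduction = relative_error_reduction(92.82, 93.84)

# Accuracy of the earlier tagger implied by the reported 5.5% reduction.
# Assumption: back-computed from rounded figures, not stated in the abstract.
implied_previous_best = implied_baseline_accuracy(93.84, 5.5)

print(f"{internal_reduction:.1f}% error reduction within the paper")
print(f"{implied_previous_best:.2f}% implied accuracy of the earlier tagger")
```

Under these assumptions, the step from 92.82% to 93.84% corresponds to an error reduction of roughly 14%, while the reported 5.5% reduction implies that the earlier tagger scored roughly 93.5%.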
Original language: English
Title of host publication: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)
Place of publication: Oslo, Norway
Publisher: Linköping University Electronic Press, Sweden
Pages: 105-119
Number of pages: 15
Publication status: Published - 1 May 2013
