How do treebank annotation schemes influence parsing results? : or how not to compare apples and oranges

In the last decade, the Penn treebank has become the standard data set for evaluating parsers. Because most parsers are evaluated solely on this data set, it remains an open question how much their results depend on the treebank's annotation scheme. In this paper, we investigate how different decisions in the annotation schemes of treebanks influence parsing. The investigation compares two similar treebanks of German, NEGRA and TüBa-D/Z, which are subsequently modified to allow a comparison of the differences. The results show that deleted unary nodes and a flat phrase structure have a negative influence on parsing quality, while a flat clause structure has a positive influence.
Metadata
Author:Sandra Kübler
URN:urn:nbn:de:hebis:30-1110588
URL:http://cl.indiana.edu/~skuebler/papers/treebanks.pdf
ISBN:954-91743-3-6
Editors:Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Nicolas Nicolov, Nikolai Nikolov
Document type:Preprint
Language:English
Year of completion:2005
Year of first publication:2005
Publishing institution:Universitätsbibliothek Johann Christian Senckenberg
Release date:21.10.2008
Page count:8
Note:
Published in: Galia Angelova ; Kalina Bontcheva ; Ruslan Mitkov ; Nicolas Nicolov ; Nikolai Nikolov (eds.): International conference recent advances in natural language processing : proceedings, Borovets, Bulgaria, 21-23 September 2005, Shoumen : Incoma, 2005, pp. 293-300, ISBN: 954-91743-3-6
Source:http://jones.ling.indiana.edu/~skuebler/papers/treebanks.pdf ; (in:) Proceedings of RANLP 2005 - Borovets, 2005.
HeBIS-PPN:206763557
Institutes:No department specified / External
DDC classification:4 Language / 40 Language / 400 Language
Collections:Linguistics
Linguistics classification:Computational linguistics
License (German):German copyright law