
Is it really that difficult to parse German?

  • This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show that parsing performance differs substantially depending on whether the parser is trained on Negra or on TüBa-D/Z. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English obtained with the Stanford parser trained on the Penn Treebank. This comparison at least suggests that German is not harder to parse than English, its West Germanic neighbor.
Metadata
Authors: Sandra Kübler, Erhard Hinrichs, Wolfgang Maier
URN: urn:nbn:de:hebis:30-1110601
URL: http://cl.indiana.edu/~skuebler/papers/parsegerman.pdf
ISBN: 1-932432-73-6
Document type: Preprint
Language: English
Year of completion: 2006
Year of first publication: 2006
Publishing institution: Universitätsbibliothek Johann Christian Senckenberg
Release date: 21.10.2008
Number of pages: 9
Note:
Published in: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), Stroudsburg, PA: Association for Computational Linguistics, 2006, pp. 111–119, ISBN: 1-932432-73-6
Source: http://jones.ling.indiana.edu/~skuebler/papers/parsegerman.pdf ; (in:) Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP 2006 - Sydney, 2006.
HeBIS-PPN: 206767889
Institutes: No department specified / External
DDC classification: 4 Language / 40 Language / 400 Language
Collections: Linguistics
Linguistics classification: Computational linguistics
License (German): Deutsches Urheberrecht (German copyright law)