TüSBL : a similarity-based chunk parser for robust syntactic processing
- Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. The TüSBL parser extends current chunk parsing techniques with a tree-construction component that expands partial chunk parses into complete tree structures, including recursive phrase structure as well as function-argument structure. TüSBL's tree construction algorithm relies on techniques from memory-based learning that allow similarity-based classification of a given input structure relative to a pre-stored set of tree instances from a fully annotated treebank. A quantitative evaluation of TüSBL has been conducted using a semi-automatically constructed treebank of German that consists of approximately 67,000 fully annotated sentences. The basic PARSEVAL measures were used, although they were developed for parsers whose main goal is a complete analysis that spans the entire input. This runs counter to the basic philosophy underlying TüSBL, which has as its main goal robustness of partially analyzed structures.
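
To make the tension between PARSEVAL and partial analyses concrete, here is a minimal sketch of labeled bracket precision, recall, and F1. It is not taken from the paper: the function name `parseval`, the `(label, start, end)` bracket representation, and the toy sentence are assumptions chosen for illustration, and details such as the treatment of punctuation or preterminals in real PARSEVAL scoring are left out.

```python
from collections import Counter


def parseval(gold, predicted):
    """Return labeled (precision, recall, F1) over constituent brackets.

    Each bracket is a hypothetical (label, start, end) tuple over token
    offsets. Multiset intersection is used so that duplicate brackets
    are matched at most as often as they occur in the gold tree.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    matched = sum((gold_counts & pred_counts).values())

    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy gold tree for a five-token sentence (assumed example).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]

# A partial analysis: both noun chunks are correct, but no spanning
# S or VP node has been built.
partial = [("NP", 0, 2), ("NP", 3, 5)]

precision, recall, f1 = parseval(gold, partial)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case every bracket the parser proposes is correct, yet recall is only one half because the spanning nodes are missing. This is the sense in which measures designed for complete analyses can understate the quality of a robust partial parse.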
Author: | Sandra Kübler, Erhard Hinrichs |
---|---|
URN: | urn:nbn:de:hebis:30-1110508 |
URL: | http://cl.indiana.edu/~skuebler/papers/hlt01.ps |
Editor: | Morgan Kaufmann |
Document Type: | Preprint |
Language: | English |
Year of Completion: | 2001 |
Year of first Publication: | 2001 |
Publishing Institution: | Universitätsbibliothek Johann Christian Senckenberg |
Release Date: | 2008/10/21 |
Tag: | chunk parsing; robust parsing; similarity-based learning |
GND Keyword: | Satzanalyse |
Page Number: | 6 |
Note: | Published in: Morgan Kaufmann (ed.): Proceedings of the First International Conference on Human Language Technology Research, HLT 2001, San Diego, California, USA, March 18-21, 2001 |
Source: | http://jones.ling.indiana.edu/~skuebler/papers/hlt01.ps ; Proceedings of HLT 2001, (San Diego, California 2001). |
HeBIS-PPN: | 206753691 |
Institutes: | Department not specified / External |
Dewey Decimal Classification: | 4 Language / 40 Language / 400 Language |
Collections: | Linguistics |
Linguistics Classification: | Computerlinguistik / Computational linguistics |
Licence (German): | Deutsches Urheberrecht (German copyright law) |