TüSBL : a similarity-based chunk parser for robust syntactic processing

Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. The TüSBL parser extends current chunk parsing techniques with a tree-construction component that expands partial chunk parses into complete tree structures, including recursive phrase structure as well as function-argument structure. TüSBL's tree construction algorithm relies on techniques from memory-based learning that allow similarity-based classification of a given input structure relative to a pre-stored set of tree instances from a fully annotated treebank. A quantitative evaluation of TüSBL has been conducted using a semi-automatically constructed treebank of German that consists of approximately 67,000 fully annotated sentences. The basic PARSEVAL measures were used, although they were developed for parsers whose main goal is a complete analysis that spans the entire input. This runs counter to the basic philosophy underlying TüSBL, which has as its main goal robustness of partially analyzed structures.
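
The tension between PARSEVAL and partial analyses can be illustrated with a minimal sketch of labeled bracket precision and recall. This is not the evaluation code used for TüSBL; the function name `parseval`, the `(label, start, end)` bracket representation, and the toy sentence spans are assumptions made for illustration only.

```python
from collections import Counter


def parseval(gold, predicted):
    """Labeled bracket precision, recall and F1.

    Each bracket is a hypothetical (label, start, end) tuple; this
    representation is an assumption, not taken from the paper.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a predicted bracket only counts as correct
    # as often as it occurs in the gold tree.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy gold tree with a sentence node spanning the whole input.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
# A partial analysis: both noun phrases are correct, but no spanning
# structure has been built.
partial = [("NP", 0, 2), ("NP", 3, 5)]

p, r, f = parseval(gold, partial)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy case every bracket the partial analysis proposes is correct, so precision is perfect, yet recall is halved because the brackets that span the whole input are missing. That is the sense in which measures designed for complete analyses understate the quality of a robust partial parse.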

Metadata
Author:Sandra Kübler, Erhard Hinrichs
URN:urn:nbn:de:hebis:30-1110508
Document Type:Article
Language:English
Date of Publication (online):2008/10/21
Year of first Publication:2001
Publishing Institution:Univ.-Bibliothek Frankfurt am Main
Release Date:2008/10/21
Tag:chunk parsing ; robust parsing ; similarity-based learning
SWD-Keyword:Satzanalyse
Source:http://jones.ling.indiana.edu/~skuebler/papers/hlt01.ps ; Proceedings of HLT 2001 (San Diego, California, 2001).
HeBIS PPN:206753691
Dewey Decimal Classification:400 Language
Collections:Linguistics
Linguistic-Classification:Linguistik-Klassifikation: Computerlinguistik / Computational linguistics
Licence (German):Veröffentlichungsvertrag für Publikationen
