3 search hits
-
Proceedings of the LREC workshop on partial parsing : between chunk parsing and deep parsing
(2008)
-
Sandra Kübler
Jakub Piskorski
Adam Przepiórkowski
-
Corpora and evaluation tools for multilingual named entity grammar development
(2003)
-
Christian Bering
Witold Drożdżyński
Gregor Erbach
Clara Guasch
Petr Homola
Sabine Lehmann
Hong Li
Hans-Ulrich Krieger
Jakub Piskorski
Ulrich Schäfer
Atsuko Shimada
Melanie Siegel
Feiyu Xu
Dorothee Ziegler-Eisele
- We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC-7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time, and date expressions. Subgrammars and gazetteers are shared as much as possible among the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.
-
An Integrated Architecture for Shallow and Deep Processing
(2002)
-
Berthold Crysmann
Anette Frank
Bernd Kiefer
Stefan Müller
Günter Neumann
Jakub Piskorski
Ulrich Schäfer
Melanie Siegel
Hans Uszkoreit
Feiyu Xu
Markus Becker
Hans-Ulrich Krieger
- We present an architecture for the integration of shallow and deep NLP components which is aimed at the flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for real-world German text benefit from a deep grammatical analysis.