OPUS 4 | Search

2 search hits

Corpora and evaluation tools for multilingual named entity grammar development (2003)

Bering, Christian ; Droźdźyński, Witold ; Erbach, Gregor ; Guasch, Clara ; Homola, Petr ; Lehmann, Sabine ; Li, Hong ; Krieger, Hans-Ulrich ; Piskorski, Jakub ; Schäfer, Ulrich ; Shimada, Atsuko ; Siegel, Melanie ; Xu, Feiyu ; Ziegler-Eisele, Dorothee

We present work on developing multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, and time and date expressions. Subgrammars and gazetteers are shared as much as possible across the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.

Lernerkorpora : Ressourcen für die Deutsch-als-Fremdsprache-Forschung (2010)

Schmidt, Karin

The article addresses the growing importance of corpus-based research on the acquisition of German as a foreign language. German corpora in general and learner corpora in particular are briefly introduced. A short overview of existing German learner corpora is followed by a detailed description of Falko, an error-annotated corpus of advanced learner German, which is accessible via the internet (without prior registration) and free of charge. Finally, a short example analysis demonstrates some of the functionalities of Falko. The aim of the article is to encourage researchers to employ corpora as helpful tools in their own work.
