TY  - CONF
A1  - Bering, Christian
A1  - Droźdźyński, Witold
A1  - Erbach, Gregor
A1  - Guasch, Clara
A1  - Homola, Petr
A1  - Lehmann, Sabine
A1  - Li, Hong
A1  - Krieger, Hans-Ulrich
A1  - Piskorski, Jakub
A1  - Schäfer, Ulrich
A1  - Shimada, Atsuko
A1  - Siegel, Melanie
A1  - Xu, Feiyu
A1  - Ziegler-Eisele, Dorothee
T1  - Corpora and evaluation tools for multilingual named entity grammar development
N2  - We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.
KW  - Computational linguistics
KW  - Corpus
Y1  - 2011
UR  - http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/23570
UR  - https://nbn-resolving.org/urn:nbn:de:hebis:30:3-235707
UR  - http://www.melaniesiegel.de/publications/Bering-et-al-2003.pdf
N1  - First published in: Archer et al. (eds.): Proceedings of the 2003 Corpus Linguistics Conference. - Lancaster, Lancaster University, pp. 42-52
ER  -