Linguistik-Klassifikation
Refine
Year of publication
Document Type
- Conference Proceeding (35) (remove)
Has Fulltext
- yes (35)
Is part of the Bibliography
- no (35)
Keywords
- Computerlinguistik (20)
- Japanisch (10)
- Maschinelle Übersetzung (9)
- Grammatik (3)
- Implementierung <Informatik> (3)
- Korpus <Linguistik> (3)
- Parser (3)
- Standardisierung (3)
- Technische Unterlage (3)
- Höflichkeitsform (2)
Institute
- Extern (22)
- Universitätsbibliothek (1)
We present a novel well-formedness condition for underspecified semantic representations which requires that every correct MRS representation must be a net. We argue that (almost) all correct MRS representations are indeed nets, and apply this condition to identify a set of eleven rules in the English Resource Grammar (ERG) with bugs in their semantics component. Thus we demonstrate that the net test is useful in grammar debugging.
We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for realworld German text benefit from a deep grammatical analysis.
This paper proposes an annotating scheme that encodes honorifics (respectful words). Honorifics are used extensively in Japanese, reflecting the social relationship (e.g. social ranks and age) of the referents. This referential information is vital for resolving zero
pronouns and improving machine translation outputs. Annotating honorifics is a complex task that involves identifying a predicate with honorifics, assigning ranks to referents of the
predicate, calibrating the ranks, and connecting referents with their predicates.
Der Beitrag behandelt zunächst die Frage, welche Vorteile elektronische Wörterbücher gegenüber traditionell gedruckten Wörterbüchern besitzen. Danach werden drei Online-Programme zur automatischen Übersetzung (Babelfish, Google Übersetzer, Bing Translator) vorgestellt. Beispieltexte werden mit diesen Programmen übersetzt, danach wird die jeweilige Qualität der Übersetzungen beurteilt. Schließlich diskutiert der Beitrag noch die Folgen, die durch die Möglichkeiten automatischen Übersetzens für die Auslandsgermanistik zu erwarten sind. Dabei zeigt sich, dass Programme für das automatische Übersetzen künftig durchaus ernstzunehmende Auswirkungen auf die philologischen Wissenschaften haben können.
Der Übersetzungsprozess der Technischen Dokumentation wird zunehmend mit Maschineller Übersetzung (MÜ) unterstützt. Wir blicken zunächst auf die Ausgangstexte und erstellen automatisch prüfbare Regeln, mit denen diese Texte so editiert werden können, dass sie optimale Ergebnisse in der MÜ liefern. Diese Regeln basieren auf Forschungsergebnissen zur Übersetzbarkeit, auf Forschungsergebnissen zu Translation Mismatches in der MÜ und auf Experimenten.
In this paper, we report on a transformation scheme that turns a Categorial Grammar, more specifically, a Combinatory Categorial Grammar (CCG; see Baldridge, 2002) into a derivation- and meaning-preserving typed feature structure (TFS) grammar.
We describe the main idea which can be traced back at least to work by Karttunen (1986), Uszkoreit (1986), Bouma (1988), and Calder et al. (1988). We then show how a typed representation of complex categories can be extended by other constraints, such as modes, and indicate how the Lambda semantics of combinators is mapped into a TFS representation, using unification to perform perform alpha-conversion and beta-reduction (Barendregt, 1984). We also present first findings concerning runtime measurements, showing that the PET system, originally developed for the HPSG grammar framework, outperforms the OpenCCG parser by a factor of 8–10 in the time domain and a factor of 4–5 in the space domain.
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.
In this paper we show an approach to the customization of GermaNet to the German HPSG grammar lexicon developed in the Verbmobil project. GermaNet has a broad coverage of the German base vocabulary and fine-grained semantic classification; while the HPSG grammar lexicon is comparatively small und has a coarse-grained semantic classification. In our approach, we have developed a mapping algorithm to relate the synsets in GermaNet with the semantic sorts in HPSG. The evaluation result shows that this approach is useful for the lexical extension of our deep grammar development to cope with real-world text understanding.
We present a broad coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real world applications, such that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool. This grammar is being developed in a multilingual context, requiring MRS structures that are easily comparable across languages.