OPUS 4 | Linguistik

When verbs share their power: the case of the German light verb construction (2008)

Wittenberg, Eva ; Piñango, Maria Mercedes

"Kiezdeutsch goes School" : a multiethnic variety of German from an educational perspective (2008)

Paul, Kerstin ; Freywald, Ulrike ; Wittenberg, Eva

This article presents linguistic features of and educational approaches to a new variety of German that has emerged in multi-ethnic urban areas in Germany: Kiezdeutsch (‘Hood German’). From a linguistic point of view, Kiezdeutsch is of particular interest because it is a multi-ethnolect that combines features of a youth language with those of a contact language. We will present examples that illustrate the grammatical productivity and innovative potential of this variety. From an educational perspective, Kiezdeutsch also has high potential in many respects: school projects can help enrich intercultural communication and weaken derogatory attitudes. In grammar lessons, Kiezdeutsch can be a means to enhance linguistic competence by having the adolescents analyse their own language. Keywords: German, Kiezdeutsch, multi-ethnolect, migrants’ language, language change, educational proposals

Müssen, dürfen, können, mögen : wie kam der Umlaut in die Präteritopräsentia? – Neues zu einem alten Problem der Irregularität (2008)

Nübling, Damaris

The German preterite-presents, in which old perfect forms supply today's present tense, must be regarded as highly irregular for several reasons. In addition, four of these verbs (of seven today) show an umlaut that has not yet been explained: müssen, dürfen, können and mögen. Previous attempts at an explanation do not do justice to this problem: they do try to motivate the umlaut in the present tense, but they cannot explain why it occurs exclusively in the present plural. This article argues that it is a (verbal) plural umlaut, which is especially common in the nominal domain and undergoes a massive expansion (morphologization) there at the same time. It is therefore a so-called transcategorial marker.

On the way from morphology to phonology : German linking elements and the role of the phonological word (2008)

Nübling, Damaris ; Szczepaniak, Renata

German linking elements are sometimes classified as inflectional affixes, sometimes as derivational affixes, and in any case as morphological units with at least seven realisations (e.g. -s-, -es-, -(e)n-, -e-). This article seeks to show that linking elements are hybrid elements situated between morphology and phonology. On the one hand, they have a clear morphological status since they occur only within compounds (and before a very small set of suffixes) and support the listener in decoding them. On the other hand, they also have to be analysed on the phonological level, as will be shown in this article. Thus, they are marginal morphological units on the pathway to phonology (including prosodics). Although some alloforms can sometimes be considered former inflectional endings and in some cases even continue to demonstrate some inflectional behaviour (such as relatedness to gender and inflection class), they are on their way to becoming markers of ill-formed phonological words. In fact, linking elements, above all the linking -s-, which is extremely productive, help the listener decode compounds containing a bad phonological word as their first constituent, such as Geburt+s+tag ‘birthday’ or Religion+s+unterricht ‘religious education’. By marking the end of a first constituent that differs from an unmarked monopedal phonological word, the linking element aids the listener in correctly decoding and analysing the compound. German compounds are known for their length and complexity, both of which have increased over time—along with the occurrence of linking elements, especially -s-. Thus, a profound instance of language change can be observed in contemporary German, one indicating its typological shift from syllable language to word language.

Was tun mit Flexionsklassen? : Deklinationsklassen und ihr Wandel im Deutschen und seinen Dialekten (2008)

Nübling, Damaris

"Warum Flexionsklassen?" lautet ein synchron ausgerichteter Aufsatz von BERND WIESE (2000), an den dieser Beitrag aus diachroner und dialektaler Perspektive anschließt. Das hier zur Diskussion stehende Phänomen, nämlich die notorische Persistenz von Flexionsklasse (im Folgenden "FK") über Jahrhunderte, ja sogar Jahrtausende hinweg, dürfte noch eines der größten linguistischen Rätsel darstellen, die ihrer Lösung harren. HASPELMATH (2002, 115) eröffnet in seinem Band "Understanding Morphology" das Kapitel über "Inflectional paradigms" mit folgenden Worten: "Perhaps the most important challenge for an insightful description of inflection is the widespread existence of allomorphy in many languages."

Tonal focus reflections in Buli and some Gur relatives (2008)

Schwarz, Anne

Buli is an Oti-Volta tone language spoken in Northern Ghana. This paper outlines the basic features of its tonal system and explores whether and how pitch or phonemic tone is used to indicate the pragmatic category of focus. It examines cases with focus-related surface tone changes as well as cases where pitch could help to disambiguate between broad and narrow foci. It is argued that focus is not consistently encoded by pitch or tone. Parallel findings for the closely related languages Kɔnni and Dagbani suggest that the apparent lack of significant prosodic focus signals in Buli might pertain to a larger group of tonal languages of the Gur family.

How do voiced retroflex stops evolve? Evidence from typology and an articulatory study (2008)

Hamann, Silke ; Fuchs, Susanne

The present article illustrates that the specific articulatory and aerodynamic requirements for voiced but not voiceless alveolar or dental stops can cause tongue tip retraction and tongue mid lowering and thus retroflexion of front coronals. This retroflexion is shown to have occurred diachronically in the three typologically unrelated languages Dhao (Malayo-Polynesian), Thulung (Sino-Tibetan), and Afar (East-Cushitic). In addition to the diachronic cases, we provide synchronic data for retroflexion from an articulatory study with four speakers of German, a language usually described as having alveolar stops. With these combined data we supply evidence that voiced retroflex stops (as the only retroflex segments in a language) did not necessarily emerge from implosives, as argued by Haudricourt (1950), Greenberg (1970), Bhat (1973), and Ohala (1983). Instead, we propose that the voiced front coronal plosive /d/ is generally articulated in a way that favours retroflexion, that is, with a smaller and more retracted place of articulation and a lower tongue and jaw position than /t/.

The PaGe 2008 shared task on parsing German (2008)

Kübler, Sandra

The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.

POS tagging for German : how important is the right context? (2008)

Ivanova, Steliana ; Kübler, Sandra

Part-of-Speech tagging is generally performed by Markov models, based on bigram or trigram models. While Markov models have a strong concentration on the left context of a word, many languages require the inclusion of right context for correct disambiguation. We show for German that the best results are reached by a combination of left and right context. If only left context is available, then changing the direction of analysis and going from right to left improves the results. In a version of MBT (Daelemans et al., 1996) with default parameter settings, the inclusion of the right context improved POS tagging accuracy from 94.00% to 96.08%, thus corroborating our hypothesis. The version with optimized parameters reaches 96.73%.
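The difference between left-only and left-plus-right context can be illustrated with a small feature-window sketch. This is only an illustration under stated assumptions: the function name `context_window`, the padding symbol, and the use of word forms as features are hypothetical and do not reproduce the feature format of MBT or the settings used in the paper.

```python
def context_window(tokens, i, left=2, right=2, pad="_"):
    """Return the focus token with `left` tokens before it and `right` tokens after it.

    Hypothetical helper: positions outside the sentence are filled with `pad`.
    Setting right=0 gives a left-context-only window.
    """
    before = [tokens[j] if j >= 0 else pad for j in range(i - left, i)]
    after = [tokens[j] if j < len(tokens) else pad
             for j in range(i + 1, i + 1 + right)]
    return before + [tokens[i]] + after


sentence = ["die", "Frau", "sieht", "den", "Mann"]

# Left context only: the tagger cannot see what follows "die".
left_only = context_window(sentence, 0, left=2, right=0)

# Left and right context: the following noun is visible as well.
both_sides = context_window(sentence, 0, left=2, right=1)
```

A classifier trained on `both_sides`-style windows has access to the following word, which is the kind of information the abstract reports as helpful for German.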

Memory-based vocalization of Arabic (2008)

Kübler, Sandra ; Mohamed, Emad

The problem of vocalization, or diacritization, is essential to many tasks in Arabic NLP. Arabic is generally written without the short vowels, which leads to one written form having several pronunciations with each pronunciation carrying its own meaning(s). In the experiments reported here, we define vocalization as a classification problem in which we decide for each character in the unvocalized word whether it is followed by a short vowel. We investigate the importance of different types of context. Our results show that the combination of using memory-based learning with only a word internal context leads to a word error rate of 6.64%. If a lexical context is added, the results deteriorate slowly.
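Framing vocalization as per-character classification can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: a toy Latin transliteration in which `a`, `i`, `u` stand for the short vowels, a label that is either the following short vowel or `-` for none, and a hypothetical function name `instances`. It only shows how training instances with a word-internal context window could be built, not the authors' actual feature set.

```python
SHORT_VOWELS = set("aiu")  # assumption: toy Latin transliteration of the short vowels


def instances(vocalized, window=2, pad="_"):
    """Turn one vocalized word into (features, label) pairs, one per written character.

    Features: the character with `window` neighbours on each side, all taken
    from the unvocalized word (word-internal context only).
    Label: the short vowel following the character, or "-" if there is none.
    """
    chars, labels = [], []
    for ch in vocalized:
        if ch in SHORT_VOWELS and chars and labels[-1] == "-":
            labels[-1] = ch          # attach the vowel to the preceding character
        else:
            chars.append(ch)         # character that is written in unvocalized text
            labels.append("-")
    result = []
    for i, ch in enumerate(chars):
        left = [chars[j] if j >= 0 else pad for j in range(i - window, i)]
        right = [chars[j] if j < len(chars) else pad
                 for j in range(i + 1, i + 1 + window)]
        result.append((tuple(left + [ch] + right), labels[i]))
    return result


# Two vocalizations of the same written form "ktb" yield different labels.
wrote = instances("kataba", window=1)
books = instances("kutub", window=1)
```

Pairs like these could be fed to any classifier; the paper itself uses memory-based learning.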

How to compare treebanks (2008)

Kübler, Sandra ; Maier, Wolfgang ; Rehbein, Ines ; Versley, Yannick

Recent years have seen an increasing interest in developing standards for linguistic annotation, with a focus on the interoperability of the resources. This effort, however, requires a profound knowledge of the advantages and disadvantages of linguistic annotation schemes in order to avoid importing the flaws and weaknesses of existing encoding schemes into the new standards. This paper addresses the question of how to compare syntactically annotated corpora and gain insights into the usefulness of specific design decisions. We present an exhaustive evaluation of two German treebanks with crucially different encoding schemes. We evaluate three different parsers trained on the two treebanks and compare results using EVALB, the Leaf-Ancestor metric, and a dependency-based evaluation. Furthermore, we present TePaCoC, a new testsuite for the evaluation of parsers on complex German grammatical constructions. The testsuite provides a well thought-out error classification, which enables us to compare parser output for parsers trained on treebanks with different encoding schemes and provides interesting insights into the impact of treebank annotation schemes on specific constructions like PP attachment or non-constituent coordination.
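The kind of constituent comparison that EVALB performs can be sketched with a simplified labeled-bracket score. This is not EVALB itself, which has its own parameter files and normalization rules; the tree encoding as nested tuples and the function names `brackets` and `bracket_prf` are assumptions made for illustration. The example pair differs only in PP attachment, one of the constructions mentioned above.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect (label, start, end) spans of all non-preterminal nodes.

    A tree is a nested tuple: (label, child, ...) or (POS, word) for preterminals.
    Returns the list of spans and the position after the last covered word.
    """
    children = tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return [], start + 1                 # preterminal covers one word, no bracket
    spans, pos = [], start
    for child in children:
        sub, pos = brackets(child, pos)
        spans.extend(sub)
    spans.append((tree[0], start, pos))
    return spans, pos


def bracket_prf(gold, pred):
    """Labeled bracket precision, recall and F1 between two trees."""
    g, p = Counter(brackets(gold)[0]), Counter(brackets(pred)[0])
    match = sum((g & p).values())
    prec = match / sum(p.values()) if p else 0.0
    rec = match / sum(g.values()) if g else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1


pp = ("PP", ("P", "mit"), ("NP", ("D", "dem"), ("N", "Fernglas")))
subj = ("NP", ("PRO", "er"))
# Gold: PP attached to the VP; prediction: PP attached inside the object NP.
gold = ("S", subj, ("VP", ("V", "sieht"), ("NP", ("D", "den"), ("N", "Mann")), pp))
pred = ("S", subj, ("VP", ("V", "sieht"), ("NP", ("D", "den"), ("N", "Mann"), pp)))

precision, recall, f1 = bracket_prf(gold, pred)
```

In this toy pair, five of six brackets match on each side, so a single attachment error costs one bracket in both precision and recall. How strongly such an error is penalized depends on the annotation scheme, for example on how flat the trees are.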

TuLiPA : towards a multi-formalism parsing environment for grammar engineering (2008)

Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang ; Parmentier, Yannick ; Dellert, Johannes ; Evang, Kilian

In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.

On the relation between multicomponent tree adjoining grammars with tree tuples (TT-MCTAG) and range concatenation grammars (RCG) (2008)

Kallmeyer, Laura ; Parmentier, Yannick

This paper investigates the relation between TT-MCTAG, a formalism used in computational linguistics, and RCG. RCGs are known to describe exactly the class PTIME; simple RCGs have even been shown to be equivalent to linear context-free rewriting systems, i.e., to be mildly context-sensitive. TT-MCTAG has been proposed to model free word order languages. In general, it is NP-complete. In this paper, we will put an additional limitation on the derivations licensed in TT-MCTAG. We show that TT-MCTAG with this additional limitation can be transformed into equivalent simple RCGs. This result is interesting for theoretical reasons (since it shows that TT-MCTAG in this limited form is mildly context-sensitive) and also for practical reasons: we use the proposed transformation from TT-MCTAG to RCG in an actual parser that we have implemented.

Factorizing complementation in a TT-MCTAG for German (2008)

Lichte, Timm ; Kallmeyer, Laura

TT-MCTAG lets one abstract away from the relative order of co-complements in the final derived tree, which is more appropriate than classic TAG when dealing with flexible word order in German. In this paper, we present the analyses for sentential complements, i.e., wh-extraction, that-complementation and bridging, and we work out the crucial differences between these and respective accounts in XTAG (for English) and V-TAG (for German).

Developing a TT-MCTAG for German with an RCG-based parser (2008)

Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang ; Parmentier, Yannick ; Dellert, Johannes

Developing linguistic resources, in particular grammars, is known to be a complex task in itself because of, among other things, redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. This framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to check the consistency and the correctness of the grammar, respectively. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.

Convertir des grammaires d’arbres adjoints à composantes multiples avec tuples d’arbres (TT-MCTAG) en grammaires à concaténation d’intervalles (RCG) (2008)

Kallmeyer, Laura ; Parmentier, Yannick

This article studies the relation between multicomponent tree adjoining grammars with tree tuples (TT-MCTAG), a formalism used in computational linguistics, and range concatenation grammars (RCG). RCGs are known to describe exactly the class PTIME; moreover, "simple" RCGs have been shown to be equivalent to linear context-free rewriting systems (LCFRS), in other words, they are mildly context-sensitive. TT-MCTAG has been proposed to model free word order languages. In general, these languages are NP-complete. In this article, we define an additional constraint on the derivations licensed by the TT-MCTAG formalism. We then show how this restricted form of TT-MCTAG can be converted into an equivalent simple RCG. The result is interesting for theoretical reasons (since it shows that the restricted form of TT-MCTAG is mildly context-sensitive), but also for practical reasons (the transformation proposed here was used to implement a parser for TT-MCTAG).

TuLiPA : a syntax-semantics parsing environment for mildly context-sensitive formalisms (2008)

Parmentier, Yannick ; Kallmeyer, Laura ; Maier, Wolfgang ; Lichte, Timm ; Dellert, Johannes

In this paper we present a parsing architecture that allows processing of different mildly context-sensitive formalisms, in particular Tree-Adjoining Grammar (TAG), Multi-Component Tree-Adjoining Grammar with Tree Tuples (TT-MCTAG) and simple Range Concatenation Grammar (RCG). Furthermore, for tree-based grammars, the parser computes not only syntactic analyses but also the corresponding semantic representations.

77 results