Linguistik
Refine
Year of publication
- 2008 (16) (remove)
Document Type
- Preprint (9)
- Article (4)
- Conference Proceeding (3)
Language
- English (16) (remove)
Has Fulltext
- yes (16)
Is part of the Bibliography
- no (16)
Keywords
- Deutsch (5)
- Multicomponent Tree Adjoining Grammar (4)
- Range Concatenation Grammar (4)
- Syntaktische Analyse (4)
- Grammatik (3)
- German (2)
- Kiezdeutsch (2)
- Soziolinguistik (2)
- Syntax (2)
- (Morpho)syntactic focus strategy (1)
Institute
- Extern (16) (remove)
The present article illustrates that the specific articulatory and aerodynamic requirements for voiced but not voiceless alveolar or dental stops can cause tongue tip retraction and tongue mid lowering and thus retroflexion of front coronals. This retroflexion is shown to have occurred diachronically in the three typologically unrelated languages Dhao (Malayo-Polynesian), Thulung (Sino-Tibetan), and Afar (East-Cushitic). In addition to the diachronic cases, we provide synchronic data for retroflexion from an articulatory study with four speakers of German, a language usually described as having alveolar stops. With these combined data we supply evidence that voiced retroflex stops (as the only retroflex segments in a language) did not necessarily emerge from implosives, as argued by Haudricourt (1950), Greenberg (1970), Bhat (1973), and Ohala (1983). Instead, we propose that the voiced front coronal plosive /d/ is generally articulated in a way that favours retroflexion, that is, with a smaller and more retracted place of articulation and a lower tongue and jaw position than /t/.
This monograph describes the overall language situation in Luxembourg, a highly multilingual country in Western Europe, from a language policy and planning perspective. The first part discusses the social and historical contexts, including major societal changes and uncertainties about the future, which are bound up with Europeanisation and the accelerated processes of globalisation. It also deconstructs the notions of Luxembourgish as a 'minority language' and French as the 'language of prestige', and describes a two-pronged language ideology that allows for either monolingual identification with Luxembourgish or trilingual identification with the languages recognised by the language law of 1984 (Luxembourgish / German / French). The second part discusses the trilingual school-system, a system in which large numbers of romanophone students are forced to go through a German-language literacy programme. The third part provides an overview of language spread in the areas of the media and literary writing. The fourth part examines language purism and tensions concerning the standardisation of Luxembourgish, as well as the debates about language requirements for citizenship. The discussion shows how language policy scholarship needs to be approached from a multidimensional perspective, that is, by taking into account dynamics on the global, regional and local levels in addition to those at the state level.
Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (amongst others) redundancy and consistency issues. Furthermore some languages can reveal themselves hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework allowing to describe tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. This framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to respectively check the consistency and the correction of the grammar. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.
In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.
This paper investigates the relation between TT-MCTAG, a formalism used in computational linguistics, and RCG. RCGs are known to describe exactly the class PTIME; simple RCG even have been shown to be equivalent to linear context-free rewriting systems, i.e., to be mildly context-sensitive. TT-MCTAG has been proposed to model free word order languages. In general, it is NP-complete. In this paper, we will put an additional limitation on the derivations licensed in TT-MCTAG. We show that TT-MCTAG with this additional limitation can be transformed into equivalent simple RCGs. This result is interesting for theoretical reasons (since it shows that TT-MCTAG in this limited form is mildly context-sensitive) and, furthermore, even for practical reasons: We use the proposed transformation from TT-MCTAG to RCG in an actual parser that we have implemented.
The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.
How to compare treebanks
(2008)
Recent years have seen an increasing interest in developing standards for linguistic annotation, with a focus on the interoperability of the resources. This effort, however, requires a profound knowledge of the advantages and disadvantages of linguistic annotation schemes in order to avoid importing the flaws and weaknesses of existing encoding schemes into the new standards. This paper addresses the question how to compare syntactically annotated corpora and gain insights into the usefulness of specific design decisions. We present an exhaustive evaluation of two German treebanks with crucially different encoding schemes. We evaluate three different parsers trained on the two treebanks and compare results using EVALB, the Leaf-Ancestor metric, and a dependency-based evaluation. Furthermore, we present TePaCoC, a new testsuite for the evaluation of parsers on complex German grammatical constructions. The testsuite provides a well thought-out error classification, which enables us to compare parser output for parsers trained on treebanks with different encoding schemes and provides interesting insights into the impact of treebank annotation schemes on specific constructions like PP attachment or non-constituent coordination.
The problem of vocalization, or diacritization, is essential to many tasks in Arabic NLP. Arabic is generally written without the short vowels, which leads to one written form having several pronunciations with each pronunciation carrying its own meaning(s). In the experiments reported here, we define vocalization as a classification problem in which we decide for each character in the unvocalized word whether it is followed by a short vowel. We investigate the importance of different types of context. Our results show that the combination of using memory-based learning with only a word internal context leads to a word error rate of 6.64%. If a lexical context is added, the results deteriorate slowly.
TT-MCTAG lets one abstract away from the relative order of co-complements in the final derived tree, which is more appropriate than classic TAG when dealing with flexible word order in German. In this paper, we present the analyses for sentential complements, i.e., wh-extraction, thatcomplementation and bridging, and we work out the crucial differences between these and respective accounts in XTAG (for English) and V-TAG (for German).
German linking elements are sometimes classified as inflectional affixes, sometimes as derivational affixes, and in any case as morphological units with at least seven realisations (e.g. -s-, -es-, -(e)n-, -e-). This article seeks to show that linking elements are hybrid elements situated between morphology and phonology. On the one hand, they have a clear morphological status since they occur only within compounds (and before a very small set of suffixes) and support the listener in decoding them. On the other hand, they also have to be analysed on the phonological level, as will be shown in this article. Thus, they are marginal morphological units on the pathway to phonology (including prosodics). Although some alloforms can sometimes be considered former inflectional endings and in some cases even continue to demonstrate some inflectional behaviour (such as relatedness to gender and inflection class), they are on their way to becoming markers of ill-formed phonological words. In fact, linking elements, above all the linking -s-, which is extremely productive, help the listener decode compounds containing a bad phonological word as their first constituent, such as Geburt+s+tag ‘birthday’ or Religion+s+unterricht ‘religious education’. By marking the end of a first constituent that differs from an unmarked monopedal phonological word, the linking element aids the listener in correctly decoding and analysing the compound. German compounds are known for their length and complexity, both of which have increased over time—along with the occurrence of linking elements, especially -s-. Thus, a profound instance of language change can be observed in contemporary German, one indicating its typological shift from syllable language to word language.