Linguistics
In this paper, we introduce an extension of the XMG system (eXtensible MetaGrammar) that allows for the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it relatively easy to extend the system to different target formalisms, thus opening the way towards multi-formalism.
In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, and Japanese. However, it has been shown that parsing results on these treebanks depend on the types of treebank annotation used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, so lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches in particular are very successful (cf. e.g.). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in both a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percentage points better than the best constituent analysis including grammatical functions.
Weak function word shift
(2004)
The fact that object shift only affects weak pronouns in mainland Scandinavian is seen as an instance of a more general observation that can be made in all Germanic languages: weak function words tend to avoid the edges of larger prosodic domains. This generalisation has been formulated within Optimality Theory in terms of alignment constraints on prosodic structure by Selkirk (1996) in explaining the distribution of prosodically strong and weak forms of English function words, especially modal verbs, prepositions and pronouns. But a purely phonological account fails to integrate the syntactic licensing conditions for object shift in an appropriate way. The standard semantico-syntactic accounts of object shift, on the other hand, fail to explain why it is only weak pronouns that undergo object shift. This paper develops an Optimality-theoretic model of the syntax-phonology interface which is based on the interaction of syntactic and prosodic factors. The account can successfully be applied to further related phenomena in English and German.
In German-speaking Switzerland, spoken dialects stand opposite the written standard language. Dialect is spoken except in formal situations, and until recently dialect was rarely written; the High German written language was used instead. On the one hand, chat communication, being non-delayed and quasi-direct, shows essential features of orality, which together with the informality of chat promotes the use of dialect. On the other hand, the medium is still writing, which is the domain of the standard language. Dialect and standard language thus compete directly in chat rooms. This paper analyses, quantitatively and qualitatively, the coexistence and interplay of the two varieties in Swiss chat rooms and examines the occurrence and conditions of code alternation and code-switches.
Language choice and language perception in German are inevitably shaped by knowledge of a standard language. For most speakers, this knowledge rests on the experience that at school some linguistic forms are judged correct and others wrong, and on the fact that the rules of the standard are codified in dictionaries and grammars. Knowledge and recognition of this standard hold regardless of the fact that none of these codifications is undisputed, that many speakers do not know the rules precisely, and that people recognised as models (newsreaders, journalists of certain periodicals, teachers, literary authors, among others) by no means follow uniform rules. The standard is firmly associated with the experience of a legitimate regularity, that is, with order. The use of nonstandard is perceived in relation to this order and as distinct from it. This relational view is both subjective and intersubjective.
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serve to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail. We then propose a generalisation of Poesio, Reyle and Stevenson's Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement, and argue that a deeper understanding of the phenomena involved allows problematic cases to be tackled in a more principled fashion than would be possible using only pre-theoretic intuitions.
We adopt Markert and Nissim's (2005) approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to obtain information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
We present several parsing algorithms for Range Concatenation Grammars (RCG), including a new Earley-style algorithm, within the deductive parsing paradigm. Our work is motivated by the recent interest in this type of grammar and fills a gap in the existing literature.