Linguistik
Some requirements for a VERBMOBIL system capable of processing Japanese dialogue input have been explored. Based on a pilot study in the VERBMOBIL domain, dialogues between two participants and a professional Japanese interpreter have been analyzed with respect to a very typical and frequent feature: zero pronouns. Zero pronouns in Japanese texts or dialogues, like overt pronouns in English texts or dialogues, are an important element of discourse coherence. As to translation, this difference in the use of pronouns is a case of translation mismatch: information not explicitly expressed in the source language is needed in the target language. (Verb argument positions, normally obligatory in English, are rather frequently omitted in Japanese. Furthermore, verbs in Japanese are not marked with respect to features necessary for pronoun selection in English.)
In this paper, we introduce an extension of the XMG system (eXtensible MetaGrammar) that allows for the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it possible to extend the system relatively easily to different target formalisms, thus opening the way towards multi-formalism.
In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations; thus lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches in particular are very successful (cf. e.g.). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percentage points better than the best constituent analysis including grammatical functions.
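The abstract above compares two different measures: an F-score over labeled constituents and an accuracy over dependency attachments. The following minimal Python sketch illustrates how such measures are commonly computed. It is not the evaluation code used in the paper; the function names, the bracket encoding as (label, start, end) tuples, and the use of sets rather than multisets of brackets are simplifying assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: a constituent is (label, start, end) over token offsets.
Bracket = Tuple[str, int, int]


def bracket_f_score(gold: Set[Bracket], predicted: Set[Bracket]) -> float:
    """Harmonic mean of labeled bracket precision and recall."""
    matched = len(gold & predicted)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def attachment_accuracy(gold_heads: List[int], predicted_heads: List[int]) -> float:
    """Fraction of tokens whose predicted head index equals the gold head index."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Toy three-token sentence (invented data, for illustration only).
gold_brackets = {("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)}
pred_brackets = {("S", 0, 3), ("NP", 0, 1), ("VP", 2, 3)}
gold_heads = [2, 0, 2]   # head index per token, 0 = root
pred_heads = [2, 0, 1]

f_score = bracket_f_score(gold_brackets, pred_brackets)
accuracy = attachment_accuracy(gold_heads, pred_heads)
```

The F-score is defined over sets of brackets whose size can differ between gold and predicted trees, whereas attachment accuracy is defined per token. This is one reason why the two kinds of figures are not directly comparable.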
This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.
Weak function word shift
(2004)
The fact that object shift only affects weak pronouns in mainland Scandinavian is seen as an instance of a more general observation that can be made in all Germanic languages: weak function words tend to avoid the edges of larger prosodic domains. This generalisation has been formulated within Optimality Theory in terms of alignment constraints on prosodic structure by Selkirk (1996) in explaining the distribution of prosodically strong and weak forms of English function words, especially modal verbs, prepositions and pronouns. But a purely phonological account fails to integrate the syntactic licensing conditions for object shift in an appropriate way. The standard semantico-syntactic accounts of object shift, on the other hand, fail to explain why it is only weak pronouns that undergo object shift. This paper develops an Optimality-theoretic model of the syntax-phonology interface which is based on the interaction of syntactic and prosodic factors. The account can successfully be applied to further related phenomena in English and German.
It is the aim of this paper to present and elaborate a new solution to the old syntactic problems connected with the Latin gerundive and gerund, two verbal categories which have been interpreted variously either as adjective (or participle) or noun (or infinitive). These questions have been much discussed for quite a number of years […] but for the most part from a philological or purely diachronic point of view. All these linguists try to explain the peculiarities of these categories and their syntax by showing that the gerund is historically prior to the gerundive. […] It is our thesis […] that in order to arrive at a unified account of gerundive and gerund we do not have to go back to prehistoric times. Even for the classical language gerund and gerundive represent the same category, in the sense that the gerund can be shown to be a special case of the gerundive. Additional evidence from a parallel construction in Hindi is adduced to make the Latin facts more plausible. It is only in the post-classical language that certain tendencies which had shown up already in Old Latin poetry become stronger and finally lead to a reanalysis of the gerundive and a split into two distinct syntactic constructions. The propositional meaning of the gerundive in its attributive use is explained with reference to a conflict between syntactic and cognitive principles. Special constructions which are the effects of such conflicts can be found in other parts of grammar. Languages differ with respect to the degree of syntacticization (or conventionalization) of these special constructions.
In this paper I present five alternations of the verb system of Modern Greek, which are recurrently mapped on the syntactic frame NPi__NP. The actual claim is that only the participation in alternations and/or the allocation to an alternation variant can reliably determine the relation between a verb derivative and its base. In the second part, the conceptual structures and semantic/situational fields of a large number of “-ízo” derivatives appearing inside alternation classes are presented. The restricted character of the conceptual and situational preferences inside alternation classes suggests the dominant character of the alternations component.
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail. We then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement, and argue that a deeper understanding of the phenomena involved makes it possible to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
We adopt the approach of Markert and Nissim (2005) of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.