Linguistik
Refine
Year of publication
Document Type
- Preprint (122) (remove)
Has Fulltext
- yes (122)
Is part of the Bibliography
- no (122) (remove)
Keywords
- Deutsch (19)
- Multicomponent Tree Adjoining Grammar (9)
- Schweizerdeutsch (9)
- Syntax (9)
- Syntaktische Analyse (8)
- Semantik (7)
- Lexicalized Tree Adjoining Grammar (6)
- Dialektologie (5)
- Optimalitätstheorie (5)
- Range Concatenation Grammar (5)
Institute
- Extern (69)
- Sprachwissenschaften (1)
This paper is one argument for a theory of grammatical relations in Chinese in which there are no grammatical relations beyond semantic roles, and no lexical relation-changing rules. As the passive rule is one of the most common relation changing rules cross-linguistically, in this paper I will address the question of whether or not Mandarin Chinese has lexical passives, that is, passives defined as in Relational Grammar (see for example Perlmutter and Postal 1977) and the early Lexical Functional Grammar (LFG) literature (e.g. Bresnan 1982), where a 2-arc (object) is promoted to a 1-arc (subject).
In der folgenden Darstellung geht es einerseits darum, an Beispielen aufzuzeigen, inwiefern die schweizerdeutschen Mundarten und die deutsche Standardsprache in Lautung, Formenbildung, Satzbau und Wortschatz auseinandergehen können, andererseits aber immer auch um das Aufweisen von Gemeinsamkeiten. Oft werden nämlich bestimmte Erscheinungen des dialektalen Sprachbaus vorschnell als Eigenarten der Mundart verstanden, obwohl dieselben Erscheinungen auch im gesprochenen Hochdeutschen anzutreffen sind. Somit liegen also häufig nicht Unterschiede zwischen Mundart und Standardsprache vor, sondern Unterschiede zwischen gesprochener Sprache und geschriebener Sprache. [vollständige Überarbeitung für eine zweite Auflage]
Thirty-one years ago Tsu-lin Mei (1961) argued against the traditional doctrine that saw the subject-predicate distinction in grammar as parallel to the particular- universal distinction in logic, as he said it was a reflex of an Indo-European bias, and could not be valid, as ‘Chinese ... does not admit a distinction into subject and predicate’ (p. 153). This has not stopped linguists working on Chinese from attempting to define ‘subject’ (and ‘object’) in Chinese. Though a number of linguists have lamented the difficulties in trying to define these concepts for Chinese (see below), most work done on Chinese still assumes that Chinese must have the same grammatical features as Indo-European, such as having a subject and a direct object, though no attempt is made to justify that view. This paper challenges that view and argues that there has been no grammaticalization of syntactic functions in Chinese. The correct assignment of semantic roles to the constituents of a discourse is done by the listener on the basis of the discourse structure and pragmatics (information flow, inference, relevance, and real world knowledge) (cf. Li & Thompson 1978, 1979; LaPolla 1990).
This book is a full reference grammar of Qiang, one of the minority languages of southwest China, spoken by about 70,000 Qiang and Tibetan people in Aba Tibetan and Qiang Autonomous Prefecture in northern Sichuan Province. It belongs to the Qiangic branch of Tibeto-Burman (one of the two major branches of Sino-Tibetan). The dialect presented in the book is the Northern Qiang variety spoken in Ronghong Village, Yadu Township, Chibusu District, Mao County. This book, the first book-length description of the Qiang language in English, is the result of many years of work on the language.
A commonly held view in the literature on Scrambling and Clitic Doubling is that both constructions are sensitive to Specificity. For this reason Sportiche (1992) proposes to unify the two, an approach which has become quite standard in the relevant literature ever since. However, the claim that clitic doubling is the counterpart of Germanic scrambling has never been substantiated. In this paper we present extensive evidence from Greek that Clitic Doubling has common formal properties with Germanic Scrambling/Object Shift. Our evidence consists mainly of binding facts observed when doubling takes place, which seem, at first sight, to be completely unexpected. On closer inspection, however, it turns out that these facts are strongly reminiscent of the effects showing up in Germanic scrambling. We propose that these properties can be derived under a theory of clitic constructions along the lines of Sportiche (1992) implemented into the framework of Chomsky (1995). Finally we suggest the that the crosslinguistic distribution of Scrambling as opposed to Clitic Doubling should be linked to a parameter relating to properties of Agr: Move/Merge XP vs. Move/Merge X° to Agr. We show that this parameter unifies the behaviour of subjects and objects within a language and across languages. The paper is organised as follows. In section 2 we present evidence from binding, interpretational and prosodic effects that doubling and scrambling display very similar properties. In section 3 we present Sportiches account and point out some problems for it. In section 4 we present our proposal.
A lot of interest has recently been paid to constraint-based definitions and extensions of Tree Adjoining Grammars (TAG). Examples are the so-called quasi-trees, D-Tree Grammars and Tree Description Grammars. The latter are grammars consisting of a set of formulars denoting trees. TDGs are derivation based where in each derivation step a conjunction is built of the old formular, a formular of the grammar and additional equivalences between node names of the two formulars. This formalism is more powerfull than TAGs. TDGs offer the advantages of MC-TAG and D-Tree Grammars for natural languages and they allow underspecification. However the problem is that TDGs might be unnecessarily powerfull for natural languages. To solve this problem, in this paper, I will propose a local TDGs, a restricted version of TDGs. Local TDGs still have the advantages of TDGs but they are semilinear and therefore more appropriate for natural languages. First, the notion of the semilinearity is defined. Then local TDGs are introduced, and, finally, semilinearity of local Tree Description Languages is proven.
In syntax, the trend nowadays is towards lexicalized grammar formalisms. It is now widely accepted that dividing words into wordclasses may serve as a laborsaving mechanism - but at the same time, it discards all detailed information on the idiosyncratic behavior of words. And that is exactly the type of information that may be necessary in order to parse a sentence. For learning approaches, however, lexicalized grammars represent a challenge for the very reason that they include so much detailed and specific information, which is difficult to learn. This paper will present an algorithm for learning a link grammar of German. The problem of data sparseness is tackled by using all the available information from partial parses as well as from an existing grammar fragment and a tagger. This is a report about work in progress so there are no representative results available yet.
A hierarchy of local TDGs
(1998)
Many recent variants of Tree Adoining Grammars (TAG) allow an underspecifiaction of the parent relation between nodes in a tree, i.e. they do not deal with fully specified trees as it is the case with TAGs.Such TAG variants are for example Description Tree Grammars (DTG), Unordered Vector Grammars with Dominance Links (UVG-DL), a definition of TAGs via so-called quasi trees and Tree Description Grammars (TDG. The last TAg variant, local TDG, is an extension of TAG generating Tree Descriptions. Local TDGs even allow an underspecification of the dominance relation between node names and thereby provide the possibility to generate underspecified representations for structural ambiguities such as quantifier scope ambiguities. This abstract deals with formal properties of local TDGs. A hierarchiy of local TDGs is established together with a pumping lemma for local TDGs of a certain rank.
This paper proposes a compositional semantics for lexicalized tree adjoining grammars (LTAG). Tree-local multicompnent derivations allow seperation of semantiv contribution of a lexical item into one component contributing to the predicate argument structure and second a component contributing to scope semantics. Based on this idea a syntx-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scpoe ambiguities and related phenomena such as adjunct scope and island constraints.
In this paper we investigate Greek, an optional clitic doubling language not subject to Kaynes generalization (Jaeggli 1982), and we argue that in this language, doubled DPs are in A-positions. We propose that Greek clitics are formal features that move, permitting DPs in argument positions. This leads to a typology according to which there are two types of clitic/agreement languages -configurational and nonconfigurational ones-, depending upon whether clitics are instantiations of formal features or not.
This paper proposes a corpus encoding standard that meets the needs of linguistic research using a variety of linguistic data structures. The standard was developed in SFB 441, a research project at the University of Tuebingen. The principal concern of SFB 441 are the empirical data structures which feed into linguistic theory building. SFB 441 consists of several projects, most of which are building corpora to empirically investigate various linguistic phenomena in various languages (e.g. modal verbs in German, forms of address and politeness in Russian). These corpora will form the components of the "Tuebingen collection of reusable, empirical, linguistic data structures (TUSNELDA)". The TUSNELDA annotation standard aims at providing a uniform encoding scheme for all subcorpora and texts of TUSNELDA such that they can be processed with uniform standardized tools. To guarantee maximal reusability we use XML for encoding. Previous SGML standards for text encoding were provided by the Text Encoding Initiative (TEI) and the Expert Advisory Group on Language Engineering Standards (Corpus Encoding Standard, CES). The TUSNELDA standard is based on TEI and XCES (XML version of CES) but takes into account the specific needs of the SFB projects, i.e. the peculiarities of the examined languages and linguistic phenomena.
Existing analyses of German scrambling phenomena within TAG-related formalisms all use non-local variants of TAG. However, there are good reasons to prefer local grammars, in particular with respect to the use of the derivation structure for semantics. Therefore this paper proposes to use local TDGs, a TAG-variant generating tree descriptions that shows a local derivation structure. However the construction of minimal trees for the derived tree descriptions is not subject to any locality constraint. This provides just the amount of non-locality needed for an adequate analysis of scrambling. To illustrate this a local TDG for some German scrambling data is presented.
Das Chunkparsing bietet einen besonders vielversprechenden Ansatz zum robusten, partiellen Parsing mit dem Ziel einer breiten Datenabdeckung. Ziel beim Chunkparsing ist eine partielle, nicht-rekursive syntaktische Struktur. Dieser extrem effiziente Parsing-Ansatz läßt sich als Kaskade endlicher Transducer realisieren. In diesem Beitrag wird TüSBL vorgestellt, ein System, bei dem die Eingabe aus spontaner, gesprochener Spache besteht, die dem Parser in Form eines Worthypothesengraphen aus einem Spracherkenner zur Verfügung gestellt wird. Chunkparsing ist für eine solche Anwendung besonders geeignet, da es fragmentarische oder nicht wohlgeformte Äußerungen robust behandeln kann. Des weiteren wird eine Baumkonstruktionskomponente vorgestellt, die die partiellen Chunkstrukturen zu vollständigen Bäumen mit grammatischen Funktionen erweitert. Das System wird anhand manuell überprüfter Systemeingaben evaluiert, da sich die üblichen Evaluationsparameter hierfür nicht eignen.
In this paper, we investigate the role of sub-optimality in training data for part-of-speech tagging. In particular, we examine to what extent the size of the training corpus and certain types of errors in it affect the performance of the tagger. We distinguish four types of errors: If a word is assigned a wrong tag, this tag can belong to the ambiguity class of the word (i.e. to the set of possible tags for that word) or not; furthermore, the major syntactic category (e.g. "N" or "V") can be correctly assigned (e.g. if a finite verb is classified as an infinitive) or not (e.g. if a verb is classified as a noun). We empirically explore the decrease of performance that each of these error types causes for different sizes of the training set. Our results show that those types of errors that are easier to eliminate have a particularly negative effect on the performance. Thus, it is worthwhile concentrating on the elimination of these types of errors, especially if the training corpus is large.
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning function-argument structure. The present paper offers a similaritybased algorithm for assigning functional labels such as subject, object, head, complement, etc. to complete syntactic structures on the basis of prechunked input. The evaluation of the algorithm has concentrated on measuring the quality of functional labels. It was performed on a German and an English treebank using two different annotation schemes at the level of function argument structure. The results of 89.73% correct functional labels for German and 90.40%for English validate the general approach.
Der TUSNELDA-Standard : ein Korpusannotierungsstandard zur Unterstützung linguistischer Forschung
(2001)
Die Verwendung von Standards für die Annotierung größerer Sammlungen elektronischer Texte (Korpora) ist eine Voraussetzung für eine mögliche Wiederverwendung dieser Korpora. Dieser Artikel stellt einen Korpusannotierungsstandard vor, der die Anforderungen der Untersuchung unterschiedlichster linguistischer Phänomene berücksichtigt. Der Standard wurde im SFB 441 an der Universität Tübingen entwickelt. Er geht von bestehenden Standards, insbesondere CES und TEI, aus, die sich als teilweise zu ausführlich und zu wenig restriktiv,teilweise auch als nicht ausdrucksstark genug erweisen, um den Bedürfnissen korpusbasierter linguistischer Forschung gerecht zu werden.
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. The TüSBL parser extends current chunk parsing techniques by a tree-construction component that extends partial chunk parses to complete tree structures including recursive phrase structure as well as function-argument structure. TüSBLs tree construction algorithm relies on techniques from memory-based learning that allow similarity-based classification of a given input structure relative to a pre-stored set of tree instances from a fully annotated treebank. A quantitative evaluation of TüSBL has been conducted using a semi-automatically constructed treebank of German that consists of appr. 67,000 fully annotated sentences. The basic PARSEVAL measures were used although they were developed for parsers that have as their main goal a complete analysis that spans the entire input.This runs counter to the basic philosophy underlying TüSBL, which has as its main goal robustness of partially analyzed structures.
Maschinelles Lernen wird häufig zur effzienten Annotation großer Datenmengen eingesetzt. Die Forschung zu maschinellen Lernverfahren beschränkt sich i.a. darauf unterschiedliche Lernverfahren zu vergelichen oder die optimale größe der Trainingsdaten zu bestimmen. Bisher wurde jedoch nicht untersucht, in wie weit sich linguistisches Wissen bei der Aufgabendefinition positiv auswirken kann. Dies soll hier anhand des Lernens von Base-Nominalphrasen mit drei unterschiedlichen Definitionen untersucht werden. Die Definitionen unterscheiden sich im Grad der linguistisch motivierten Erweiterungen, die zu einer eher praktisch motivierten ersten Definition hinzu kamen. Die Untersuchungen ergaben, dass sich die Anzahl der falsch klasssifizierten Wörter um ein Drittel reduzieren lässt.
In der Abteilung Grammatik des Instituts für Deutsche Sprache, Mannheim, wird derzeit ein neues Projekt entwickelt, und zwar das einer Grammatik des Deutschen im europäischen Vergleich (GDE). Dieses Projekt fügt sich ein in die kontrastive Tradition des IDS, ist jedoch andererseits auch in vieler Hinsicht innovativ. Bevor ich das Projekt im Einzelnen vorstelle, versuche ich den Bogen zurück zu den kontrastiven Grammatiken zu schlagen. Gerade die Leserschaft polnischer Germanisten braucht an die Tradition kontrastiver Grammatikschreibung sicher nicht eigens erinnert zu werden. Denn diese Tradition, die untrennbar mit dem Namen Ulrich Engel verknüpft ist, ist gerade erst in der neu erschienenen deutsch-polnischen kontrastiven Grammatik kulminiert. Im Bereich der kontrastiven Grammatiken zu Sprachenpaaren, von denen das Deutsche ein Element ist, verfügt das IDS also über eine vergleichsweise reiche Tradition. Am IDS oder in Kooperation mit dem IDS wurden kontrastive Grammatiken zu den Sprachenpaaren Deutsch – Französisch (Zemb 1978), Deutsch – Serbokroatisch , Deutsch – Spanisch (Cartegena/Gauger 1989), Deutsch – Rumänisch (Engel u.a. 1993) erarbeitet. Zum Sprachenpaar Englisch – Deutsch liegt mit Hawkins 1986 eine typologisch-vergleichende Grammatik vor. Die deutsch-polnische kontrastive Grammatik, die unter der Leitung von Ulrich Engel erarbeitet wurde, ist 1999 erscheinen. Abraham 1994 und Glinz 1994 konfrontieren das Deutsche, mit durchaus unterschiedlicher Akzentsetzung, mit mehreren anderen europäischen Sprachen. An der Berliner Humboldt-Universität laufen derzeit die Vorarbeiten zu einer deutsch-russischen kontrastiven Grammatik (Initiative Wolfgang Gladrow und Michail Kotin). Die Aufgabe einer 'Grammatik des Deutschen im europäischen Kontext' ist also hinlänglich vorbereitet.
This paper is part of a research project on OT Syntax and the typology of the free relative (FR) construction. It concentrates on the details of an OT analysis and some of its consequences for OT syntax. I will not present a general discussion of the phenomenon and the many controversial issues it is famous for in generative syntax.
Quantitative evaluation of parsers has traditionally centered around the PARSEVAL measures of crossing brackets, (labeled) precision, and (labeled) recall. However, it is well known that these measures do not give an accurate picture of the quality of the parsers output. Furthermore, we will show that they are especially unsuited for partial parsers. In recent years, research has concentrated on dependencybased evaluation measures. We will show in this paper that such a dependency-based evaluation scheme is particularly suitable for partial parsers. TüBa-D, the treebank used here for evaluation, contains all the necessary dependency information so that the conversion of trees into a dependency structure does not have to rely on heuristics. Therefore, the dependency representations are not only reliable, they are also linguistically motivated and can be used for linguistic purposes.
This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in an increased accuracy of annotation.
Die Sprachsituation der deutschen Schweiz, wo die Mundarten den großen Teil der gesprochenen Sprachrealität darstellen, bietet ein weites Feld für Erforschung der gesprochenen Sprache. Die starke Position der Mundarten und die weitgehend mündliche Überlieferung machen sie für die Sprachwandelforschung interessant. Nachdem die Erforschung von Sprachwandel lange auf der Rekonstruktion gesprochener Sprache aus Schriftzeugnissen beschränkt war, kann seit dem wissenschaftlich reflektierten Festhalten gesprochener Sprache in Transkripten und seit der Möglichkeit zur Tonarchivierung auf historische Zeugnisse gesprochener Sprache zurückgegriffen werden. So kann die primäre Sprachform berücksichtigt werden. Denn obwohl Lautwandel lange der zentrale Bereich der Sprachgeschichtsschreibung war und die Sprachgeschichtsschreibung weitgehend vom "Primat des Sprechens" (Sonderegger 1979, 11) ausgegangen war, musste sie sich lange mit Schriftzeugnissen abfinden, die nur Reflexe gesprochener Sprache darstellten.
The definition of similarity between sentences is formulated on the levels of words, POS tags, and chunks (Abney 91; Abney 96). The evaluation of this approach shows that while precision and recall based on the PARSEVAL measures (Black et al. 91) do not reach state of the art Parsers yet (F1=87.19 on syntactic constituents, F1=77.78 including functionargument structure), the parser shows a very reliable performance where function-argument structure is concerned (F1=96.52). The lower F-scores are very often due to unattached constituents.
This paper addresses the problem ofconstraints for relative quantifier sope, in partiular in inverse linking readings wherecertain scope orders are exluded. We show how to account for such restrictions in the Tree Adjoining Grammar (TAG) framework by adopting a notion offlexible composition. In the semantics we use for TAG we introduce quantifier sets that group quantifiers that are "glued" together in the sense that no other quantifieran scopally intervene between them. Theflexible composition approach allows us to obtain the desired quantifier sets and thereby the desiredconstraints for quantifier sope.
This paper argues for a particular architecture of OT syntax. This architecture hasthree core features: i) it is bidirectional, the usual production-oriented optimisation (called ‘first optimisation’ here) is accompanied by a second step that checks the recoverability of an underlying form; ii) this underlying form already contains a full-fledged syntactic specification; iii) especially the procedure checking for recoverability makes crucial use of semantic and pragmatic factors. The first section motivates the basic architecture. The second section shows with two examples, how contextual factors are integrated. The third section examines its implications for learning theory, and the fourth section concludes with a broader discussion of the advantages and disadvantages of the proposed model.
Is language the key to number? This article argues that the human language faculty provides the cognitive equipment that enables humans to develop a systematic number concept. Crucially, this concept is based on non-iconic representations that involve relations between relations: relations between numbers are linked with relations between objects. In contrast to this, language-independent numerosity concepts provide only iconic representations. The pattern of forming relations between relations lies at the heart of our language faculty, suggesting that it is language that enables humans to make the step from these iconic representations, which we share with other species, to a generalised concept of number.
Sino-Tibetan is a prime example of how strongly a language family can typologically diversify under the pressure of areal spread features (Matisoff 1991, 1999). One of the manifestation of this is the average length of prosodic words. In Southeast Asia, prosodic words tend to average on one or one-and-a-half syllables. In the Himalayas, by contrast, it is not uncommon to encounter prosodic words containing five to ten syllables. The following pair of examples illustrates this.
LTAG semantics for questions
(2004)
This papers presents a compositional semantic analysis of interrogatives clauses in LTAG (Lexicalized Tree Adjoining Grammar) that captures the scopal properties of wh- and nonwh-quantificational elements. It is shown that the present approach derives the correct semantics for examples claimed to be problematic for LTAG semantic approaches based on the derivation tree. The paper further provides an LTAG semantics for embedded interrogatives.
Weak function word shift
(2004)
The fact that object shift only affects weak pronouns in mainland Scandinavian is seen as an instance of a more general observation that can be made in all Germanic languages: weak function words tend to avoid the edges of larger prosodic domains. This generalisation has been formulated within Optimality Theory in terms of alignment constraints on prosodic structure by Selkirk (1996) in explaining thedistribution of prosodically strong and weak forms of English functionwords, especially modal verbs, prepositions and pronouns. But a purely phonological account fails to integrate the syntactic licensing conditions for object shift in an appropriate way. The standard semantico-syntactic accounts of object shift, onthe other hand, fail to explain why it is only weak pronouns that undergo object shift. This paper develops an Optimality theoretic model of the syntax-phonology interface which is based on the interaction of syntactic and prosodic factors. The account can successfully be applied to further related phenomena in English and German.
The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison between the annotation schemes of the two treebanks focuses on the different treatments of free word order and discontinuous constituents in German as well as on differences in phrase-internal annotation.
Tree-local MCTAG with shared nodes : an analysis of word order variation in German and Korean
(2004)
Tree Adjoining Grammars (TAG) are known not to be powerful enough to deal with scrambling in free word order languages. The TAG-variants proposed so far in order to account for scrambling are not entirely satisfying. Therefore, an alternative extension of TAG is introduced based on the notion of node sharing. Considering data from German and Korean, it is shown that this TAG-extension can adequately analyse scrambling data, also in combination with extraposition and topicalization.
This paper reports on the SYN-RA (SYNtax-based Reference Annotation) project, an on-going project of annotating German newspaper texts with referential relations. The project has developed an inventory of anaphoric and coreference relations for German in the context of a unified, XML-based annotation scheme for combining morphological, syntactic, semantic, and anaphoric information. The paper discusses how this unified annotation scheme relates to other formats currently discussed in the literature, in particular the annotation graph model of Bird and Liberman (2001) and the pie-in-thesky scheme for semantic annotation.
The purpose of this paper is to describe recent developments in the morphological, syntactic, and semantic annotation of the TüBa-D/Z treebank of German. The TüBa-D/Z annotation scheme is derived from the Verbmobil treebank of spoken German [4, 10], but has been extended along various dimensions to accommodate the characteristics of written texts. TüBa-D/Z uses as its data source the "die tageszeitung" (taz) newspaper corpus. The Verbmobil treebank annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The primary ordering principle of a clause is the inventory of topological fields, which characterize the word order regularities among different clause types of German, and which are widely accepted among descriptive linguists of German [3, 6]. The TüBa-D/Z annotation relies on a context-free backbone (i.e. proper trees without crossing branches) of phrase structure combined with edge labels that specify the grammatical function of the phrase in question. The syntactic annotation scheme of the TüBa-D/Z is described in more detail in [12, 11]. TüBa-D/Z currently comprises approximately 15 000 sentences, with approximately 7 000 sentences being in the correction phase. The latter will be released along with an updated version of the existing treebank before the end of this year. The treebank is available in an XML format, in the NEGRA export format [1] and in the Penn treebank bracketing format. The XML format contains all types of information as described above, the NEGRA export format contains all sentenceinternal information while the Penn treebank format includes only those layers of information that can be expressed as pure tree structures. Over the course of the last year, more fine grained linguistic annotations have been added along the following dimensions: 1. the basic Stuttgart-Tübingen tagset, STTS, [9] labels have been enriched by relevant features of inflectional morphology, 2. named entity information has been encoded as part of the syntactic annotation, and 3. a set of anaphoric and coreference relations has been added to link referentially dependent noun phrases. In the following sections, we will describe each of these innovations in turn and will demonstrate how the additional annotations can be incorporated into one comprehensive annotation scheme.
Davidsonian event semantics has an impressive track record as a framework for natural language analysis. In recent years it has become popular to assume that not only action verbs but predicates of all sorts have an additional event argument. Yet, this hypothesis is not without controversy in particular wrt the particularly challenging case of statives. Maienborn (2003a, 2004) argues that there is a need for distinguishing two kinds of states. While verbs such as sit, stand, sleep refer to eventualities in the sense of Davidson (= Davidsonian states), the states denoted by such stative verbs like know, weigh,and own, as well as any combination of copula plus predicate are of a different ontological type (= Kimian states). Against this background, the present study assesses the two main arguments that have been raised in favour of a Davidsonian approach for statives. These are the combination with certain manner adverbials and Parsons (2000) so-called time travel argument. It will be argued that the manner data which, at first sight, seem to provide evidence for a Davidsonian approach to statives are better analysed as non-compositional reinterpretations triggered by the lack of a regular Davidsonian event argument. As for Parsons´s time travel argument, it turns out that the original version does not supply the kind of support for the Davidsonian approach that Parsons supposed. However, properly adapted, the time travel argument may provide additional evidence for the need of reifying the denotatum of statives, as suggested by the assumption of Kimian states.
This paper sets up a framework for LTAG (Lexicalized Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches addressing some shortcomings of TAG semantics based on the derivation tree. Within this framework, several sample analyses are proposed, and it is shown that the framework allows to analyze data that have been claimed to be problematic for derivation tree based LTAG semantics approaches.
On embedded implicatures
(2004)
The Gricean approach explains implicatures by assumptions about the pragmatics of entire utterances. The phenomenon of embedded implicatures remains a challenge for this approach since in such cases apparently implicatures contribute to the truth-conditional content of constituents smaller than utterances. In this paper, I investigate three areas where embedded implicatures seem to differ from implicatures at the utterance level: optionality, epistemic status, and implicated presuppositions. I conclude that the differences between the two kinds of implicatures justify an approach that maintains Gricean assumptions at the utterance level, and assumes a special operator for embedded implicatures.
Die Ressource "Wissen" rückte in den letzten Jahrzehnten als Quelle wissenschaftlicher Innovation immer stärker ins Zentrum des Interesses. Diese Fokussierung mündete in eine Selbstreflexion der Wissenschaft und der wissenschaftlichen Disziplinen: Thematisiert werden vor allem die Art und Weise, wie Wissen gewonnen wird, sowie die damit zusammenhängende Frage nach der Konstruktion von Wissenschaftlichkeit, womit das Bewusstsein gleichzeitig auf die mehr und mehr sich auflösende Abgrenzung zwischen den Disziplinen beziehungsweise zwischen den drei hauptsächlichen Wissenschaftskulturen, von Natur-, Geistes- und Kultur- sowie Sozialwissenschaften gelenkt wird. Innerhalb und außerhalb der Universitäten bildeten und bilden sich nicht immer klar verortbare "trading zones" (Gallison 1997), in denen neue Formen und Techniken der Wissensproduktion und Wissensvermittlung geprüft, geübt und teilweise auch institutionalisiert werden. ...
The aim of this paper is the exploration of an optimality theoretic architecture for syntax that is guided by the concept of "correspondence": syntax is understood as the mechanism of "translating" underlying representations into a surface form. In minimalism, this surface form is called "Phonological Form" (PF). Both semantic and abstract syntactic information are reflected by the surface form. The empirical domain where this architecture is tested are minimal link effects, especially in the case of "wh"-movement. The OT constraints require the surface form to reflect the underlying semantic and syntactic representations as maximally as possible. The means by which underlying relations and properties are encoded are precedence, adjacency, surface morphology and prosodic structure. Information that is not encoded in one of these ways remains unexpressed, and gets lost unless it is recoverable via the context. Different kinds of information are often expressed by the same means. The resulting conflicts are resolved by the relative ranking of the relevant correspondence constraints.
The argument that I tried to elaborate on in this paper is that the conceptual problem behind the traditional competence/performance distinction does not go away, even if we abandon its original Chomskyan formulation. It returns as the question about the relation between the model of the grammar and the results of empirical investigations – the question of empirical verification The theoretical concept of markedness is argued to be an ideal correlate of gradience. Optimality Theory, being based on markedness, is a promising framework for the task of bridging the gap between model and empirical world. However, this task not only requires a model of grammar, but also a theory of the methods that are chosen in empirical investigations and how their results are interpreted, and a theory of how to derive predictions for these particular empirical investigations from the model. Stochastic Optimality Theory is one possible formulation of a proposal that derives empirical predictions from an OT model. However, I hope to have shown that it is not enough to take frequency distributions and relative acceptabilities at face value, and simply construe some Stochastic OT model that fits the facts. These facts first of all need to be interpreted, and those factors that the grammar has to account for must be sorted out from those about which grammar should have nothing to say. This task, to my mind, is more complicated than the picture that a simplistic application of (not only) Stochastic OT might draw.
Sprachwahl und Sprachwahrnehmung sind im Deutschen unabdingbar geprägt durch das Wissen von einer Standardsprache. Dieses Wissen basiert für die meisten Sprecher auf der Erfahrung, dass in der Schule manche sprachliche Formen als korrekt, andere als falsch bewertet werden, außerdem auf der Tatsache, dass es Fixierungen der Regeln des Standards in Lexika und Grammatiken gibt. Wissen und Anerkennung dieses Standards sind unabhängig davon, dass keine dieser Kodifikationen unumstritten ist, dass viele Sprecher die Regeln nicht genau kennen und dass als Vorbilder anerkannte Personen (Nachrichtensprecher, Journalisten bestimmter Zeitschriften, Lehrer, Literaten u.a.) keineswegs einheitliche Regeln verfolgen. Der Standard ist fest assoziiert mit der Erfahrung einer legitimen Regelhaftigkeit, also mit Ordnung. Verwendung von Nonstandard wird mit Bezug auf diese Ordnung und von ihr unterschieden wahrgenommen. Diese relationale Sicht der Dinge ist sowohl subjektiv als auch intersubjektiv.
In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The fact that most parsers are solely evaluated on this specific data set leaves the question unanswered how much these results depend on the annotation scheme of the treebank. In this paper, we will investigate the influence which different decisions in the annotation schemes of treebanks have on parsing. The investigation uses the comparison of similar treebanks of German, NEGRA and TüBa-D/Z, which are subsequently modified to allow a comparison of the differences. The results show that deleted unary nodes and a flat phrase structure have a negative influence on parsing quality while a flat clause structure has a positive influence.
This paper develops a framework for TAG (Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches.Then, within this framework, an analysis of scope is proposed that accounts for the different scopal properties of quantifiers, adverbs, raising verbs and attitude verbs. Finally, including situation variables in the semantics, different situation binding possibilities are derived for different types of quantificational elements.