Linguistik
This paper is one argument for a theory of grammatical relations in Chinese in which there are no grammatical relations beyond semantic roles, and no lexical relation-changing rules. As the passive rule is one of the most common relation-changing rules cross-linguistically, in this paper I will address the question of whether or not Mandarin Chinese has lexical passives, that is, passives defined as in Relational Grammar (see for example Perlmutter and Postal 1977) and the early Lexical Functional Grammar (LFG) literature (e.g. Bresnan 1982), where a 2-arc (object) is promoted to a 1-arc (subject).
Thirty-one years ago Tsu-lin Mei (1961) argued against the traditional doctrine that saw the subject-predicate distinction in grammar as parallel to the particular-universal distinction in logic, as he said it was a reflex of an Indo-European bias, and could not be valid, as ‘Chinese ... does not admit a distinction into subject and predicate’ (p. 153). This has not stopped linguists working on Chinese from attempting to define ‘subject’ (and ‘object’) in Chinese. Though a number of linguists have lamented the difficulties in trying to define these concepts for Chinese (see below), most work done on Chinese still assumes that Chinese must have the same grammatical features as Indo-European, such as having a subject and a direct object, though no attempt is made to justify that view. This paper challenges that view and argues that there has been no grammaticalization of syntactic functions in Chinese. The correct assignment of semantic roles to the constituents of a discourse is done by the listener on the basis of the discourse structure and pragmatics (information flow, inference, relevance, and real world knowledge) (cf. Li & Thompson 1978, 1979; LaPolla 1990).
This book is a full reference grammar of Qiang, one of the minority languages of southwest China, spoken by about 70,000 Qiang and Tibetan people in Aba Tibetan and Qiang Autonomous Prefecture in northern Sichuan Province. It belongs to the Qiangic branch of Tibeto-Burman (one of the two major branches of Sino-Tibetan). The dialect presented in the book is the Northern Qiang variety spoken in Ronghong Village, Yadu Township, Chibusu District, Mao County. This book, the first book-length description of the Qiang language in English, is the result of many years of work on the language.
A commonly held view in the literature on Scrambling and Clitic Doubling is that both constructions are sensitive to Specificity. For this reason Sportiche (1992) proposes to unify the two, an approach which has become quite standard in the relevant literature ever since. However, the claim that clitic doubling is the counterpart of Germanic scrambling has never been substantiated. In this paper we present extensive evidence from Greek that Clitic Doubling has common formal properties with Germanic Scrambling/Object Shift. Our evidence consists mainly of binding facts observed when doubling takes place, which seem, at first sight, to be completely unexpected. On closer inspection, however, it turns out that these facts are strongly reminiscent of the effects showing up in Germanic scrambling. We propose that these properties can be derived under a theory of clitic constructions along the lines of Sportiche (1992) implemented in the framework of Chomsky (1995). Finally, we suggest that the crosslinguistic distribution of Scrambling as opposed to Clitic Doubling should be linked to a parameter relating to properties of Agr: Move/Merge XP vs. Move/Merge X° to Agr. We show that this parameter unifies the behaviour of subjects and objects within a language and across languages. The paper is organised as follows. In section 2 we present evidence from binding, interpretational and prosodic effects that doubling and scrambling display very similar properties. In section 3 we present Sportiche's account and point out some problems for it. In section 4 we present our proposal.
A lot of interest has recently been paid to constraint-based definitions and extensions of Tree Adjoining Grammars (TAG). Examples are the so-called quasi-trees, D-Tree Grammars and Tree Description Grammars (TDG). The latter are grammars consisting of a set of formulas denoting trees. TDGs are derivation-based: in each derivation step, a conjunction is built from the old formula, a formula of the grammar, and additional equivalences between node names of the two formulas. This formalism is more powerful than TAGs. TDGs offer the advantages of MC-TAG and D-Tree Grammars for natural languages, and they allow underspecification. However, TDGs might be unnecessarily powerful for natural languages. To solve this problem, I propose local TDGs, a restricted version of TDGs. Local TDGs still have the advantages of TDGs, but they are semilinear and therefore more appropriate for natural languages. First, the notion of semilinearity is defined. Then local TDGs are introduced, and, finally, semilinearity of local Tree Description Languages is proven.
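For orientation, semilinearity can be stated as follows. This is the standard textbook formulation via Parikh images, given here as a sketch; the symbols $c$, $p_i$, $\psi$ and $\Sigma$ are notational assumptions and may differ from the definition used in the paper itself.

```latex
% A set S \subseteq \mathbb{N}^n is linear if, for some fixed vectors
% c, p_1, \dots, p_k \in \mathbb{N}^n,
\[
  S = \Bigl\{\, c + \sum_{i=1}^{k} n_i\, p_i \;\Bigm|\; n_1, \dots, n_k \in \mathbb{N} \,\Bigr\}.
\]
% A set is semilinear if it is a finite union of linear sets.
% For an alphabet \Sigma = \{a_1, \dots, a_n\}, the Parikh mapping counts letters:
\[
  \psi(w) = \bigl(|w|_{a_1}, \dots, |w|_{a_n}\bigr), \qquad w \in \Sigma^{*}.
\]
% A language L \subseteq \Sigma^{*} is semilinear if \psi(L) = \{\psi(w) \mid w \in L\}
% is a semilinear subset of \mathbb{N}^n.
```

Under this definition, the word lengths of a semilinear language grow in at most linear steps, which is the property usually cited as making a formalism appropriate for natural languages.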
In syntax, the trend nowadays is towards lexicalized grammar formalisms. It is now widely accepted that dividing words into word classes may serve as a labor-saving mechanism - but at the same time, it discards all detailed information on the idiosyncratic behavior of words. And that is exactly the type of information that may be necessary in order to parse a sentence. For learning approaches, however, lexicalized grammars represent a challenge for the very reason that they include so much detailed and specific information, which is difficult to learn. This paper will present an algorithm for learning a link grammar of German. The problem of data sparseness is tackled by using all the available information from partial parses as well as from an existing grammar fragment and a tagger. This is a report about work in progress, so there are no representative results available yet.
A hierarchy of local TDGs
(1998)
Many recent variants of Tree Adjoining Grammars (TAG) allow an underspecification of the parent relation between nodes in a tree, i.e. they do not deal with fully specified trees as is the case with TAGs. Such TAG variants are, for example, Description Tree Grammars (DTG), Unordered Vector Grammars with Dominance Links (UVG-DL), a definition of TAGs via so-called quasi-trees, and Tree Description Grammars (TDG). The last TAG variant, local TDG, is an extension of TAG generating tree descriptions. Local TDGs even allow an underspecification of the dominance relation between node names and thereby provide the possibility to generate underspecified representations for structural ambiguities such as quantifier scope ambiguities. This abstract deals with formal properties of local TDGs. A hierarchy of local TDGs is established together with a pumping lemma for local TDGs of a certain rank.
This paper proposes a compositional semantics for lexicalized tree adjoining grammars (LTAG). Tree-local multicomponent derivations allow the semantic contribution of a lexical item to be separated into one component contributing to the predicate-argument structure and a second component contributing to scope semantics. Based on this idea, a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.
In this paper we investigate Greek, an optional clitic doubling language not subject to Kayne's generalization (Jaeggli 1982), and we argue that in this language, doubled DPs are in A-positions. We propose that Greek clitics are formal features that move, permitting DPs in argument positions. This leads to a typology according to which there are two types of clitic/agreement languages, configurational and nonconfigurational ones, depending upon whether clitics are instantiations of formal features or not.
This paper proposes a corpus encoding standard that meets the needs of linguistic research using a variety of linguistic data structures. The standard was developed in SFB 441, a research project at the University of Tuebingen. The principal concern of SFB 441 is the empirical data structures which feed into linguistic theory building. SFB 441 consists of several projects, most of which are building corpora to empirically investigate various linguistic phenomena in various languages (e.g. modal verbs in German, forms of address and politeness in Russian). These corpora will form the components of the "Tuebingen collection of reusable, empirical, linguistic data structures (TUSNELDA)". The TUSNELDA annotation standard aims at providing a uniform encoding scheme for all subcorpora and texts of TUSNELDA such that they can be processed with uniform standardized tools. To guarantee maximal reusability we use XML for encoding. Previous SGML standards for text encoding were provided by the Text Encoding Initiative (TEI) and the Expert Advisory Group on Language Engineering Standards (Corpus Encoding Standard, CES). The TUSNELDA standard is based on TEI and XCES (XML version of CES) but takes into account the specific needs of the SFB projects, i.e. the peculiarities of the examined languages and linguistic phenomena.
Existing analyses of German scrambling phenomena within TAG-related formalisms all use non-local variants of TAG. However, there are good reasons to prefer local grammars, in particular with respect to the use of the derivation structure for semantics. Therefore, this paper proposes to use local TDGs, a TAG variant generating tree descriptions that shows a local derivation structure. However, the construction of minimal trees for the derived tree descriptions is not subject to any locality constraint. This provides just the amount of non-locality needed for an adequate analysis of scrambling. To illustrate this, a local TDG for some German scrambling data is presented.
In this paper, we investigate the role of sub-optimality in training data for part-of-speech tagging. In particular, we examine to what extent the size of the training corpus and certain types of errors in it affect the performance of the tagger. We distinguish four types of errors: If a word is assigned a wrong tag, this tag can belong to the ambiguity class of the word (i.e. to the set of possible tags for that word) or not; furthermore, the major syntactic category (e.g. "N" or "V") can be correctly assigned (e.g. if a finite verb is classified as an infinitive) or not (e.g. if a verb is classified as a noun). We empirically explore the decrease of performance that each of these error types causes for different sizes of the training set. Our results show that those types of errors that are easier to eliminate have a particularly negative effect on the performance. Thus, it is worthwhile concentrating on the elimination of these types of errors, especially if the training corpus is large.
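The four error types described above arise from two independent yes/no questions, which can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the example lexicon, the STTS-style tags, and the shortcut of reading the major category off the first letter of the tag are all assumptions made for the example.

```python
def major_category(tag: str) -> str:
    """Assumed convention: the first letter of a tag names its major category."""
    return tag[0]


def classify_error(word, gold_tag, assigned_tag, ambiguity_classes):
    """Classify a tagging error along the two dimensions from the abstract.

    Returns a pair (in_ambiguity_class, same_major_category):
      in_ambiguity_class  -- the wrong tag is one of the possible tags of the word
      same_major_category -- the wrong tag keeps the major syntactic category
    The four combinations of the two booleans are the four error types.
    """
    if gold_tag == assigned_tag:
        raise ValueError("not an error: assigned tag equals gold tag")
    in_class = assigned_tag in ambiguity_classes.get(word, set())
    same_major = major_category(gold_tag) == major_category(assigned_tag)
    return in_class, same_major


# Hypothetical lexicon: each word maps to its set of possible tags.
ambiguity_classes = {"gehen": {"VVFIN", "VVINF"}}

# Finite verb tagged as infinitive: inside the ambiguity class, category kept.
print(classify_error("gehen", "VVFIN", "VVINF", ambiguity_classes))
# Verb tagged as noun: outside the ambiguity class, category changed.
print(classify_error("gehen", "VVFIN", "NN", ambiguity_classes))
```

Keeping the two dimensions as separate booleans makes it straightforward to count each error type separately when corrupting or auditing a training corpus.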
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning function-argument structure. The present paper offers a similarity-based algorithm for assigning functional labels such as subject, object, head, complement, etc. to complete syntactic structures on the basis of prechunked input. The evaluation of the algorithm has concentrated on measuring the quality of functional labels. It was performed on a German and an English treebank using two different annotation schemes at the level of function-argument structure. The results of 89.73% correct functional labels for German and 90.40% for English validate the general approach.
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. The TüSBL parser extends current chunk parsing techniques by a tree-construction component that extends partial chunk parses to complete tree structures including recursive phrase structure as well as function-argument structure. TüSBL's tree-construction algorithm relies on techniques from memory-based learning that allow similarity-based classification of a given input structure relative to a pre-stored set of tree instances from a fully annotated treebank. A quantitative evaluation of TüSBL has been conducted using a semi-automatically constructed treebank of German that consists of approximately 67,000 fully annotated sentences. The basic PARSEVAL measures were used although they were developed for parsers that have as their main goal a complete analysis that spans the entire input. This runs counter to the basic philosophy underlying TüSBL, which has as its main goal robustness of partially analyzed structures.
This paper is part of a research project on OT Syntax and the typology of the free relative (FR) construction. It concentrates on the details of an OT analysis and some of its consequences for OT syntax. I will not present a general discussion of the phenomenon and the many controversial issues it is famous for in generative syntax.
Quantitative evaluation of parsers has traditionally centered around the PARSEVAL measures of crossing brackets, (labeled) precision, and (labeled) recall. However, it is well known that these measures do not give an accurate picture of the quality of the parser's output. Furthermore, we will show that they are especially unsuited for partial parsers. In recent years, research has concentrated on dependency-based evaluation measures. We will show in this paper that such a dependency-based evaluation scheme is particularly suitable for partial parsers. TüBa-D, the treebank used here for evaluation, contains all the necessary dependency information so that the conversion of trees into a dependency structure does not have to rely on heuristics. Therefore, the dependency representations are not only reliable, they are also linguistically motivated and can be used for linguistic purposes.
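The PARSEVAL measures named above can be sketched in a few lines, which also gives a rough sense of how a partial parse fares under them. This is a simplified illustration under stated assumptions: constituents are represented as hypothetical `(label, start, end)` triples, duplicate brackets are collapsed by using sets, and the function names are invented for the example rather than taken from any evaluation tool.

```python
def labeled_precision_recall(gold, predicted):
    """Labeled precision and recall over (label, start, end) brackets."""
    gold_set, pred_set = set(gold), set(predicted)
    matched = len(gold_set & pred_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    return precision, recall


def crossing_brackets(gold, predicted):
    """Count predicted brackets whose span crosses at least one gold span."""
    def crosses(a, b):
        (s1, e1), (s2, e2) = a, b
        # Two spans cross if they overlap without one containing the other.
        return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1

    gold_spans = {(s, e) for _, s, e in gold}
    return sum(
        1
        for _, s, e in predicted
        if any(crosses((s, e), g) for g in gold_spans)
    )


# Hypothetical five-token sentence with a full gold analysis.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5)]
# A partial parser that only recognizes the noun chunk.
partial = [("NP", 0, 2)]

precision, recall = labeled_precision_recall(gold, partial)
print(precision, recall)
print(crossing_brackets(gold, [("X", 1, 3)]))
```

In this toy case the partial parse makes no mistakes, yet its recall is low simply because it leaves the higher structure unattached.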
This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather, what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in an increased accuracy of annotation.