OPUS 4 | Linguistics

Antecedent selection techniques for high-recall coreference resolution (2007)

We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions which have the same lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.

A constraint-based approach to noun phrase coreference resolution in German newspaper text (2006)

Versley, Yannick

In this paper, we investigate the usefulness of a wide range of features in the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) and as soft constraints (where an accumulation of violations of soft constraints makes it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints, with weights estimated by a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
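
The distinction between hard and soft constraints can be illustrated with a minimal sketch. It is not the system described in the abstract: the constraint functions, the dictionary fields, and the hand-set penalty weights below are hypothetical, whereas the paper estimates its weights with a maximum entropy model.

```python
def rank_antecedents(mention, candidates, hard_constraints, soft_constraints):
    """Filter candidates with hard constraints, then rank by soft-constraint penalties.

    hard_constraints: functions (mention, candidate) -> bool, True if satisfied.
    soft_constraints: pairs (violated, weight), where violated(mention, candidate)
    returns True if the candidate violates the constraint.
    """
    # Hard constraints remove a candidate from the list entirely.
    survivors = [c for c in candidates
                 if all(ok(mention, c) for ok in hard_constraints)]

    # Soft constraints only accumulate penalties; more violations mean a worse rank.
    def penalty(c):
        return sum(w for violated, w in soft_constraints if violated(mention, c))

    return sorted(survivors, key=penalty)


# Hypothetical toy data: a mention and three earlier noun phrases.
mention = {"head": "Haus", "number": "sg", "sentence": 10}
candidates = [
    {"head": "Gebäude", "number": "sg", "sentence": 2},
    {"head": "Haus", "number": "sg", "sentence": 8},
    {"head": "Häuser", "number": "pl", "sentence": 9},
]
hard = [lambda m, c: m["number"] == c["number"]]            # number agreement
soft = [
    (lambda m, c: m["head"] != c["head"], 1.5),             # different lexical head
    (lambda m, c: m["sentence"] - c["sentence"] > 5, 0.5),  # distant candidate
]
ranked = rank_antecedents(mention, candidates, hard, soft)
```

In this toy setup the plural candidate is removed by the hard constraint, and the same-head candidate is ranked above the distant candidate with a different head.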

Von Geisterhand zu Potte gekommen : eine HPSG-Analyse von PPs mit unikaler Komponente (2003)

Söhn, Jan-Philipp

This Magister thesis focuses on prepositional phrases (PPs) whose complement is a unique component (unikale Komponente). These complements are nouns that do not occur outside a PP or that do not have the same meaning in other environments. To describe this phenomenon, an analysis is developed within Head-Driven Phrase Structure Grammar (HPSG). Basic knowledge of the structure and terminology of HPSG is assumed in this thesis; for reference see [PS94]. The thesis is organised as follows: First, the data under investigation are presented in detail. Then various possible analyses within the theory of HPSG are considered, namely selection, construction, and collocation. It turns out that the existing approaches cannot do justice to the data, or can do so only unsatisfactorily. The approach ultimately pursued consists in generalising the already existing selection mechanism via SPEC. This allows the unique NP to select the preposition it occurs with. To this end, some changes, albeit justifiable ones, are made to the HPSG architecture, and it is shown how the data can be handled with the generalised mechanism. This is followed by an extension of the range of phenomena to binomials (Paarformeln). Furthermore, an objection concerning the analysis of the complement as NP or DP is discussed, and, as further motivation for the approach, another local phenomenon, the distribution of the trace, is modelled with the approach presented here. In addition, the question is examined whether PPs with fixed verbs can also be analysed neatly. For this purpose a way of selecting lexemes is introduced and the developed mechanism is extended. This extension is applied in modelling the local distribution of a particle. A summary and an outlook on further research questions conclude the thesis.

The syntax of existential sentences in Serbian (2008)

Hartmann, Jutta M. ; Milicevic, Nataša

Freeze (1992) argued on the basis of data from several different languages that there is a close relationship between existential sentences (stating the existence of an entity) and locative sentences (stating the location of an entity). Freeze (1992) proposes that they are both derived from the same base structure and that the surface differences are rather due to the distinct information structures. This paper argues against this position with data from Serbian existentials, which show clear syntactic differences from the locatives. Thus, the close relationship between existential and locative sentences that Freeze (1992) observes is conceptual, but not (necessarily) part of the syntax of the language. In order to account for the data, we propose that existential sentences originate from a different syntactic predication structure than the locative ones. The existential meaning arises, as we will show, from the interaction of this predication structure with the structure and meaning of the noun phrase.

When negation is not negation (2005)

Milicevic, Nataša

In this paper I will discuss the formation of different types of yes/no questions in Serbian (examples in (1)), focusing on the syntactically and semantically puzzling example (1d), which involves negative auxiliary inversion. Although there is a negative marker on the fronted auxiliary, the construction does not involve sentential negation. This coincides with the fact that the negative quantifying NPIs cannot be licensed. Question formation and sentential negation have similar syntactic effects cross-linguistically. This has led to various attempts to formulate a unifying syntactic account of the phenomena (ever since Klima 1964). One striking fact about the two syntactic contexts is that both license weak NPIs (Negative Polarity Items). It has been suggested (cf. Laka 1990, Culicover 1991) that the derivation of both interrogatives and negatives involves the same type of functional projection, PolP (polarity phrase). One such account of the formation of negative interrogatives in Serbo-Croatian is offered by Progovac (2005). She proposes that there are two PolPs optionally co-occurring in the same clause, in which both positive and negative polarity items check their positive or negative features (following Haegeman and Zanuttini's (1991) feature-checking account of negative structures, and the insights of Brown (1999) on negation in Russian). On her account, the negative auxiliary question in (1d) is the case in which both polarity phrases are present. The higher one has [-pos +neg] features, and the lower one (below TP) is [-pos -neg]. Although her account correctly predicts the ungrammaticality of (2a) in contrast with (1c), it wrongly predicts (2b) to be grammatical. I will argue that Progovac’s theory regarding the nature of the PolP is wrong. It employs both the binary feature valuation on the polarity head and the hierarchical ordering of the two polarity phrases, which eventually leads to overgeneration.
On the account presented here the nature of the question marker (li vs. zar) is highly relevant. Notice that (1b) and (1d) express presuppositions regarding the truth value of the propositions. In this way they contrast with (1a) and (1c). In addition, the type (1b) (with the question particle zar) can introduce both the positive and negative presupposition as shown in (3), which, semantically, makes this construction compatible with negative auxiliary questions in English (4a). The polarity items licensed in the relevant structures are also of the same type in both languages. The fronted-negative-auxiliary questions (1d) in Serbian are only possible with the particle li. In this case the presupposition is exclusively positive. The peculiar question/focus marking function of li (in Bulgarian and Russian) is well known. However, it is always assumed that its focus marking role is not relevant for the formation of yes/no questions. This I believe is not correct. The syntactic explanation of the interpretational facts points to the following: A) The possibility of the separate lexical encoding (particle zar) of the ‘rhetorical’ yes/no questions in Serbian allows the embedding of both positive and negated sentences, in which case the (weak) NPIs can remain in local relation with the negated verb. B) Recall that Serbian is an NC language, which requires local/c-command relation between the verbal negative marker and the NPI. With the negative inverted auxiliary questions this condition is not met, and the licensing of an n-word is not possible. C) The impossibility of licensing a weak NPI (i-words in the examples below) is due to the nature of the question marker li.
(1) a. Da li je Vera videla ikoga / nekoga / *nikoga?
       DA Q aux Vera see.part.F.Sg anyone someone noone
       “Did Vera see anyone/someone/noone?”
    b. Zar je Vera videla ikoga / nekoga / *nikoga?
       ZAR aux Vera see.part.F.Sg anyone someone noone
       “Is it really the fact that Vera saw anyone/someone?”
    c. Je li Vera videla ikoga / nekoga / *nikoga?
       aux Q Vera see.part.F.Sg anyone someone noone
       “Did Vera see anyone/someone/noone?”
    d. Nije li Vera videla *ikoga / nekoga / *nikoga?
       neg+aux Q Vera see.part.F.Sg anyone someone noone
       “Didn’t Vera see someone?” / “Vera saw someone, didn’t she?”
(2) a. *Nije li Vera videla nikoga?
       neg+aux Q Vera see.part.F.Sg noone
    b. *Nije li Vera videla ikoga?
       neg+aux Q Vera see.part.F.Sg anyone
(3) a. Zar je Vera videla nekoga / ikoga?
       ZAR aux Vera see.part.F.Sg someone/anyone
    b. Zar Vera nije videla nekoga / nikoga?
       ZAR Vera neg+aux see.part.F.Sg someone/anyone
(4) a. Didn’t Vera (NOT) see someone/anyone?
    b. Vera saw someone, didn’t she?

Review of Regina Pustet : Copulas: universals in the categorization of the lexicon: (Oxford University Press 2003; 262pp) (2007)

Maienborn, Claudia

The renowned Grimm Dictionary (1854-1961) makes the statement that the German copula sein (to be) is “the most general and colourless of all verbal concepts” (der allgemeinste und farbloseste aller verbalbegriffe). A more concise summary of the linguistic issues surrounding the copula is hardly possible. These two properties (and the latent tension between them!) make copulas a particularly interesting and vexing subject of linguistic research. Copulas appear to be almost colourless, i.e., devoid of any concrete meaning, thus leading to the question of why such expressions exist at all, not only in German but in the majority of the world’s languages. And at the same time copulas presumably provide the best window into the core of verbal concepts, thereby telling us what it actually means to be a verb – at least in a language like German or English. While there is a rather rich body of research on copulas in philosophical and formal semantics, including several in-depth studies on the copular systems of individual languages, copulas have received comparatively little attention from a typological perspective. The monograph of Regina Pustet sets out to fill this gap. She presents an extensive cross-linguistic study of copula usage based on a sample of 154 languages drawn from the language families of the world. The analysis is embedded in the theoretical framework of functional typology. The study aims at uncovering universal principles that govern the distribution of copulas in nominal, adjectival, and verbal predications. Its major objective is the development of a “semantically-based model of copula distribution” (p. 62) by means of which the presence vs. absence of copulas can be motivated through the inherent meaning of the lexical items they potentially combine with.
Drawing mainly on the work by Givón (1979, 1984) and Croft (1991, 2001), who provide a functional foundation of the traditional parts of speech, Pustet identifies four semantic parameters which, if taken together, are claimed to support substantial generalisations on copula distribution – within a given language as well as crosslinguistically. These parameters are DYNAMICITY, TRANSIENCE, TRANSITIVITY, and DEPENDENCY. Pustet goes on to argue – and this is in fact the driving force behind the overall monograph – that the distributional behaviour of copulas, in turn, yields a useful methodology for developing a general approach to lexical categorization. Thus, in the long run Pustet aims at contributing to a better understanding of the traditional parts of speech, noun, adjective, and verb by defining them in terms of “semantic feature bundles, which can be arranged in [a] coherent semantic similarity space” (p.193).

On the limits of the Davidsonian approach : the case of copula sentences (2005)

Maienborn, Claudia

Since Donald Davidson’s seminal work “The Logical Form of Action Sentences” (1967), event arguments have become an integral component of virtually every semantic theory. Over the past years Davidson’s proposal has been continuously extended such that nowadays event(uality) arguments are generally associated not only with action verbs but with predicates of all sorts. The reasons for such an extension are seldom explicitly justified. Most problematic in this respect is the case of stative expressions. By taking a closer look at copula sentences the present study assesses the legitimacy of stretching the Davidsonian notion of events and discusses its consequences. A careful application of some standard eventuality diagnostics (perception reports, combination with locative modifiers and manner adverbials) as well as some new diagnostics (behavior of certain degree adverbials) reveals that copular expressions do not behave as expected under a Davidsonian perspective: they fail all eventuality tests, regardless of whether they represent stage-level or individual-level predicates. In this respect, copular expressions pattern with stative verbs like know, hate, and resemble, which in turn differ sharply from state verbs like stand, sit, and sleep. The latter pass all of the eventuality tests and therefore qualify as true “Davidsonian state” expressions. On the basis of these empirical observations and taking up ideas of Kim (1969, 1976) and Asher (1993, 2000), an alternative account of copular expressions (and stative verbs) is provided, according to which the copula introduces a referential argument for a temporally bound property exemplification (= “Kimian state”). Considerations on some logical properties, viz. closure conditions and the latent infinite regress of eventualities, suggest that supplementing Davidsonian eventualities with Kimian states may yield not only a more adequate analysis of copula sentences but also a better understanding of eventualities in general.

On Davidsonian and Kimian states (2004)

Maienborn, Claudia

Davidsonian event semantics has an impressive track record as a framework for natural language analysis. In recent years it has become popular to assume that not only action verbs but predicates of all sorts have an additional event argument. Yet this hypothesis is not without controversy, in particular with respect to the challenging case of statives. Maienborn (2003a, 2004) argues that there is a need for distinguishing two kinds of states. While verbs such as sit, stand, sleep refer to eventualities in the sense of Davidson (= Davidsonian states), the states denoted by stative verbs such as know, weigh, and own, as well as any combination of copula plus predicate, are of a different ontological type (= Kimian states). Against this background, the present study assesses the two main arguments that have been raised in favour of a Davidsonian approach for statives. These are the combination with certain manner adverbials and Parsons's (2000) so-called time travel argument. It will be argued that the manner data which, at first sight, seem to provide evidence for a Davidsonian approach to statives are better analysed as non-compositional reinterpretations triggered by the lack of a regular Davidsonian event argument. As for Parsons's time travel argument, it turns out that the original version does not supply the kind of support for the Davidsonian approach that Parsons supposed. However, properly adapted, the time travel argument may provide additional evidence for the need of reifying the denotatum of statives, as suggested by the assumption of Kimian states.

Modifying (the grammar of) adjuncts : an introduction (2003)

Lang, Ewald ; Maienborn, Claudia ; Fabricius-Hansen, Cathrine

One aspect of the progress being made is that the focus of attention has widened. Adverbials, though still the heart of the matter, now form part of a much larger set of constituent types subsumed under the general syntactic label of adjunct; while modifier has become the semantic counterpart on the same level of generality. So one of the readings of Modifying Adjuncts stands for the focus on this intersection. Moreover, recent years have seen a number of studies which attest an increasing interest in adjunct issues. There is an impressive number of monographs, e.g. Alexiadou (1997), Laenzlinger (1998), Cinque (1999), Pittner (1999), Ernst (2002), which, by presenting in-depth analyses of the syntax of adjuncts, have sharpened the debate on syntactic theorizing. Serious attempts to gain a broader view on adjuncts are witnessed by several collections, see Alexiadou and Svenonius (2000), Austin, Engelberg and Rauh (in progress); of particular importance are the contributions to vol. 12.1 of the Italian Journal of Linguistics (2000), a special issue on adverbs, the Introductions to which by Corver and Delfitto (2000) and Delfitto (2000) may be seen as the best state-of-the-art article on adverbs and adverbial modification currently on the market. To try and test a fresh view on adjuncts was the leitmotif of the Oslo Conference “Approaching the Grammar of Adjuncts” (Sept 22–25, 1999), which provided the initial forum for the papers contained in this volume and initiated a period of discussion and continuing interaction among the contributors, from which the versions published here have greatly profited. The aim of the Oslo conference, and hence the focus of the present volume, was to encourage syntacticians and semanticists to open their minds to a more integrative approach to adjuncts, thereby paying attention to, and attempting to account for, the various interfaces that the grammar of adjuncts crucially embodies. 
From this perspective, the present volume is to be conceived of as an interim balance of current trends in modifying the views on adjuncts. In introducing the papers, we will refrain from rephrasing the abstracts, but will instead offer a guided tour through the major problem areas they are tackling. Assessed by thematic convergence and mutual reference, the contributions form four groups, which led us to arrange them into subparts of the book. Our commenting on these is intended (i) to provide a first glance at the contents, (ii) to reveal some of the reasons why adjuncts indeed are, and certainly will remain, a challenging issue, and thereby (iii) to show some facets of what we consider novel and promising approaches.

Eventualities and different things : a reply (2005)

Maienborn, Claudia

“Comments are very welcome!” This basic attitude and the many ways of implementing it contribute immensely to the fascination of engaging in scientific research. I am grateful to Theoretical Linguistics for providing a public platform for this kind of scholarly exchange and I thank all commentators for their thoughtful, stimulating, and often challenging contributions to my target article. My response will address two main issues that are raised by the commentaries. The first issue is shaped by a cluster of questions relating to ontology. The second issue concerns questions of methodology pertaining in particular to the problem of judging data.

Event-internal modifiers : semantic underspecification and conceptual interpretation (2003)

Maienborn, Claudia

The article offers evidence that there are two variants of adverbial modification that differ with respect to the way in which a modifier is linked to the verb's eventuality argument. So-called event-external modifiers relate to the full eventuality, whereas event-internal modifiers relate to some integral part of it. The choice between external and internal modification is shown to be dependent on the modifier's syntactic base position. Event-external modifiers are base-generated at the VP periphery, whereas event-internal modifiers are base-generated at the V periphery. These observations are accounted for by a refined version of the standard Davidsonian approach to adverbial modification according to which modification is mediated by a free variable. In the case of external modification, the grammar takes responsibility for identifying the free variable with the verb's eventuality argument, whereas in the case of internal modification, a value for the free variable is determined by the conceptual system on the basis of contextually salient world knowledge. For the intriguing problem that certain locative modifiers occasionally seem to have nonlocative (instrumental, positional, or manner) readings, the advocated approach can provide a rather simple solution.

Das Zustandspassiv : grammatische Einordnung – Bildungsbeschränkungen – Interpretationsspielraum (2005)

Maienborn, Claudia

Against a Davidsonian analysis of copula sentences (2003)

Maienborn, Claudia

Semantic research over the past three decades has provided impressive confirmation of Donald Davidson's famous claim that “there is a lot of language we can make systematic sense of if we suppose events exist” (Davidson 1980:137). Nowadays, Davidsonian event arguments are no longer reserved only for action verbs (as Davidson originally proposed) or even only for the category of verbs, but instead are widely assumed to be associated with any kind of predicate (e.g. Higginbotham 2000, Parsons 2000). The following quotation from Higginbotham and Ramchand (1997) illustrates the reasoning that motivates this move: “Once we assume that predicates (or their verbal, etc. heads) have a position for events, taking the many consequences that stem therefrom, as outlined in publications originating with Donald Davidson (1967), and further applied in Higginbotham (1985, 1989), and Terence Parsons (1990), we are not in a position to deny an event-position to any predicate; for the evidence for, and applications of, the assumption are the same for all predicates.” (Higginbotham and Ramchand 1997:54) In fact, since Davidson’s original proposal the burden of proof for postulating event arguments seems to have shifted completely, leading Raposo and Uriagereka (1995), for example, to the following verdict: “it is unclear what it means for a predicate not to have a Davidsonian argument” (Raposo and Uriagereka 1995:182). That is, Davidsonian eventuality arguments apparently have become something like a trademark for predicates in general. The goal of the present paper is to subject this view of the relationship between predicates and events to real scrutiny. By taking a closer look at the simplest independent predicational structure – viz. copula sentences – I will argue that current Davidsonian approaches tend to stretch the notion of events too far, thereby giving up much of its linguistic and ontological usefulness.
More specifically, the paper will tackle the following three questions: 1. Do copula sentences support the current view of the inherent event-relatedness of predicates? 2. If not, what is a possible alternative to an event-based analysis of copula sentences? 3. What does this tell us about Davidsonian events? The paper is organized as follows: Section 2 first reviews current event-based analyses of copula sentences and then gives a brief summary of the Davidsonian notion of events. Section 3 examines the behavior of copula sentences with respect to some standard (as well as some new) eventuality diagnostics. Copula expressions will turn out to fail all eventuality tests. They differ sharply from state verbs like stand, sit, sleep in this respect. (The latter pass all eventuality tests and therefore qualify as true “Davidsonian state” expressions.) On the basis of these observations, section 4 provides an alternative account of copula sentences that combines Kim’s (1969, 1976) notion of property exemplifications with Asher’s (1993, 2000) conception of abstract objects. Specifically, I will argue that the copula introduces a referential argument for a temporally bound property exemplification (= “Kimian state”). The proposal is implemented within a DRT framework. Finally, section 5 offers some concluding remarks and suggests that supplementing Davidsonian eventualities by Kimian states not only yields a more adequate analysis for copula expressions and the like but may also improve our treatment of events.

A pragmatic explanation of the stage level/individual level contrast in combination with locatives (2004)

Maienborn, Claudia

One important difference between stage level predicates (SLPs) and individual level predicates (ILPs) is their behavior with respect to locative modifiers. It is commonly assumed that SLPs but not ILPs combine with locatives. The present study argues against a semantic account for this behavior (as advanced by e.g. Kratzer 1995, Chierchia 1995) and proposes a genuinely pragmatic explanation of the observed stage level/individual level contrast instead. The proposal is spelled out using Blutner's (1998, 2000) optimality theoretic version of the Gricean maxims. Building on the observation that the respective locatives are not event-related but frame-setting modifiers, the preference for main predicates that express temporary properties is explained as a side-effect of “synchronizing” the main predicate with the locative frame in the course of finding an optimal interpretation. By emphasizing the division of labor between grammar and pragmatics, the proposed solution takes a considerable load off semantics.

A discourse-based account of Spanish ser/estar (2005)

Maienborn, Claudia

The study offers a discourse-based account of the Spanish copula forms ser and estar, which are generally considered to be lexical exponents of the stage-level/individual-level contrast. It argues against the popular view that the distinction between SLPs and ILPs rests on a fundamental cognitive division of the world that is reflected in the grammar. As it happens, conceptual oppositions like “temporary vs. permanent” or “arbitrary vs. essential” provide only a preference for the interpretation of estar and ser. In addition, the evidence for an SLP/ILP impact on the grammar turns out to be far less conclusive than is currently assumed. The study argues against event-based accounts of the ser/estar contrast in particular, showing that ser and estar pattern alike in failing all of the standard eventuality tests. The discourse-based account proposed instead assumes that ser and estar both display the same lexical semantics (which is identical to the semantics of English be, German sein, etc.); estar differs from ser only in presupposing a relation to a specific discourse situation. By using estar a speaker restricts his or her claim to a specific discourse situation, whereas by using ser, the speaker makes no such restriction. The preference for interpreting estar predications as denoting temporary properties and ser predications as denoting permanent properties follows from economy principles driving the pragmatic legitimation of estar's discourse dependence. The analysis proposed in this paper can also account for the observation that ser predications do not give rise to thetic judgements. The proposal is couched in terms of the framework of DRT.

Proceedings of the LREC workshop on partial parsing : between chunk parsing and deep parsing (2008)

Kübler, Sandra ; Piskorski, Jakub ; Przepiorkowski, Adam

Why is German dependency parsing more reliable than constituent parsing? (2006)

Kübler, Sandra ; Prokic, Jelena

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, so lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches are especially successful (cf. e.g.). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percent points better than the best constituent analysis including grammatical functions.

What linguists always wanted to know about German and did not know how to estimate (2006)

Hinrichs, Erhard ; Kübler, Sandra

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper 'die tageszeitung' (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Treebank profiling of spoken and written German (2005)

Hinrichs, Erhard ; Kübler, Sandra

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogs, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper 'die tageszeitung' (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Towards case-based parsing : are chunks reliable indicators for syntax trees? (2006)

Kübler, Sandra

This paper presents an approach to the question of whether it is possible to construct a parser based on ideas from case-based reasoning. Such a parser would employ a partial analysis of the input sentence to select a (nearly) complete syntax tree and then adapt this tree to the input sentence. The experiments performed on German data from the TüBa-D/Z treebank and the KaRoPars partial parser show that a wide range of levels of generality can be reached, depending on which types of information are used to determine the similarity between input sentence and training sentences. The results are such that it is possible to construct a case-based parser. The optimal setting out of those presented here needs to be determined empirically.

Towards a dependency-oriented evaluation for partial parsing (2002)

Kübler, Sandra ; Telljohann, Heike

Quantitative evaluation of parsers has traditionally centered around the PARSEVAL measures of crossing brackets, (labeled) precision, and (labeled) recall. However, it is well known that these measures do not give an accurate picture of the quality of the parser's output. Furthermore, we will show that they are especially unsuited for partial parsers. In recent years, research has concentrated on dependency-based evaluation measures. We will show in this paper that such a dependency-based evaluation scheme is particularly suitable for partial parsers. TüBa-D, the treebank used here for evaluation, contains all the necessary dependency information so that the conversion of trees into a dependency structure does not have to rely on heuristics. Therefore, the dependency representations are not only reliable, they are also linguistically motivated and can be used for linguistic purposes.
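
For readers unfamiliar with the PARSEVAL measures named above, the following minimal sketch computes labeled precision, recall, and F1 over brackets. The `(label, start, end)` bracket encoding and the toy brackets are assumptions made for illustration; crossing brackets are not computed, and this is not the evaluation code used in the paper.

```python
from collections import Counter

def labeled_prf(gold, predicted):
    """Labeled precision, recall, and F1 over (label, start, end) brackets."""
    g, p = Counter(gold), Counter(predicted)
    matched = sum((g & p).values())  # brackets identical in label and span
    precision = matched / sum(p.values()) if p else 0.0
    recall = matched / sum(g.values()) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold analysis of a five-token sentence.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]

# A full parse with one mislabeled bracket.
full = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]

# A partial parse that only commits to the two noun phrases.
partial = [("NP", 0, 2), ("NP", 3, 5)]

full_scores = labeled_prf(gold, full)
partial_scores = labeled_prf(gold, partial)
```

In this toy case the partial parse makes no labeling error, yet its recall is only half that of a complete gold analysis, because every bracket it leaves unattached counts as missing.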

The Tüba-D/Z treebank : annotating German with a context-free backbone (2004)

Telljohann, Heike ; Hinrichs, Erhard ; Kübler, Sandra

The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison between the annotation schemes of the two treebanks focuses on the different treatments of free word order and discontinuous constituents in German as well as on differences in phrase-internal annotation.

The earliest Gullah/AAVE texts : a case of 19th century mesolectal variation (2003)

Troike, Rudolph C.

The earliest known extensive texts in Gullah (and perhaps African American Vernacular English as well) to appear in print were published in The Riverside Magazine for Young People in November, 1868, under the title "Negro Fables" (p. 505-507). These are four animal stories, which the editor of the magazine, Horace Elisha Scudder, described in his column only as having been "taken down from the lips of an old negro, in the vicinity of Charleston" (see Appendix for the editor's comments and the full text of the stories). The Story-Teller was evidently a genuine "man of words" (Abrahams, 1983), a true raconteur who could artistically embellish a simple traditional account (perhaps further embellished by the transcriber) in a variety of ways. That he commanded a certain range of Gullah is evident from particular signature features in the texts, but the absence of other typical Gullah features and the presence of shared Gullah/African American Vernacular English usages, together with the periodic appearance of standard English forms, demonstrate that these texts provide perhaps the earliest actual documentation (apart from early tertiary comments, cited e.g. in Feagin, 1997, p. 128-129) of register variation or style/code-switching among Gullah speakers. ...

The PaGe 2008 shared task on parsing German (2008)

Kübler, Sandra

The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.

The CoNLL 2007 shared task on dependency parsing (2007)

Nivre, Joakim ; Hall, Johan ; Kübler, Sandra ; McDonald, Ryan ; Nilsson, Jens ; Riedel, Sebastian ; Yuret, Deniz

The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

Sometimes less is more : Romanian word sense disambiguation revisited (2007)

Dinu, Georgiana ; Kübler, Sandra

Recent approaches to Word Sense Disambiguation (WSD) generally fall into two classes: (1) information-intensive approaches and (2) information-poor approaches. Our hypothesis is that for memory-based learning (MBL), a reduced amount of data is more beneficial than the full range of features used in the past. Our experiments show that MBL combined with a restricted set of features and a feature selection method that minimizes the feature set leads to competitive results, outperforming all systems that participated in the SENSEVAL-3 competition on the Romanian data. Thus, with this specific method, a tightly controlled feature set improves the accuracy of the classifier, reaching 74.0% in the fine-grained and 78.7% in the coarse-grained evaluation.

Robustes chunkparsing mit variabler Analysetiefe (2000)

Kübler, Sandra ; Hinrichs, Erhard

Chunk parsing offers a particularly promising approach to robust, partial parsing with the goal of broad data coverage. The aim of chunk parsing is a partial, non-recursive syntactic structure. This extremely efficient parsing approach can be realized as a cascade of finite-state transducers. This paper presents TüSBL, a system whose input consists of spontaneous spoken language, which is made available to the parser in the form of a word hypothesis graph from a speech recognizer. Chunk parsing is particularly well suited to such an application because it can robustly handle fragmentary or ill-formed utterances. In addition, a tree construction component is presented that extends the partial chunk structures to complete trees with grammatical functions. The system is evaluated on manually checked system input, since the usual evaluation parameters are not suitable for this purpose.

Recent developments in linguistic annotations of the TüBa-D/Z treebank (2004)

Hinrichs, Erhard ; Kübler, Sandra ; Naumann, Karin ; Telljohann, Heike ; Trushkina, Julia

The purpose of this paper is to describe recent developments in the morphological, syntactic, and semantic annotation of the TüBa-D/Z treebank of German. The TüBa-D/Z annotation scheme is derived from the Verbmobil treebank of spoken German [4, 10], but has been extended along various dimensions to accommodate the characteristics of written texts. TüBa-D/Z uses as its data source the "die tageszeitung" (taz) newspaper corpus. The Verbmobil treebank annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The primary ordering principle of a clause is the inventory of topological fields, which characterize the word order regularities among different clause types of German, and which are widely accepted among descriptive linguists of German [3, 6]. The TüBa-D/Z annotation relies on a context-free backbone (i.e. proper trees without crossing branches) of phrase structure combined with edge labels that specify the grammatical function of the phrase in question. The syntactic annotation scheme of the TüBa-D/Z is described in more detail in [12, 11]. TüBa-D/Z currently comprises approximately 15 000 sentences, with approximately 7 000 sentences being in the correction phase. The latter will be released along with an updated version of the existing treebank before the end of this year. The treebank is available in an XML format, in the NEGRA export format [1] and in the Penn treebank bracketing format. The XML format contains all types of information as described above, the NEGRA export format contains all sentence-internal information while the Penn treebank format includes only those layers of information that can be expressed as pure tree structures. Over the course of the last year, more fine-grained linguistic annotations have been added along the following dimensions: 1. the basic Stuttgart-Tübingen tagset, STTS, [9] labels have been enriched by relevant features of inflectional morphology, 2. named entity information has been encoded as part of the syntactic annotation, and 3. a set of anaphoric and coreference relations has been added to link referentially dependent noun phrases. In the following sections, we will describe each of these innovations in turn and will demonstrate how the additional annotations can be incorporated into one comprehensive annotation scheme.

POS tagging for German : how important is the right context? (2008)

Ivanova, Steliana ; Kübler, Sandra

Part-of-Speech tagging is generally performed by Markov models, based on bigram or trigram models. While Markov models have a strong concentration on the left context of a word, many languages require the inclusion of right context for correct disambiguation. We show for German that the best results are reached by a combination of left and right context. If only left context is available, then changing the direction of analysis and going from right to left improves the results. In a version of MBT (Daelemans et al., 1996) with default parameter settings, the inclusion of the right context improved POS tagging accuracy from 94.00% to 96.08%, thus corroborating our hypothesis. The version with optimized parameters reaches 96.73%.

Parsing without grammar - using complete trees instead (2003)

Kübler, Sandra

The definition of similarity between sentences is formulated on the levels of words, POS tags, and chunks (Abney 91; Abney 96). The evaluation of this approach shows that while precision and recall based on the PARSEVAL measures (Black et al. 91) do not reach state-of-the-art parsers yet (F1=87.19 on syntactic constituents, F1=77.78 including function-argument structure), the parser shows a very reliable performance where function-argument structure is concerned (F1=96.52). The lower F-scores are very often due to unattached constituents.

Memory-based vocalization of Arabic (2008)

Kübler, Sandra ; Mohamed, Emad

The problem of vocalization, or diacritization, is essential to many tasks in Arabic NLP. Arabic is generally written without the short vowels, which leads to one written form having several pronunciations with each pronunciation carrying its own meaning(s). In the experiments reported here, we define vocalization as a classification problem in which we decide for each character in the unvocalized word whether it is followed by a short vowel. We investigate the importance of different types of context. Our results show that the combination of using memory-based learning with only a word internal context leads to a word error rate of 6.64%. If a lexical context is added, the results deteriorate slowly.
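The classification view described above can be sketched as follows: each character of the unvocalized word becomes one instance whose features are a window of neighbouring characters inside the word and whose class is the short vowel that follows it, or "-" for none. This is a rough illustration under stated assumptions, not the authors' feature set: it uses a simplified Latin transliteration in which lowercase a, i and u stand for the short vowels, and the function name `make_instances`, the padding symbol "_" and the window size are hypothetical.

```python
SHORT_VOWELS = set("aiu")  # assumption: simplified transliteration of short vowels


def make_instances(vocalized, window=2):
    """Turn one vocalized word into (features, label) pairs.

    features: the focus character plus `window` characters on each side,
              padded with "_" at the word boundaries (word-internal context only).
    label:    the short vowel following the focus character, or "-" if none.
    """
    letters, labels = [], []
    for ch in vocalized:
        if ch in SHORT_VOWELS and letters and labels[-1] == "-":
            # A short vowel is not written; it becomes the class of the
            # preceding character.
            labels[-1] = ch
        else:
            letters.append(ch)
            labels.append("-")
    padded = ["_"] * window + letters + ["_"] * window
    return [
        (padded[i:i + 2 * window + 1], labels[i])
        for i in range(len(letters))
    ]


# Transliterated toy word "kutub": written form k-t-b, classes u, u, none.
for features, label in make_instances("kutub"):
    print(features, "->", label)
```

Instances of this shape could then be passed to any classifier; the abstract reports memory-based learning, which is not reproduced here.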

Maschineller Erwerb von Wortklassifikationsregeln (1995)

Kübler, Sandra

This thesis first gives a brief overview of the fields of word classification (part-of-speech tagging) and machine learning (Ch. 1). It then presents the approach of transformation-based error-driven tagging by Brill (1992, 1993, 1994) and adapts it for use with German corpora (Ch. 2). This is a rule-based system in which, in contrast to previously existing systems, the rules are not worked out manually and supplied to the system; rather, the system acquires the rules itself, on the basis of a few rule templates, from a small, already tagged training corpus. Chapter 3 presents the results of applying the system to parts of a German corpus. Finally, Chapter 4 presents other tagging systems and compares them with the system of Brill (1993) on the basis of eight criteria.

Learning a lexicalized grammar for German (1998)

Kübler, Sandra

In syntax, the trend nowadays is towards lexicalized grammar formalisms. It is now widely accepted that dividing words into word classes may serve as a labor-saving mechanism, but at the same time it discards all detailed information on the idiosyncratic behavior of words. And that is exactly the type of information that may be necessary in order to parse a sentence. For learning approaches, however, lexicalized grammars represent a challenge for the very reason that they include so much detailed and specific information, which is difficult to learn. This paper will present an algorithm for learning a link grammar of German. The problem of data sparseness is tackled by using all the available information from partial parses as well as from an existing grammar fragment and a tagger. This is a report about work in progress, so there are no representative results available yet.

Machine learning approaches in computational linguistics : introduction (2002)

Hinrichs, Erhard ; Kübler, Sandra

Is it really that difficult to parse German? (2006)

Kübler, Sandra ; Hinrichs, Erhard ; Maier, Wolfgang

This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show that there is a big difference in parsing performance between models trained on the Negra treebank and models trained on the TüBa-D/Z treebank. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West-Germanic neighbor language English.

How to compare treebanks (2008)

Kübler, Sandra ; Maier, Wolfgang ; Rehbein, Ines ; Versley, Yannick

Recent years have seen an increasing interest in developing standards for linguistic annotation, with a focus on the interoperability of the resources. This effort, however, requires a profound knowledge of the advantages and disadvantages of linguistic annotation schemes in order to avoid importing the flaws and weaknesses of existing encoding schemes into the new standards. This paper addresses the question of how to compare syntactically annotated corpora and gain insights into the usefulness of specific design decisions. We present an exhaustive evaluation of two German treebanks with crucially different encoding schemes. We evaluate three different parsers trained on the two treebanks and compare results using EVALB, the Leaf-Ancestor metric, and a dependency-based evaluation. Furthermore, we present TePaCoC, a new testsuite for the evaluation of parsers on complex German grammatical constructions. The testsuite provides a well thought-out error classification, which enables us to compare parser output for parsers trained on treebanks with different encoding schemes and provides interesting insights into the impact of treebank annotation schemes on specific constructions like PP attachment or non-constituent coordination.

How do treebank annotation schemes influence parsing results? : or how not to compare apples and oranges (2005)

Kübler, Sandra

In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The fact that most parsers are solely evaluated on this specific data set leaves the question unanswered how much these results depend on the annotation scheme of the treebank. In this paper, we will investigate the influence which different decisions in the annotation schemes of treebanks have on parsing. The investigation uses the comparison of similar treebanks of German, NEGRA and TüBa-D/Z, which are subsequently modified to allow a comparison of the differences. The results show that deleted unary nodes and a flat phrase structure have a negative influence on parsing quality while a flat clause structure has a positive influence.

From phrase structure to dependencies, and back (2004)

Ule, Tylman ; Kübler, Sandra

Transforming constituent-based annotation into dependency-based annotation has been shown to work for different treebanks and annotation schemes (e.g. Lin (1995) has transformed the Penn treebank, and Kübler and Telljohann (2002) the Tübinger Baumbank des Deutschen (TüBa-D/Z)). These ventures are usually triggered by the conflict between theory-neutral annotation, that targets most needs of a wider audience, and theory-specific annotation, that provides more fine-grained information for a smaller audience. As a compromise, it has been pointed out that treebanks can be designed to support more than one theory from the start (Nivre, 2003). We argue that information can also be added to an existing annotation scheme so that it supports additional theory-specific annotations. We also argue that such a transformation is useful for improving and extending the original annotation scheme with respect to both ambiguous annotation and annotation errors. We show this by analysing problems that arise when generating dependency information from the constituent-based TüBa-D/Z.

From chunks to function-argument structure : a similarity-based approach (2001)

Kübler, Sandra ; Hinrichs, Erhard

Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning function-argument structure. The present paper offers a similarity-based algorithm for assigning functional labels such as subject, object, head, complement, etc. to complete syntactic structures on the basis of prechunked input. The evaluation of the algorithm has concentrated on measuring the quality of functional labels. It was performed on a German and an English treebank using two different annotation schemes at the level of function-argument structure. The results of 89.73% correct functional labels for German and 90.40% for English validate the general approach.

Evaluating POS tagging under sub-optimal conditions : or: does meticulousness pay? (2000)

Kübler, Sandra ; Wagner, Andreas

In this paper, we investigate the role of sub-optimality in training data for part-of-speech tagging. In particular, we examine to what extent the size of the training corpus and certain types of errors in it affect the performance of the tagger. We distinguish four types of errors: If a word is assigned a wrong tag, this tag can belong to the ambiguity class of the word (i.e. to the set of possible tags for that word) or not; furthermore, the major syntactic category (e.g. "N" or "V") can be correctly assigned (e.g. if a finite verb is classified as an infinitive) or not (e.g. if a verb is classified as a noun). We empirically explore the decrease of performance that each of these error types causes for different sizes of the training set. Our results show that those types of errors that are easier to eliminate have a particularly negative effect on the performance. Thus, it is worthwhile concentrating on the elimination of these types of errors, especially if the training corpus is large.

Combining dependency parsing with PP attachment (2007)

Kübler, Sandra ; Ivanova, Steliana ; Klett, Eva

Prepositional phrase (PP) attachment is one of the major sources of errors in traditional statistical parsers. The reason for that lies in the type of information necessary for resolving structural ambiguities. For parsing, it is assumed that distributional information of parts-of-speech and phrases is sufficient for disambiguation. For PP attachment, in contrast, lexical information is needed. The problem of PP attachment has sparked much interest ever since Hindle and Rooth (1993) formulated the problem in a way that can be easily handled by machine learning approaches: In their approach, PP attachment is reduced to the decision between noun and verb attachment; and the relevant information is reduced to the two possible attachment sites (the noun and the verb) and the preposition of the PP. Brill and Resnik (1994) extended the feature set to the now standard 4-tuple also containing the noun inside the PP. Among many publications on the problem of PP attachment, Volk (2001; 2002) describes the only system for German. He uses a combination of supervised and unsupervised methods. The supervised method is based on the back-off model by Collins and Brooks (1995), the unsupervised part consists of heuristics such as "If there is a support verb construction present, choose verb attachment". Volk trains his back-off model on the Negra treebank (Skut et al., 1998) and extracts frequencies for the heuristics from the "Computerzeitung". The latter also serves as test data set. Consequently, it is difficult to compare Volk’s results to other results for German, including the results presented here, since he not only uses a combination of supervised and unsupervised learning, but also performs domain adaptation. Most of the researchers working on PP attachment seem to be satisfied with a PP attachment system; we have found hardly any work on integrating the results of such approaches into actual parsers. The only exceptions are Mehl et al. (1998) and Foth and Menzel (2006), both working with German data. Mehl et al. report a slight improvement of PP attachment from 475 correct PPs out of 681 PPs for the original parser to 481 PPs. Foth and Menzel report an improvement of overall accuracy from 90.7% to 92.2%. Both integrate statistical attachment preferences into a parser. First, we will investigate whether dependency parsing, which generally uses lexical information, shows the same performance on PP attachment as an independent PP attachment classifier does. Then we will investigate an approach that allows the integration of PP attachment information into the output of a parser without having to modify the parser: The results of an independent PP attachment classifier are integrated into the parse of a dependency parser for German in a postprocessing step.
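For readers unfamiliar with the 4-tuple formulation, the following is a minimal sketch of a back-off classifier in the spirit of Collins and Brooks (1995), operating on (verb, noun, preposition, noun-in-PP) tuples. It is one possible reading of that model, not the implementation used by Volk or by the authors of this paper. The class name `BackoffPPAttacher`, the labels "N" and "V", and the English toy examples are all invented for illustration.

```python
from collections import Counter


class BackoffPPAttacher:
    """Back-off estimate of noun vs. verb attachment over (v, n1, p, n2)."""

    def __init__(self):
        self.total = Counter()  # how often a sub-tuple was seen
        self.noun = Counter()   # how often it was seen with noun attachment

    @staticmethod
    def _levels(v, n1, p, n2):
        # From most to least specific; every sub-tuple contains the preposition.
        return [
            [("v,n1,p,n2", v, n1, p, n2)],
            [("v,n1,p", v, n1, p), ("v,p,n2", v, p, n2), ("n1,p,n2", n1, p, n2)],
            [("v,p", v, p), ("n1,p", n1, p), ("p,n2", p, n2)],
            [("p", p)],
        ]

    def train(self, examples):
        for v, n1, p, n2, attach in examples:
            for level in self._levels(v, n1, p, n2):
                for key in level:
                    self.total[key] += 1
                    if attach == "N":
                        self.noun[key] += 1

    def predict(self, v, n1, p, n2):
        # Use the most specific level at which any sub-tuple has been observed.
        for level in self._levels(v, n1, p, n2):
            seen = sum(self.total[key] for key in level)
            if seen > 0:
                noun = sum(self.noun[key] for key in level)
                return "N" if noun / seen >= 0.5 else "V"
        return "N"  # nothing observed: default to noun attachment


# Invented toy training data, for illustration only.
model = BackoffPPAttacher()
model.train([
    ("eat", "pizza", "with", "fork", "V"),
    ("eat", "pizza", "with", "cheese", "N"),
    ("see", "man", "with", "telescope", "V"),
])
print(model.predict("eat", "salad", "with", "fork"))    # backs off to triples
print(model.predict("buy", "pizza", "with", "cheese"))  # backs off to triples
```

Because the classifier only needs the four head words, its decisions can be computed independently of a parser and applied to the parser's output afterwards, which is the kind of postprocessing integration the abstract describes.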

Braucht Nominalphrasenerkennung linguistisches Wissen? (2001)

Kübler, Sandra

Machine learning is frequently used for the efficient annotation of large amounts of data. Research on machine learning methods is generally limited to comparing different learning methods or determining the optimal size of the training data. So far, however, it has not been investigated to what extent linguistic knowledge can have a positive effect when it is used in defining the task. This is investigated here for the learning of base noun phrases with three different definitions. The definitions differ in the degree of linguistically motivated extensions that were added to a first, more practically motivated definition. The investigations showed that the number of misclassified words can be reduced by one third.

Annotation compatibility working group report (2006)

This report explores the question of compatibility between annotation projects including translating annotation formalisms to each other or to common forms. Compatibility issues are crucial for systems that use the results of multiple annotation projects. We hope that this report will begin a concerted effort in the field to track the compatibility of annotation schemes for part of speech tagging, time annotation, treebanking, role labeling and other phenomena.

A unified representation for morphological, syntactic, semantic, and referential annotations (2004)

Hinrichs, Erhard ; Kübler, Sandra ; Naumann, Karin

This paper reports on the SYN-RA (SYNtax-based Reference Annotation) project, an on-going project of annotating German newspaper texts with referential relations. The project has developed an inventory of anaphoric and coreference relations for German in the context of a unified, XML-based annotation scheme for combining morphological, syntactic, semantic, and anaphoric information. The paper discusses how this unified annotation scheme relates to other formats currently discussed in the literature, in particular the annotation graph model of Bird and Liberman (2001) and the pie-in-the-sky scheme for semantic annotation.

TüSBL : a similarity-based chunk parser for robust syntactic processing (2001)

Kübler, Sandra ; Hinrichs, Erhard

Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. The TüSBL parser extends current chunk parsing techniques by a tree-construction component that extends partial chunk parses to complete tree structures including recursive phrase structure as well as function-argument structure. TüSBL's tree construction algorithm relies on techniques from memory-based learning that allow similarity-based classification of a given input structure relative to a pre-stored set of tree instances from a fully annotated treebank. A quantitative evaluation of TüSBL has been conducted using a semi-automatically constructed treebank of German that consists of approximately 67,000 fully annotated sentences. The basic PARSEVAL measures were used although they were developed for parsers that have as their main goal a complete analysis that spans the entire input. This runs counter to the basic philosophy underlying TüSBL, which has as its main goal robustness of partially analyzed structures.

A hybrid architecture for robust parsing of german (2002)

Hinrichs, Erhard ; Kübler, Sandra ; Müller, Frank H. ; Ule, Tylman

This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in an increased accuracy of annotation.

Local tree description grammars (1997)

Kallmeyer, Laura

A lot of attention has recently been paid to constraint-based definitions and extensions of Tree Adjoining Grammars (TAG). Examples are the so-called quasi-trees, D-Tree Grammars and Tree Description Grammars (TDGs). The latter are grammars consisting of a set of formulas denoting trees. TDGs are derivation-based: in each derivation step, a conjunction is built of the old formula, a formula of the grammar and additional equivalences between node names of the two formulas. This formalism is more powerful than TAGs. TDGs offer the advantages of MC-TAG and D-Tree Grammars for natural languages and they allow underspecification. However, the problem is that TDGs might be unnecessarily powerful for natural languages. To solve this problem, in this paper, I will propose local TDGs, a restricted version of TDGs. Local TDGs still have the advantages of TDGs but they are semilinear and therefore more appropriate for natural languages. First, the notion of semilinearity is defined. Then local TDGs are introduced, and, finally, semilinearity of local Tree Description Languages is proven.

Factoring predicate argument and scope semantics : underspecified semantics with LTAG (1999)

Kallmeyer, Laura ; Joshi, Aravind K.

This paper proposes a compositional semantics for lexicalized tree adjoining grammars (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate argument structure and a second component contributing to scope semantics. Based on this idea, a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.

A hierarchy of local TDGs (1998)

Kallmeyer, Laura

Many recent variants of Tree Adjoining Grammars (TAG) allow an underspecification of the parent relation between nodes in a tree, i.e. they do not deal with fully specified trees as is the case with TAGs. Such TAG variants are for example Description Tree Grammars (DTG), Unordered Vector Grammars with Dominance Links (UVG-DL), a definition of TAGs via so-called quasi trees and Tree Description Grammars (TDG). The last TAG variant, local TDG, is an extension of TAG generating Tree Descriptions. Local TDGs even allow an underspecification of the dominance relation between node names and thereby provide the possibility to generate underspecified representations for structural ambiguities such as quantifier scope ambiguities. This abstract deals with formal properties of local TDGs. A hierarchy of local TDGs is established together with a pumping lemma for local TDGs of a certain rank.

Tree-local MCTAG with shared nodes : an analysis of word order variation in German and Korean (2004)

Kallmeyer, Laura ; Yoon, SinWon

Tree Adjoining Grammars (TAG) are known not to be powerful enough to deal with scrambling in free word order languages. The TAG-variants proposed so far in order to account for scrambling are not entirely satisfying. Therefore, an alternative extension of TAG is introduced based on the notion of node sharing. Considering data from German and Korean, it is shown that this TAG-extension can adequately analyse scrambling data, also in combination with extraposition and topicalization.

TuLiPA : towards a multi-formalism parsing environment for grammar engineering (2008)

Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang ; Parmentier, Yannick ; Dellert, Johannes ; Evang, Kilian

In this paper, we present an open-source parsing environment (Tübingen Linguistic Parsing Architecture, TuLiPA) which uses Range Concatenation Grammar (RCG) as a pivot formalism, thus opening the way to the parsing of several mildly context-sensitive formalisms. This environment currently supports tree-based grammars (namely Tree-Adjoining Grammars (TAG) and Multi-Component Tree-Adjoining Grammars with Tree Tuples (TT-MCTAG)) and allows computation not only of syntactic structures, but also of the corresponding semantic representations. It is used for the development of a tree-based grammar for German.

The TUSNELDA annotation standard : an XML encoding standard for multilingual corpora supporting various aspects of linguistic research (2000)

Kallmeyer, Laura ; Wagner, Andreas

This paper proposes a corpus encoding standard that meets the needs of linguistic research using a variety of linguistic data structures. The standard was developed in SFB 441, a research project at the University of Tuebingen. The principal concern of SFB 441 is the empirical data structures which feed into linguistic theory building. SFB 441 consists of several projects, most of which are building corpora to empirically investigate various linguistic phenomena in various languages (e.g. modal verbs in German, forms of address and politeness in Russian). These corpora will form the components of the "Tuebingen collection of reusable, empirical, linguistic data structures (TUSNELDA)". The TUSNELDA annotation standard aims at providing a uniform encoding scheme for all subcorpora and texts of TUSNELDA such that they can be processed with uniform standardized tools. To guarantee maximal reusability we use XML for encoding. Previous SGML standards for text encoding were provided by the Text Encoding Initiative (TEI) and the Expert Advisory Group on Language Engineering Standards (Corpus Encoding Standard, CES). The TUSNELDA standard is based on TEI and XCES (XML version of CES) but takes into account the specific needs of the SFB projects, i.e. the peculiarities of the examined languages and linguistic phenomena.

Semantic construction in feature-based TAG (2003)

Gardent, Claire ; Kallmeyer, Laura

We propose a semantic construction method for Feature-Based Tree Adjoining Grammar which is based on the derived tree, compare it with related proposals and briefly discuss some implementation possibilities.

Scrambling in german and the non-locality of local TDGs (2000)

Kallmeyer, Laura

Existing analyses of German scrambling phenomena within TAG-related formalisms all use non-local variants of TAG. However, there are good reasons to prefer local grammars, in particular with respect to the use of the derivation structure for semantics. Therefore, this paper proposes to use local TDGs, a TAG variant that generates tree descriptions and has a local derivation structure. However, the construction of minimal trees for the derived tree descriptions is not subject to any locality constraint. This provides just the amount of non-locality needed for an adequate analysis of scrambling. To illustrate this, a local TDG for some German scrambling data is presented.

Scope and situation binding in LTAG using semantic unification (2005)

Romero, Maribel ; Kallmeyer, Laura

This paper develops a framework for TAG (Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches. Then, within this framework, an analysis of scope is proposed that accounts for the different scopal properties of quantifiers, adverbs, raising verbs and attitude verbs. Finally, including situation variables in the semantics, different situation binding possibilities are derived for different types of quantificational elements.

Reflexives and reciprocals in LTAG (2007)

Kallmeyer, Laura ; Romero, Maribel

This paper presents an LTAG analysis of reflexives like himself and reciprocals like each other. These items need to find a c-commanding antecedent from which they retrieve (part of) their own denotation and with which they syntactically agree. The relation between anaphoric item and antecedent must satisfy the following important locality conditions (Chomsky (1981)).

Quantifier scope in German : an MCTAG analysis (2006)

Kallmeyer, Laura ; Romero, Maribel

Relative quantifier scope in German depends, in contrast to English, very much on word order. The scope possibilities of a quantifier are determined by its surface position, its base position and the type of the quantifier. In this paper we propose a multicomponent analysis for German quantifiers computing the scope of the quantifier, in particular its minimal nuclear scope, depending on the syntactic configuration it occurs in.

On the relation between multicomponent tree adjoining grammars with tree tuples (TT-MCTAG) and range concatenation grammars (RCG) (2008)

Kallmeyer, Laura ; Parmentier, Yannick

This paper investigates the relation between TT-MCTAG, a formalism used in computational linguistics, and RCG. RCGs are known to describe exactly the class PTIME; simple RCGs have even been shown to be equivalent to linear context-free rewriting systems, i.e., to be mildly context-sensitive. TT-MCTAG has been proposed to model free word order languages. In general, it is NP-complete. In this paper, we will put an additional limitation on the derivations licensed in TT-MCTAG. We show that TT-MCTAG with this additional limitation can be transformed into equivalent simple RCGs. This result is interesting for theoretical reasons (since it shows that TT-MCTAG in this limited form is mildly context-sensitive) and also for practical reasons: we use the proposed transformation from TT-MCTAG to RCG in an actual parser that we have implemented.

LTAG semantics with semantic unification (2004)

Kallmeyer, Laura ; Romero, Maribel

This paper sets up a framework for LTAG (Lexicalized Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches addressing some shortcomings of TAG semantics based on the derivation tree. Within this framework, several sample analyses are proposed, and it is shown that the framework allows the analysis of data that have been claimed to be problematic for derivation tree based LTAG semantics approaches.

LTAG semantics for questions (2004)

Romero, Maribel ; Kallmeyer, Laura ; Babko-Malaya, Olga

This paper presents a compositional semantic analysis of interrogative clauses in LTAG (Lexicalized Tree Adjoining Grammar) that captures the scopal properties of wh- and non-wh quantificational elements. It is shown that the present approach derives the correct semantics for examples claimed to be problematic for LTAG semantic approaches based on the derivation tree. The paper further provides an LTAG semantics for embedded interrogatives.

LTAG analysis for pied-piping and stranding of wh-phrases (2004)

Kallmeyer, Laura ; Scheffler, Tatjana

In this paper we propose a syntactic and semantic analysis of complex questions. We consider questions involving pied-piping and stranding and we propose elementary trees and semantic representations that allow us to account for both constructions in a uniform way.

Licensing German negative polarity items in LTAG (2006)

Lichte, Timm ; Kallmeyer, Laura

Our paper aims at capturing the distribution of negative polarity items (NPIs) within lexicalized Tree Adjoining Grammar (LTAG). The condition under which an NPI can occur in a sentence is for it to be in the scope of a negation with no quantifiers scopally intervening. We model this restriction within a recent framework for LTAG semantics based on semantic unification. The proposed analysis provides features that signal the presence of a negation in the semantics and that specify its scope. We extend our analysis to modelling the interaction of NPI licensing and neg raising constructions.

Flexible composition in LTAG : quantifier scope and inverse linking (2003)

Joshi, Aravind K. ; Kallmeyer, Laura ; Romero, Maribel

This paper addresses the problem of constraints for relative quantifier scope, in particular in inverse linking readings where certain scope orders are excluded. We show how to account for such restrictions in the Tree Adjoining Grammar (TAG) framework by adopting a notion of flexible composition. In the semantics we use for TAG we introduce quantifier sets that group quantifiers that are "glued" together in the sense that no other quantifier can scopally intervene between them. The flexible composition approach allows us to obtain the desired quantifier sets and thereby the desired constraints for quantifier scope.

Feature logic-based semantic composition : a comparison between LRS and LTAG (2007)

Richter, Frank ; Kallmeyer, Laura

In this paper we will explore the similarities and differences between two feature logic-based approaches to the composition of semantic representations. The first approach is formulated for Lexicalized Tree Adjoining Grammar (LTAG, Joshi and Schabes 1997), the second is Lexical Resource Semantics (LRS, Richter and Sailer 2004) and was first defined in Head-driven Phrase Structure Grammar. The two frameworks have several common characteristics that make them easy to compare: 1. They use languages of two-sorted type theory for semantic representations. 2. They allow underspecification. LTAG uses scope constraints while LRS provides component-of constraints. 3. They use feature logics for computing semantic representations. 4. They are designed for computational applications. By comparing the two frameworks we will also point out some characteristics and advantages of feature logic-based semantic computation in general.

Factorizing complementation in a TT-MCTAG for German (2008)

Lichte, Timm ; Kallmeyer, Laura

TT-MCTAG lets one abstract away from the relative order of co-complements in the final derived tree, which is more appropriate than classic TAG when dealing with flexible word order in German. In this paper, we present the analyses for sentential complements, i.e., wh-extraction, that-complementation and bridging, and we work out the crucial differences between these and respective accounts in XTAG (for English) and V-TAG (for German).

Factoring Predicate Argument and Scope Semantics : underspecified Semantics with LTAG (2003)

Kallmeyer, Laura ; Joshi, Aravind K.

In this paper we propose a compositional semantics for lexicalized tree-adjoining grammar (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate argument structure and a second component contributing to scope semantics. Based on this idea a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure (and indirectly the locality of derivations) allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.

XMG : eXtending MetaGrammars to MCTAG (2007)

Parmentier, Yannick ; Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang

In this paper, we introduce an extension of the XMG system (eXtensible MetaGrammar) in order to allow for the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it possible to extend the system relatively easily to different target formalisms, thus opening the way towards multi-formalism.

Developing a TT-MCTAG for German with an RCG-based parser (2008)

Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang ; Parmentier, Yannick ; Dellert, Johannes

Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (amongst others) redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. This framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to check, respectively, the consistency and the correctness of the grammar. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.

Der TUSNELDA-Standard : ein Korpusannotierungsstandard zur Unterstützung linguistischer Forschung (2001)

Wagner, Andreas ; Kallmeyer, Laura

The use of standards for annotating larger collections of electronic texts (corpora) is a prerequisite for any reuse of these corpora. This article presents a corpus annotation standard that takes into account the requirements of investigating a wide variety of linguistic phenomena. The standard was developed in SFB 441 at the University of Tübingen. It builds on existing standards, in particular CES and TEI, which prove to be partly too detailed and insufficiently restrictive, and partly not expressive enough, to meet the needs of corpus-based linguistic research.

Convertir des grammaires d’arbres adjoints à composantes multiples avec tuples d’arbres (TT-MCTAG) en grammaires à concaténation d’intervalles (RCG) (2008)

Kallmeyer, Laura ; Parmentier, Yannick

This article studies the relation between multicomponent tree adjoining grammars with tree tuples (TT-MCTAG), a formalism used in computational linguistics, and range concatenation grammars (RCG). RCGs are known to describe exactly the class PTIME; moreover, "simple" RCGs have been shown to be equivalent to linear context-free rewriting systems (LCFRS), in other words, they are mildly context-sensitive. TT-MCTAG has been proposed to model free word order languages. In general, these languages are NP-complete. In this article, we define an additional constraint on the derivations licensed by the TT-MCTAG formalism. We then show how this restricted form of TT-MCTAG can be converted into an equivalent simple RCG. The result is interesting for theoretical reasons (since it shows that the restricted form of TT-MCTAG is mildly context-sensitive), but also for practical reasons (the transformation proposed here has been used to implement a parser for TT-MCTAG).

Constraint-based computational semantics : a comparison between LTAG and LRS (2006)

Kallmeyer, Laura ; Richter, Frank

This paper compares two approaches to computational semantics, namely semantic unification in Lexicalized Tree Adjoining Grammars (LTAG) and Lexical Resource Semantics (LRS) in HPSG. There are striking similarities between the frameworks that make them comparable in many respects. We will exemplify the differences and similarities by looking at several phenomena. We will show, first of all, that many intuitions about the mechanisms of semantic computations can be implemented in similar ways in both frameworks. Secondly, we will identify some aspects in which the frameworks intrinsically differ due to more general differences between the approaches to formal grammar adopted by LTAG and HPSG.

Comparing lexicalized grammar formalisms in an empirically adequate way : the notion of generative attachment capacity (2006)

Kallmeyer, Laura

The work presented here addresses the question of how to determine whether a grammar formalism is powerful enough to describe natural languages. The expressive power of a formalism can be characterized in terms of i) the string languages it generates (weak generative capacity (WGC)) or ii) the tree languages it generates (strong generative capacity (SGC)). The notion of WGC is not enough to determine whether a formalism is adequate for natural languages. We argue that even SGC is problematic since the set of trees a grammar formalism for natural languages should be able to generate is difficult to determine. The concrete syntactic structures assumed for natural languages depend very much on theoretical stipulations, and empirical evidence for syntactic structures is rather hard to obtain. Therefore, for lexicalized formalisms, we propose to consider the ability to generate certain strings together with specific predicate argument dependencies as a criterion for adequacy for natural languages.

TuLiPA : a syntax-semantics parsing environment for mildly context-sensitive formalisms (2008)

Parmentier, Yannick ; Kallmeyer, Laura ; Maier, Wolfgang ; Lichte, Timm ; Dellert, Johannes

In this paper we present a parsing architecture that allows processing of different mildly context-sensitive formalisms, in particular Tree-Adjoining Grammar (TAG), Multi-Component Tree-Adjoining Grammar with Tree Tuples (TT-MCTAG) and simple Range Concatenation Grammar (RCG). Furthermore, for tree-based grammars, the parser computes not only syntactic analyses but also the corresponding semantic representations.

A descriptive characterization of multicomponent tree adjoining grammars (2005)

Kallmeyer, Laura

Multicomponent Tree Adjoining Grammars (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. Looking only at the result of a derivation (i.e., the derived tree and the derivation tree), this simultaneity is no longer visible and therefore cannot be checked. That is, this way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. Therefore, in this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees the MCTAG licenses.

A declarative characterization of different types of multicomponent tree adjoining grammars (2007)

Kallmeyer, Laura

Multicomponent Tree Adjoining Grammars (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) the MCTAG licenses. This definition gives a better understanding of the formalism, it allows a more systematic comparison of different types of MCTAG, and, furthermore, it can be exploited for parsing.

Linguosomatische Sprachdidaktik : Wissenschaftstheoretische Grundlegung (2006)

Herrmann, Wolfgang

Until well into the 1970s, the theory of language learning and teaching was a "master's doctrine" ("Meisterlehre", Müller-Michaels 1980). Great role models of a people (e.g. Moses), heads of philosophical schools (e.g. Plato) or abbots of monasteries (e.g. Augustine), and finally state-certified senior headteachers (e.g. Ulshöfer) described to their younger colleagues what had proven itself over decades in the teaching of language: how best to give language instruction (Müller 1922, Seidemann 1973, Ulshöfer 1968, Essen 1968). With the establishment of language didactics at the universities, the concept of the "norm-setting action sciences" (Müller-Michaels 1980, Ivo 1975) was developed. The researcher (no longer identified as a master of practice) investigates the processes of language teaching and learning by conducting surveys in the practitioner's "field" and then subjecting the collected data to hypothesis testing. The school in particular is considered as the field of action. The research methods are predominantly "quasi-experimental". Following Chomsky's theory of language (Chomsky 1965), experimental approaches to the study of language acquisition, language acquisition disorders and the corresponding interventions were developed (de Villiers/de Villiers 1970, Hörmann 1978). The site of investigation is the laboratory. The design of this language didactics (or psycholinguistics, cognitive science, etc.) is experimental (e.g. Herrmann 2004). All three concepts are antagonistic to one another in many respects. Keeping them apart, and at the same time relating them to one another profitably, is among the basic skills of the linguosomatic professions and their underlying theory (for example language teaching professions, phoniatrics, special education for speech and language impairments, psychosomatic speech therapies).
The significant contrasts between the three concepts must therefore be worked out and their conflicting consequences related to one another.

Brain electric fields, belief in the paranormal, and reading of emotion words (2003)

Gianotti, Lorena R. R.

The present work reports two experiments on brain electric correlates of cognitive and emotional functions. (1) Studying paranormal belief, 35-channel resting EEG (10 believers and 13 skeptics) was analyzed with "Low Resolution Electromagnetic Tomography" (LORETA) in seven frequency bands. LORETA gravity centers of all bands shifted to the left in believers vs. skeptics, and showed that believers had stronger left fronto-temporo-parietal activity than skeptics. Self-rating of affective attitude showed believers to be less negative than skeptics. The observed EEG lateralization agreed with the ‘valence hypothesis’ that posits predominant left hemispheric processing for positive emotions. (2) Studying emotions, positive and negative emotion words were presented to 21 subjects while "Event-Related Potentials" (ERPs) were recorded. During word presentation (450 ms), 13 microstates (steps of information processing) were identified. Three microstates showed different potential maps for positive vs. negative words; LORETA functional imaging showed stronger activity in microstate #4 (106-122 ms) for positive words right anterior, for negative words left central; in #6 (138-166 ms) for positive words left anterior, for negative words left posterior; in #7 (166-198 ms), for positive words right anterior, for negative words right central. In conclusion: during word processing, the extraction of emotion content starts as early as 106 ms after stimulus onset; the brain identifies emotion content repeatedly in three separate, brief microstate epochs; and this processing of emotion content in the three microstates involves different brain mechanisms to represent the distinction positive vs. negative valence.

Pieces of the be perfect in German and older English (2006)

McFadden, Thomas ; Alexiadou, Artemis

This paper examines the development of periphrastic constructions involving auxiliary "have" and "be" with a past participle in the history of English, on the basis of parsed electronic corpora. It is argued that the two constructions represented distinct syntactic and semantic structures: while the one with have developed into a true perfect in the course of Middle English, the one with be remained a stative resultative throughout its history. In this way, it is explained why the be construction was rarely or never used in a number of contexts, including past counterfactuals, iteratives, duratives, certain kinds of infinitives and various other utterance types that cannot be characterized as perfects of result. When the construction with have became a true perfect, it was used in such contexts, regardless of the identity of the main verb, leading to the appearance of have with verbs like come which had previously only taken be. Crucially, however, have was not spreading at the expense of be, as the be perfect had never been used in such contexts, but rather at the expense of the old simple past. At least until the end of the Early Modern English period, the shift in the relative frequency of have and be perfects is to be explained in terms of the expansion of the former into new contexts, while the latter remained stable. A formal analysis is proposed, taking as its starting point a comparison with German which shows that the older English be perfect indeed behaves more like the German stative passive than its haben and sein perfects.

Perfects, resultatives and auxiliaries in early English (2007)

McFadden, Thomas ; Alexiadou, Artemis

In this paper, we will argue for a novel analysis of the auxiliary alternation in Early English, its development and subsequent loss which has broader consequences for the way that auxiliary selection is looked at cross-linguistically. We will present evidence that the choice of auxiliaries accompanying past participles in Early English differed in several significant respects from that in the familiar modern European languages. Specifically, while the construction with have became a full-fledged perfect by some time in the ME period, that with be was actually a stative resultative, which it remained until it was lost. We will show that this accounts for some otherwise surprising restrictions on the distribution of BE in Early English and allows a better understanding of the spread of HAVE through late ME and EModE. Perhaps more importantly, the Early English facts also provide insight into the genesis of the kind of auxiliary selection found in German, Dutch and Italian. Our analysis of them furthermore suggests a promising strategy for explaining cross-linguistic variation in auxiliary selection in terms of variation in the syntactico-semantic structure of the perfect. In this introductory section, we will first provide some background on the historical situation we will be discussing, then we will lay out the main claims for which we will be arguing in the paper.

Die Tropenlehre (Theorie der Bildersprache) oder gründliche und praktische Anleitung zum schönen und blühenden Style durch Tropen und bildliche Redefiguren : mit beigefügten Muster-Beispielen, in lateinischer und deutscher Sprache, aus den Werken der auserlesensten Schriftsteller der alten und neuen Zeit ; nebst einem mit allerlei Tropen reichlich ausgestatteten Anhange von größeren lateinischen und deutschen Aufsätzen ; für studierende Jünglinge, angehende Prediger alle Freunde der tropischen und lebendigen Ausdrucksweise (1833)

Harnach, Carl Honor

Accommodation theory : communication, context, and consequences (1991)

Giles, Howard ; Coupland, Nikolas ; Coupland, Justine

Uncovering the un-word : a study in lexical pragmatics (2002)

Horn, Laurence R.

In this paper I seek to account for the productive word-formation process resulting in the current proliferation of un-nouns, the semi-legitimate offspring of Humpty Dumpty's un-birthday present (1871) and 7-Up's commercial incarnation as The Un-Cola (1968), a construction that can be linked to the more well-established categories of un-adjectives and un-verbs, whose formation constraints we will also examine. Drawing on a large corpus of novel un-nouns I have assembled in collaboration with Beth Levin, presented in the Appendices to this paper, I will invoke Rosch's prototype semantics and Aristotle's notion of PRIVATIVE opposites, defined in terms of a marked exception to a general class property, to generalize across the different categories of un-words. It will be argued that a given un-noun refers either to an element just outside a given category with whose members it shares a salient function (e.g. un-cola) or to a peripheral member of a given category (an unhotel is a hotel but not a good exemplar of the class, not a HOTEL hotel).

Auxiliary selection and counterfactuality in the history of English and Germanic (2006)

McFadden, Thomas ; Alexiadou, Artemis

The retreat of BE as perfect auxiliary in the history of English is examined. Corpus data are presented showing that the initial advance of HAVE was most closely connected to a restriction against BE in past counterfactuals. Other factors which have been reported to favor the spread of HAVE are either dependent on the counterfactual effect, or significantly weaker in comparison. It is argued that the effect can be traced to the semantics of the BE perfect, which denoted resultativity rather than anteriority proper. Related data from other older Germanic and Romance languages are presented, and finally implications for existing theories of auxiliary selection stemming from the findings presented are discussed.

Counterfactuals and the loss of BE in the history of English (2008)

McFadden, Thomas ; Alexiadou, Artemis

In the course of the ME period, HAVE began to encroach on territory previously held by BE. According to Rydén and Brorström (1987) and Kytö (1997), this occurred especially in iterative and durational contexts, in the perfect infinitive and modal constructions. In Early Modern English (henceforth EModE), BE was increasingly restricted to the most common intransitives come and go, before disappearing entirely in the 18th and 19th centuries. This development raises a number of questions, both historical and theoretical. First, why did HAVE start spreading at the expense of BE in the first place? Second, why was the change conditioned by the factors mentioned by Rydén and Brorström (1987) and Kytö (1997)? Third, why did the change take on the order of 800 years to go to completion? Fourth, what implications does the change have for general theories of auxiliary selection? In this paper we’ll try to answer the first question by focusing on one of the earliest clearly identifiable advances of HAVE onto BE territory: its first appearance with the verb come, which for a number of reasons is an ideal verb to focus on. First, come is by far the most common intransitive verb, so we get large enough numbers for statistical analysis. Second, clauses containing the past participle of come with a form of BE are unambiguous perfects: they cannot be passives, and they did not continue into modern English with a stative reading like he is gone. Third, and perhaps most importantly, come selected BE categorically in the early stages of English, so the first examples we find with HAVE are clear evidence for innovation. We will present evidence from a corpus study showing that the first spread of HAVE was due to a ban on auxiliary BE in certain types of counterfactual perfects, and will propose an account for that ban in terms of Iatridou’s (2000) Exclusion theory of counterfactuals.

Instrument subjects are agents or causers (2006)

Alexiadou, Artemis ; Schäfer, Florian

It has often been noticed that one syntactic argument position can be realized by elements which seem to realize different thematic roles. This is notably the case with the external argument position of verbs of change of state, which licenses volitional agents, instruments or natural forces/causers, showing the generality and abstractness of the external argument relation. (1) a. John broke the window (Agent) b. The hammer broke the window (Instrument) c. The storm broke the window (Causer) In order to capture this generality, Van Valin & Wilkins (1996) and Ramchand (2003) among others have proposed that the thematic role of the external argument position is in fact underspecified. The relevant notion is that of an effector (in Van Valin & Wilkins) or of an abstract causer/initiator (in Ramchand). In this paper we argue against a total underspecification of the external argument relation. While we agree that (1b) does not instantiate an instrument theta role in subject position, we argue that a complete underspecification of the external theta-position is not feasible, but that two types of external theta roles have to be distinguished, Agents and Causers. Our arguments are based on languages where Agents and Causers show morpho-syntactic independence (section 2.1) and the behavior of instrument subjects in English, Dutch, German and Greek (sections 2.2 and 3). We show that instrument subjects are either Agent-like or Causer-like. In section 4 we give an analysis of how arguments realizing these thematic notions are introduced into syntax.

Verbs, nouns and affixation (2008)

Alexiadou, Artemis ; Grimshaw, Jane

What explains the rich patterns of deverbal nominalization? Why do some nouns have argument structure, while others do not? We seek a solution in which properties of deverbal nouns are composed from properties of verbs, properties of nouns, and properties of the morphemes that relate them. The theory of each, plus the theory of how they combine, should give the explanation. In exploring this, we investigate properties of two theories of nominalization. In one, the verb-like properties of deverbal nouns result from verbal syntactic structure (a “structural model”; see, for example, van Hout & Roeper 1998, Fu, Roeper and Borer 1993, 2001, to appear, Alexiadou 2001, to appear). According to the structural hypothesis, some nouns contain VPs and/or verbal functional layers. In the other theory, the verbal properties of deverbal nouns result from the event structure and argument structure of the DPs that they head. By “event structure” we mean a representation of the elements and structure of a linguistic event, not a representation of the world. We refer to this view as the “event model”. According to the event model hypothesis, all derived nouns are represented with the same syntactic structure, the difference lying in argument structure, which in turn is critically related to event structure, in the way sketched in Grimshaw (1990), Siloni (1997) among others. In pursuing these lines of analysis, and at least to some extent disentangling their properties, we reach the conclusion that, with respect to a core set of phenomena, the two theories are remarkably similar: specifically, they achieve success with the same problems, and must resort to the same stipulations to address the remaining issues that we discuss (although the stipulations are couched in different forms).

A note on non-canonical passives : the case of the get-passive (2005)

Alexiadou, Artemis

In many languages, a passive-like meaning may be obtained through a noncanonical passive construction. The get passive (1b) in English, the se faire passive (2b) in French and the kriegen passive (3b) in German represent typical manifestations. This squib focuses on the behavior of the get-passive in English and discusses a number of restrictions associated with it as well as the status of get.

Adjectival modification and multiple determiners (1998)

Alexiadou, Artemis ; Wilder, Chris

The present paper deals with the distribution of the definite determiner and certain related aspects of adjectival modification in Greek DPs. As (1) shows, determiners in Greek DPs precede adjectives and adjectives precede nouns. All three categories overtly agree in gender, number and case.

Clitic-doubling and (non-)configurationality (2000)

Alexiadou, Artemis ; Anagnostopoulou, Elena

In this paper we investigate Greek, an optional clitic doubling language not subject to Kayne's generalization (Jaeggli 1982), and we argue that in this language, doubled DPs are in A-positions. We propose that Greek clitics are formal features that move, permitting DPs in argument positions. This leads to a typology according to which there are two types of clitic/agreement languages, configurational and nonconfigurational ones, depending upon whether clitics are instantiations of formal features or not.

Adjective Syntax and (the absence of) noun raising in the DP (2003)

Alexiadou, Artemis

The paper is structured as follows. Section 2.1 introduces the basic classes of adjectives that constitute the factual core of the paper. Section 2.2 summarizes in greater detail the X° and the XP movement approaches to word order variation within the DP. Section 3 briefly discusses problems for both approaches. Sections 4.1, 5.1, and 5.2 draw from Alexiadou (2001) and contain a discussion of Greek Determiner Spreading (DS) and its relevance for a re-analysis of the word order variation in the Romance DP. Section 4.2 introduces refinements to Alexiadou & Wilder (1998) and Alexiadou (2001). Section 5.3 discusses certain issues that arise from the analysis of postnominal adjectives in Romance as involving raising of XPs. Section 6 discusses phenomena found in other languages which at first sight seem similar to DS. However, I show that double definiteness in e.g. Hebrew, Scandinavian or other Balkan languages constitutes a different type of phenomenon from Greek DS, thus making a distinction between determiners that introduce CPs (Greek) and those that are merely morphological/agreement markers (Hebrew, Scandinavian, Albanian).

Class features as probes (2008)

Alexiadou, Artemis ; Müller, Gereon

In this article, we address (i) the form and (ii) the function of inflection class features in minimalist grammar. The empirical evidence comes from noun inflection systems involving fusional markers in German, Greek, and Russian. As for (i), we argue (based on instances of transparadigmatic syncretism) that class features are not privative; rather, class information must be decomposed into more abstract, binary features. Concerning (ii), we propose that class features qualify as the very device that brings about fusional inflection: They are uninterpretable in syntax and act as probes on stems, with matching inflection markers as goals, and thus trigger morphological Agree operations that merge stem and inflection marker before syntax is reached.

The subject-in-situ generalization revisited (2007)

Alexiadou, Artemis ; Anagnostopoulou, Elena

The goal of this paper is to re-examine the status of the condition in (1) proposed in Alexiadou and Anagnostopoulou (2001; henceforth A&A 2001), in view of recent developments in syntactic theory. (1) The subject-in-situ generalization (SSG): By Spell-Out, vP can contain only one argument with a structural Case feature. We argue that (1) is a more general condition than previously recognized, and that the domain of its application is parametrized. More specifically, based on a comparison between Indo-European (IE) and Khoisan languages, we argue that (1) supports an interpretation of the EPP as a general principle, and not as a property of T. Viewed this way, the SSG is a condition that forces dislocation of arguments as a consequence of a constraint on Case checking.

Plural marking in argument supporting nominalizations (2008)

Alexiadou, Artemis ; Iordăchioaia, Gianina Nicoleta ; Soare, Elena

This paper investigates the conditions under which Argument Supporting Nominalizations (ASNs) can receive plural marking. We use the term ASNs for deverbal nouns that express an event and preserve argument structure. In our discussion we consider ASNs in Romanian, English and German.

PP licensing in nominalizations (2008)

Alexiadou, Artemis ; Anagnostopoulou, Elena ; Schäfer, Florian

In this paper we compare the distribution of PPs introducing external arguments in nominalizations with PPs introducing external arguments in the verbal domain. We show that several mismatches exist between the behavior of PPs in nominalizations and PPs in the verbal domain. This leads us to suggest that while PPs in the verbal domain are licensed by functional structure alone, within the nominal domain, PPs can also be licensed via an interplay of the encyclopaedic meaning of the root involved and the properties of the preposition itself. This second mechanism kicks in in the absence of functional structure.

Structuring participles (2008)

Alexiadou, Artemis ; Anagnostopoulou, Elena

In this paper we discuss three types of adjectival participles in Greek, ending in -tos and -menos, and provide a further argument for the view that finer distinctions are necessary in the domain of participles (Kratzer 2001, Embick 2004). We further compare Greek stative participles to their German (and English) counterparts. We propose that a number of semantic as well as syntactic differences shown by these derive from differences in their respective morpho-syntactic composition.

Agent, causer and instrument PPs in Greek : implications for verbal structure (2008)

Alexiadou, Artemis ; Anagnostopoulou, Elena

In this paper we investigate the distribution of PPs related to external arguments (agent, causer, instrument, causing event) in Greek. We argue that their distribution supports an analysis according to which agentive/instrument PPs and causer PPs are licensed by distinct functional heads. We argue against a conceivable alternative analysis, which links agentivity and causation to the prepositions themselves. We furthermore identify a particular type of Voice head in Greek anticausatives, realised by non-active Voice morphology.

From hierarchies to features : person splits and direct-inverse alternations (2006)

Alexiadou, Artemis ; Anagnostopoulou, Elena

In the recent literature there is growing interest in the morpho-syntactic encoding of hierarchical effects. The paper investigates one domain where such effects are attested: ergative splits conditioned by person. This type of split is then compared to hierarchical effects in direct-inverse alternations. On the basis of two case studies (Lummi, instantiating an ergative split person language, and Passamaquoddy, an inverse language) we offer an account that makes no use of hierarchies as a primitive. We propose that the two language types differ as far as the location of person features is concerned. In inverse systems person features are located exclusively in T, while in ergative systems they are located in T and a particular type of v. A consequence of our analysis is that Case checking in split and inverse systems is guided by the presence/absence of specific phi-features. This in turn provides evidence for a close connection between Case and phi-features, reminiscent of Chomsky’s (2000, 2001) Agree.

On the role of syntactic locality in morphological processes : the case of (Greek) derived nominals (2008)

Alexiadou, Artemis

The paper is structured as follows. In section 2, I briefly summarize the facts on English and Greek nominalizations. In section 3, I discuss English nominal derivation in some detail. In section 4, I turn to the question of licensing of AS in nominals. In section 5, I turn to the issue of the optionality of licensing of AS in the nominal system.

The properties of anticausatives crosslinguistically (2006)

Alexiadou, Artemis ; Anagnostopoulou, Elena ; Schäfer, Florian

The causative/anticausative alternation has been the topic of much typological and theoretical discussion in the linguistic literature. This alternation is characterized by verbs with transitive and intransitive uses, such that the transitive use of a verb V means roughly "cause to V-intransitive" (see Levin 1993). The discussion revolves around two issues: the first one concerns the similarities and differences between the anticausative and the passive, and the second one concerns the derivational relationship, if any, between the transitive and intransitive variant. With respect to the second issue, a number of approaches have been developed. Since the approach according to which each variant is assigned an independent lexical entry has been judged conceptually unsatisfactory, it has been concluded that the two variants have to be derivationally related. The question then is which one of the two is basic and where this derivation takes place in the grammar. Our contribution to this discussion is to argue against derivational approaches to the causative/anticausative alternation. We focus on the distribution of PPs related to external arguments (agent, causer, instrument, causing event) in passives and anticausatives of English, German and Greek and on the set of verbs undergoing the causative/anticausative alternation in these languages. We argue that the crosslinguistic differences in these two domains provide evidence against both causativization and detransitivization analyses of the causative/anticausative alternation. We offer an approach to this alternation which builds on a syntactic decomposition of change of state verbs into a Voice and a CAUS component. Crosslinguistic variation in passives and anticausatives depends on properties of Voice and its combinations with CAUS and various types of roots.

On the distribution of adjectives in Romanian : the cel construction (2008)

Alexiadou, Artemis ; Marchis, Mihaela

This paper deals with the variable position of adjectives in the Romanian DP. Like all other Romance languages, Romanian allows adjectives to appear in both prenominal and postnominal position. In addition, however, Romanian has a third pattern: the so-called cel construction, in which the adjective in postnominal position is preceded by a determiner-like element, cel. This pattern is superficially similar to Determiner Spreading (DS) in Greek. In this paper we contrast the cel construction with Greek DS and discuss the similarities and differences between the two. We then present an analysis of cel as involving an appositive specification clause, building on de Vries (2002). We argue that the same structure is also involved in the context of nominal ellipsis, the second environment in which cel is found.
