OPUS 4 | Linguistik

Modifying (the grammar of) adjuncts : an introduction (2003)

Lang, Ewald ; Maienborn, Claudia ; Fabricius-Hansen, Cathrine

One aspect of the progress being made is that the focus of attention has widened. Adverbials, though still the heart of the matter, now form part of a much larger set of constituent types subsumed under the general syntactic label of adjunct; while modifier has become the semantic counterpart on the same level of generality. So one of the readings of Modifying Adjuncts stands for the focus on this intersection. Moreover, recent years have seen a number of studies which attest an increasing interest in adjunct issues. There is an impressive number of monographs, e.g. Alexiadou (1997), Laenzlinger (1998), Cinque (1999), Pittner (1999), Ernst (2002), which, by presenting in-depth analyses of the syntax of adjuncts, have sharpened the debate on syntactic theorizing. Serious attempts to gain a broader view on adjuncts are witnessed by several collections, see Alexiadou and Svenonius (2000), Austin, Engelberg and Rauh (in progress); of particular importance are the contributions to vol. 12.1 of the Italian Journal of Linguistics (2000), a special issue on adverbs, the Introductions to which by Corver and Delfitto (2000) and Delfitto (2000) may be seen as the best state-of-the-art article on adverbs and adverbial modification currently on the market. To try and test a fresh view on adjuncts was the leitmotif of the Oslo Conference “Approaching the Grammar of Adjuncts” (Sept 22–25, 1999), which provided the initial forum for the papers contained in this volume and initiated a period of discussion and continuing interaction among the contributors, from which the versions published here have greatly profited. The aim of the Oslo conference, and hence the focus of the present volume, was to encourage syntacticians and semanticists to open their minds to a more integrative approach to adjuncts, thereby paying attention to, and attempting to account for, the various interfaces that the grammar of adjuncts crucially embodies. 
From this perspective, the present volume is to be conceived of as an interim balance of current trends in modifying the views on adjuncts. In introducing the papers, we will refrain from rephrasing the abstracts, but will instead offer a guided tour through the major problem areas they are tackling. Assessed by thematic convergence and mutual reference, the contributions form four groups, which led us to arrange them into subparts of the book. Our commenting on these is intended (i) to provide a first glance at the contents, (ii) to reveal some of the reasons why adjuncts indeed are, and certainly will remain, a challenging issue, and thereby (iii) to show some facets of what we consider novel and promising approaches.

Eventualities and different things : a reply (2005)

Maienborn, Claudia

“Comments are very welcome!” This basic attitude and the many ways of implementing it contribute immensely to the fascination of engaging in scientific research. I am grateful to Theoretical Linguistics for providing a public platform for this kind of scholarly exchange and I thank all commentators for their thoughtful, stimulating, and often challenging contributions to my target article. My response will address two main issues that are raised by the commentaries. The first issue is shaped by a cluster of questions relating to ontology. The second issue concerns questions of methodology pertaining in particular to the problem of judging data.

Event-internal modifiers : semantic underspecification and conceptual interpretation (2003)

Maienborn, Claudia

The article offers evidence that there are two variants of adverbial modification that differ with respect to the way in which a modifier is linked to the verb’s eventuality argument. So-called event-external modifiers relate to the full eventuality, whereas event-internal modifiers relate to some integral part of it. The choice between external and internal modification is shown to be dependent on the modifier’s syntactic base position. Event-external modifiers are base-generated at the VP periphery, whereas event-internal modifiers are base-generated at the V periphery. These observations are accounted for by a refined version of the standard Davidsonian approach to adverbial modification according to which modification is mediated by a free variable. In the case of external modification, the grammar takes responsibility for identifying the free variable with the verb’s eventuality argument, whereas in the case of internal modification, a value for the free variable is determined by the conceptual system on the basis of contextually salient world knowledge. For the intriguing problem that certain locative modifiers occasionally seem to have nonlocative (instrumental, positional, or manner) readings, the advocated approach can provide a rather simple solution.

Against a Davidsonian analysis of copula sentences (2003)

Maienborn, Claudia

Semantic research over the past three decades has provided impressive confirmation of Donald Davidson’s famous claim that “there is a lot of language we can make systematic sense of if we suppose events exist” (Davidson 1980:137). Nowadays, Davidsonian event arguments are no longer reserved only for action verbs (as Davidson originally proposed) or even only for the category of verbs, but instead are widely assumed to be associated with any kind of predicate (e.g. Higginbotham 2000, Parsons 2000). The following quotation from Higginbotham and Ramchand (1997) illustrates the reasoning that motivates this move: "Once we assume that predicates (or their verbal, etc. heads) have a position for events, taking the many consequences that stem therefrom, as outlined in publications originating with Donald Davidson (1967), and further applied in Higginbotham (1985, 1989), and Terence Parsons (1990), we are not in a position to deny an event-position to any predicate; for the evidence for, and applications of, the assumption are the same for all predicates. (Higginbotham and Ramchand 1997:54)" In fact, since Davidson’s original proposal the burden of proof for postulating event arguments seems to have shifted completely, leading Raposo and Uriagereka (1995), for example, to the following verdict: "it is unclear what it means for a predicate not to have a Davidsonian argument (Raposo and Uriagereka 1995:182)" That is, Davidsonian eventuality arguments apparently have become something like a trademark for predicates in general. The goal of the present paper is to subject this view of the relationship between predicates and events to real scrutiny. By taking a closer look at the simplest independent predicational structure – viz. copula sentences – I will argue that current Davidsonian approaches tend to stretch the notion of events too far, thereby giving up much of its linguistic and ontological usefulness.
More specifically, the paper will tackle the following three questions: 1. Do copula sentences support the current view of the inherent event-relatedness of predicates? 2. If not, what is a possible alternative to an event-based analysis of copula sentences? 3. What does this tell us about Davidsonian events? The paper is organized as follows: Section 2 first reviews current event-based analyses of copula sentences and then gives a brief summary of the Davidsonian notion of events. Section 3 examines the behavior of copula sentences with respect to some standard (as well as some new) eventuality diagnostics. Copula expressions will turn out to fail all eventuality tests. They differ sharply from state verbs like stand, sit, sleep in this respect. (The latter pass all eventuality tests and therefore qualify as true “Davidsonian state” expressions.) On the basis of these observations, section 4 provides an alternative account of copula sentences that combines Kim’s (1969, 1976) notion of property exemplifications with Asher’s (1993, 2000) conception of abstract objects. Specifically, I will argue that the copula introduces a referential argument for a temporally bound property exemplification (= “Kimian state”). The proposal is implemented within a DRT framework. Finally, section 5 offers some concluding remarks and suggests that supplementing Davidsonian eventualities by Kimian states not only yields a more adequate analysis for copula expressions and the like but may also improve our treatment of events.

A pragmatic explanation of the stage level/individual level contrast in combination with locatives (2004)

Maienborn, Claudia

One important difference between stage level predicates (SLPs) and individual level predicates (ILPs) is their behavior with respect to locative modifiers. It is commonly assumed that SLPs but not ILPs combine with locatives. The present study argues against a semantic account for this behavior (as advanced by e.g. Kratzer 1995, Chierchia 1995) and proposes a genuinely pragmatic explanation of the observed stage level/individual level contrast instead. The proposal is spelled out using Blutner’s (1998, 2000) optimality theoretic version of the Gricean maxims. Building on the observation that the respective locatives are not event-related but frame-setting modifiers, the preference for main predicates that express temporary properties is explained as a side-effect of “synchronizing” the main predicate with the locative frame in the course of finding an optimal interpretation. By emphasizing the division of labor between grammar and pragmatics, the proposed solution takes a considerable load off semantics.

A discourse-based account of Spanish ser/estar (2005)

Maienborn, Claudia

The study offers a discourse-based account of the Spanish copula forms ser and estar, which are generally considered to be lexical exponents of the stage-level/individual-level contrast. It argues against the popular view that the distinction between SLPs and ILPs rests on a fundamental cognitive division of the world that is reflected in the grammar. As it happens, conceptual oppositions like “temporary vs. permanent” or “arbitrary vs. essential” provide only a preference for the interpretation of estar and ser. In addition, the evidence for an SLP/ILP impact on the grammar turns out to be far less conclusive than is currently assumed. The study argues against event-based accounts of the ser/estar contrast in particular, showing that ser and estar pattern alike in failing all of the standard eventuality tests. The discourse-based account proposed instead assumes that ser and estar both display the same lexical semantics (which is identical to the semantics of English be, German sein, etc.); estar differs from ser only in presupposing a relation to a specific discourse situation. By using estar a speaker restricts his or her claim to a specific discourse situation, whereas by using ser, the speaker makes no such restriction. The preference for interpreting estar predications as denoting temporary properties and ser predications as denoting permanent properties follows from economy principles driving the pragmatic legitimation of estar’s discourse dependence. The analysis proposed in this paper can also account for the observation that ser predications do not give rise to thetic judgements. The proposal is couched in terms of the framework of DRT.

Proceedings of the LREC workshop on partial parsing : between chunk parsing and deep parsing (2008)

Kübler, Sandra ; Piskorski, Jakub ; Przepiorkowski, Adam

Why is German dependency parsing more reliable than constituent parsing? (2006)

Kübler, Sandra ; Prokic, Jelena

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, so that lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches in particular are very successful. The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percentage points better than the best constituent analysis including grammatical functions.

What linguists always wanted to know about German and did not know how to estimate (2006)

Hinrichs, Erhard ; Kübler, Sandra

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper "die tageszeitung" (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Treebank profiling of spoken and written German (2005)

Hinrichs, Erhard ; Kübler, Sandra

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogs, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper "die tageszeitung" (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Towards case-based parsing : are chunks reliable indicators for syntax trees? (2006)

Kübler, Sandra

This paper presents an approach to the question of whether it is possible to construct a parser based on ideas from case-based reasoning. Such a parser would employ a partial analysis of the input sentence to select a (nearly) complete syntax tree and then adapt this tree to the input sentence. The experiments performed on German data from the TüBa-D/Z treebank and the KaRoPars partial parser show that a wide range of levels of generality can be reached, depending on which types of information are used to determine the similarity between input sentence and training sentences. The results are such that it is possible to construct a case-based parser. The optimal setting out of those presented here needs to be determined empirically.

Towards a dependency-oriented evaluation for partial parsing (2002)

Kübler, Sandra ; Telljohann, Heike

Quantitative evaluation of parsers has traditionally centered around the PARSEVAL measures of crossing brackets, (labeled) precision, and (labeled) recall. However, it is well known that these measures do not give an accurate picture of the quality of the parser's output. Furthermore, we will show that they are especially unsuited for partial parsers. In recent years, research has concentrated on dependency-based evaluation measures. We will show in this paper that such a dependency-based evaluation scheme is particularly suitable for partial parsers. TüBa-D, the treebank used here for evaluation, contains all the necessary dependency information so that the conversion of trees into a dependency structure does not have to rely on heuristics. Therefore, the dependency representations are not only reliable, they are also linguistically motivated and can be used for linguistic purposes.
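
As a rough illustration of the PARSEVAL measures named in this abstract, the following sketch computes labeled precision, recall, and F1 over sets of labeled brackets. The function name `parseval`, the `(label, start, end)` bracket encoding, and the toy sentence are assumptions made for this example; they are not taken from the paper, and crossing brackets are not covered.

```python
from collections import Counter


def parseval(gold, predicted):
    """Return labeled (precision, recall, F1) for two bracket lists.

    Each bracket is a (label, start, end) tuple; this encoding is an
    assumption of the sketch, not the paper's data format.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # A predicted bracket counts as correct if an identical gold bracket exists.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold tree for a five-token sentence.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
# Hypothetical partial parse: only the two noun chunks were recognized.
partial = [("NP", 0, 2), ("NP", 3, 5)]

precision, recall, f1 = parseval(gold, partial)
```

In this toy case every bracket the partial parse proposes is correct, yet recall is only one half, because the brackets it leaves unattached all count as misses.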

The TüBa-D/Z treebank : annotating German with a context-free backbone (2004)

Telljohann, Heike ; Hinrichs, Erhard ; Kübler, Sandra

The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison between the annotation schemes of the two treebanks focuses on the different treatments of free word order and discontinuous constituents in German as well as on differences in phrase-internal annotation.

The earliest Gullah/AAVE texts : a case of 19th century mesolectal variation (2003)

Troike, Rudolph C.

The earliest known extensive texts in Gullah (and perhaps African American Vernacular English as well) to appear in print were published in The Riverside Magazine for Young People in November, 1868, under the title "Negro Fables" (pp. 505-507). These are four animal stories, which the editor of the magazine, Horace Elisha Scudder, described in his column only as having been "taken down from the lips of an old negro, in the vicinity of Charleston" (see Appendix for the editor's comments and the full text of the stories). The Story-Teller was evidently a genuine "man of words" (Abrahams, 1983), a true raconteur who could artistically embellish a simple traditional account (perhaps further embellished by the transcriber) in a variety of ways. That he commanded a certain range of Gullah is evident from particular signature features in the texts, but the absence of other typical Gullah features and the presence of shared Gullah/African American Vernacular English usages, together with the periodic appearance of standard English forms, demonstrate that these texts provide perhaps the earliest actual documentation (apart from early tertiary comments, cited e.g. in Feagin, 1997, pp. 128-129) of register variation or style/code-switching among Gullah speakers. ...

The PaGe 2008 shared task on parsing German (2008)

Kübler, Sandra

The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.

The CoNLL 2007 shared task on dependency parsing (2007)

Nivre, Joakim ; Hall, Johan ; Kübler, Sandra ; McDonald, Ryan ; Nilsson, Jens ; Riedel, Sebastian ; Yuret, Deniz

The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

Sometimes less is more : Romanian word sense disambiguation revisited (2007)

Dinu, Georgiana ; Kübler, Sandra

Recent approaches to Word Sense Disambiguation (WSD) generally fall into two classes: (1) information-intensive approaches and (2) information-poor approaches. Our hypothesis is that for memory-based learning (MBL), a reduced amount of data is more beneficial than the full range of features used in the past. Our experiments show that MBL combined with a restricted set of features and a feature selection method that minimizes the feature set leads to competitive results, outperforming all systems that participated in the SENSEVAL-3 competition on the Romanian data. Thus, with this specific method, a tightly controlled feature set improves the accuracy of the classifier, reaching 74.0% in the fine-grained and 78.7% in the coarse-grained evaluation.

Recent developments in linguistic annotations of the TüBa-D/Z treebank (2004)

Hinrichs, Erhard ; Kübler, Sandra ; Naumann, Karin ; Telljohann, Heike ; Trushkina, Julia

The purpose of this paper is to describe recent developments in the morphological, syntactic, and semantic annotation of the TüBa-D/Z treebank of German. The TüBa-D/Z annotation scheme is derived from the Verbmobil treebank of spoken German [4, 10], but has been extended along various dimensions to accommodate the characteristics of written texts. TüBa-D/Z uses as its data source the "die tageszeitung" (taz) newspaper corpus. The Verbmobil treebank annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The primary ordering principle of a clause is the inventory of topological fields, which characterize the word order regularities among different clause types of German, and which are widely accepted among descriptive linguists of German [3, 6]. The TüBa-D/Z annotation relies on a context-free backbone (i.e. proper trees without crossing branches) of phrase structure combined with edge labels that specify the grammatical function of the phrase in question. The syntactic annotation scheme of the TüBa-D/Z is described in more detail in [12, 11]. TüBa-D/Z currently comprises approximately 15 000 sentences, with approximately 7 000 sentences being in the correction phase. The latter will be released along with an updated version of the existing treebank before the end of this year. The treebank is available in an XML format, in the NEGRA export format [1] and in the Penn treebank bracketing format. The XML format contains all types of information as described above, the NEGRA export format contains all sentence-internal information, while the Penn treebank format includes only those layers of information that can be expressed as pure tree structures. Over the course of the last year, more fine-grained linguistic annotations have been added along the following dimensions: 1. the basic Stuttgart-Tübingen tagset, STTS, [9] labels have been enriched by relevant features of inflectional morphology, 2. named entity information has been encoded as part of the syntactic annotation, and 3. a set of anaphoric and coreference relations has been added to link referentially dependent noun phrases. In the following sections, we will describe each of these innovations in turn and will demonstrate how the additional annotations can be incorporated into one comprehensive annotation scheme.

Parsing without grammar - using complete trees instead (2003)

Kübler, Sandra

The definition of similarity between sentences is formulated on the levels of words, POS tags, and chunks (Abney 91; Abney 96). The evaluation of this approach shows that while precision and recall based on the PARSEVAL measures (Black et al. 91) do not reach state-of-the-art parsers yet (F1=87.19 on syntactic constituents, F1=77.78 including function-argument structure), the parser shows a very reliable performance where function-argument structure is concerned (F1=96.52). The lower F-scores are very often due to unattached constituents.

Memory-based vocalization of Arabic (2008)

Kübler, Sandra ; Mohamed, Emad

The problem of vocalization, or diacritization, is essential to many tasks in Arabic NLP. Arabic is generally written without the short vowels, which leads to one written form having several pronunciations with each pronunciation carrying its own meaning(s). In the experiments reported here, we define vocalization as a classification problem in which we decide for each character in the unvocalized word whether it is followed by a short vowel. We investigate the importance of different types of context. Our results show that the combination of using memory-based learning with only a word-internal context leads to a word error rate of 6.64%. If a lexical context is added, the results deteriorate slowly.
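
The framing of vocalization as per-character classification can be sketched as follows: each character of the unvocalized word becomes one training instance whose features are its word-internal neighbors and whose label is the short vowel that follows it, if any. This is a minimal sketch under stated assumptions: the Latin transliteration, the vowel set `aiu`, the `-` label for "no vowel", the `_` padding symbol, and the function name `make_instances` are all hypothetical and not taken from the paper, which may use a different feature and label inventory.

```python
SHORT_VOWELS = set("aiu")  # assumption: simplified Latin transliteration
NO_VOWEL = "-"             # assumption: label for "no short vowel follows"
PAD = "_"                  # assumption: padding symbol at word boundaries


def make_instances(vocalized, window=1):
    """Turn one vocalized word into (features, label) classification instances.

    Features are the character itself plus `window` characters of
    word-internal context on each side, taken from the unvocalized form.
    """
    chars, labels = [], []
    for ch in vocalized:
        if ch in SHORT_VOWELS and chars:
            # A short vowel becomes the label of the preceding character.
            labels[-1] = ch
        else:
            chars.append(ch)
            labels.append(NO_VOWEL)
    padded = [PAD] * window + chars + [PAD] * window
    return [
        (tuple(padded[i:i + 2 * window + 1]), label)
        for i, label in enumerate(labels)
    ]


instances = make_instances("kutiba")
# instances == [(("_", "k", "t"), "u"), (("k", "t", "b"), "i"), (("t", "b", "_"), "a")]
```

Instances of this shape could then be passed to any classifier, for example a nearest-neighbor learner in the spirit of memory-based learning; adding features from neighboring words would correspond to the lexical context mentioned in the abstract.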
