Linguistics
Papers on pragmasemantics
(2009)
Optimality theory as used in linguistics (Prince & Smolensky, 1993/2004; Smolensky & Legendre, 2006) and cognitive psychology (Gigerenzer & Selten, 2001) is a theoretical framework that aims to integrate constraint-based knowledge representation systems, generative grammar, cognitive skills, and aspects of neural network processing. In recent years considerable progress has been made in overcoming the artificial separation between the linguistic disciplines, which are mainly concerned with the description of natural language competence, and the psychological disciplines, which are interested in real language performance.
The semantics and pragmatics of natural language is a research topic that calls for an integration of philosophical, linguistic, and psycholinguistic aspects, including their neural underpinnings. Recent work on experimental pragmatics in particular (e.g. Noveck & Sperber, 2005; Garrett & Harnish, 2007) has shown that real progress in the area of pragmatics isn’t possible without using data from all available domains, including data from language acquisition and from actual language generation and comprehension performance. Using the optimality theoretic framework to realize this integration is a conceivable research programme.
Game theoretic pragmatics is a relatively young development in pragmatics. The idea of viewing communication as a strategic interaction between speaker and hearer is not new. It is already present in Grice's (1975) classical paper on conversational implicatures. What game theory offers is a mathematical framework in which strategic interaction can be precisely described. It is a leading paradigm in economics, as witnessed by a series of Nobel prizes in the field. It is also of growing importance to other disciplines of the social sciences. In linguistics, its main applications so far have been in pragmatics and theoretical typology. For pragmatics, game theory promises a firm foundation and a rigor that will hopefully allow pragmatic phenomena to be studied with the same precision as that achieved in formal semantics.
The development of game theoretic pragmatics is closely connected to the development of bidirectional optimality theory (Blutner, 2000). It can be easily seen that the game theoretic notion of a Nash equilibrium and the optimality theoretic notion of a strongly optimal form-meaning pair are closely related to each other. The main impulse that bidirectional optimality theory gave to research on game theoretic pragmatics stemmed from serious empirical problems that resulted from interpreting the principle of weak optimality as a synchronic interpretation principle.
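The parallel between strong optimality and an equilibrium can be illustrated with a small sketch. It assumes a toy cost function (the form and meaning labels and all numeric costs are hypothetical and not taken from the volume) and checks, for each form-meaning pair, that neither the speaker nor the hearer could do better by changing only their own choice.

```python
from itertools import product

# Hypothetical toy costs: lower is better (less marked form, more stereotypical meaning).
FORM_COST = {"kill": 0, "cause to die": 1}
MEANING_COST = {"direct": 0, "indirect": 1}


def cost(form, meaning):
    """Cost of a form-meaning pair; a simple additive assumption."""
    return FORM_COST[form] + MEANING_COST[meaning]


def strongly_optimal(form, meaning):
    """True if no alternative form (speaker side) and no alternative
    meaning (hearer side) yields a strictly cheaper pair."""
    best_form = all(cost(form, meaning) <= cost(f, meaning) for f in FORM_COST)
    best_meaning = all(cost(form, meaning) <= cost(form, m) for m in MEANING_COST)
    return best_form and best_meaning


# All pairs that survive both the speaker check and the hearer check.
STRONG_PAIRS = [
    (f, m) for f, m in product(FORM_COST, MEANING_COST) if strongly_optimal(f, m)
]
```

Under these assumed costs only the unmarked form paired with the stereotypical meaning survives both checks, which is the sense in which a strongly optimal pair resembles a profile from which neither player wants to deviate unilaterally.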
In this volume, we have collected papers that are concerned with several aspects of game and optimality theoretic approaches to pragmatics.
Manual development of deep linguistic resources is time-consuming and costly and is therefore often described as a bottleneck for traditional rule-based NLP. In my PhD thesis I present a treebank-based method for the automatic acquisition of LFG resources for German. The method automatically creates deep and rich linguistic representations from labelled data (treebanks) and can be applied to large data sets. My research is based on and substantially extends previous work on automatically acquiring wide-coverage, deep, constraint-based grammatical resources from the English Penn-II treebank (Cahill et al., 2002; Burke et al., 2004; Cahill, 2004). Best results for English show a dependency f-score of 82.73% (Cahill et al., 2008) against the PARC 700 dependency bank, outperforming the best hand-crafted grammar of Kaplan et al. (2004). Preliminary work has been carried out to test the approach on languages other than English, providing proof of concept for the applicability of the method (Cahill et al., 2003; Cahill, 2004; Cahill et al., 2005). While first results have been promising, a number of important research questions have been raised. The original approach, first presented in Cahill et al. (2002), is strongly tailored to English and the data structures provided by the Penn-II treebank (Marcus et al., 1993). English is configurational and rather poor in inflectional forms. German, by contrast, features semi-free word order and a much richer morphology. Furthermore, treebanks for German differ considerably from the Penn-II treebank as regards the data structures and encoding schemes underlying the grammar acquisition task. In my thesis I examine the impact of language-specific properties of German as well as linguistically motivated treebank design decisions on PCFG parsing and LFG grammar acquisition.
I present experiments investigating the influence of treebank design on PCFG parsing and show which types of representation are useful for the PCFG and LFG grammar acquisition tasks. Furthermore, I present a novel approach to cross-treebank comparison, measuring the effect of controlled error insertion on treebank trees and parser output from different treebanks. I complement the cross-treebank comparison by providing a human evaluation using TePaCoC, a new testsuite for testing parser performance on complex grammatical constructions. Manual evaluation on TePaCoC data provides new insights into the impact of flat vs. hierarchical annotation schemes on data-driven parsing. I present treebank-based LFG acquisition methodologies for two German treebanks. An extensive evaluation along different dimensions complements the investigation and provides valuable insights for the future development of treebanks.
The main concern of this article is to discuss some recent findings concerning the psychological reality of optimality-theoretic pragmatics and its central part – bidirectional optimization. A present challenge is to close the gap between experimental pragmatics and neo-Gricean theories of pragmatics. I claim that OT pragmatics helps to overcome this gap, in particular in connection with the discussion of asymmetries between natural language comprehension and production. The theoretical debate will be concentrated on two different ways of interpreting bidirection: first, bidirectional optimization as a psychologically realistic online mechanism; second, bidirectional optimization as an offline phenomenon of fossilizing optimal form-meaning pairs. It will be argued that neither of these extreme views fits completely with the empirical data when taken per se.
Ever since the discovery of neural networks, there has been a controversy between two modes of information processing. On the one hand, symbolic systems have proven indispensable for our understanding of higher intelligence, especially when cognitive domains like language and reasoning are examined. On the other hand, it is a matter of fact that intelligence resides in the brain, where computation appears to be organized by numerical and statistical principles and where a parallel distributed architecture is appropriate. The present claim is in line with researchers like Paul Smolensky and Peter Gärdenfors and suggests that this controversy can be resolved by a unified theory of cognition – one that integrates both aspects of cognition and assigns the proper roles to symbolic computation and numerical neural computation.
The overall goal of this contribution is to discuss formal systems that are suitable for grounding the formal basis for such a unified theory. It is suggested that the instruments of modern logic and model theoretic semantics are appropriate for analyzing certain aspects of dynamical systems like inferring and learning in neural networks. Hence, I suggest that an active dialogue between the traditional symbolic approaches to logic, information and language and the connectionist paradigm is possible and fruitful. An essential component of this dialogue refers to Optimality Theory (OT) – taken as a theory that likewise aims to overcome the gap between symbolic and neuronal systems. In the light of the proposed logical analysis, notions like recoverability and bidirection are explained, and likewise the problem of founding a strict constraint hierarchy is discussed. Moreover, a claim is made for developing an "embodied" OT closing the gap between symbolic representation and embodied cognition.
The paper investigates the origins of the German/Dutch particle doch/toch in the hope of shedding light on a puzzle concerning this particle and on two theoretical issues. The puzzle is the nearly opposite meaning of the stressed and unstressed versions of the particle, which cannot be accounted for in standard theories of the meaning of stress. One theoretical issue concerns the meaning of stress: whether it is possible to reduce the semantic contribution of a stressed item to the meaning of the item and the meaning of stress. The second issue is whether the complex use of a particle like doch/toch can be seen as an instance of spread or whether it has to be seen as having a core meaning which is differentiated by pragmatics operating in different contexts.
We use the etymology of doch and toch as to+u+h (that + question marker + emphatic marker) to argue for an origin as a question tag checking a hearer opinion. Stress on the tag indicates an opposite opinion (of the common ground or the speaker) and this sets apart two groups of uses spreading in different directions. This solves the puzzle, indicates that the assumption of spread is useful and offers a subtle correction of the interpretation of stress. While stress always means contrast with a contrasting item, if the particle use is due to spread, it is not guaranteed that the unstressed particle has a corresponding use (or inversely).
In this paper, we outline the foundations of a theory of implicatures. It divides into two parts. The first part contains the base model. It introduces signalling games, optimal answer models, and a general definition of implicatures in terms of natural information. The second part contains a refinement in which we consider noisy communication with efficient clarification requests. Throughout, we assume a fully cooperative speaker who knows the information state of the hearer. The purpose of this paper is not the study of examples. Our concern is the framework for doing these studies.
The article aims to give an overview of the application of Optimality Theory (OT) to the domain of pragmatics. In the introductory part we discuss different ways to view the division of labor between semantics and pragmatics. Rejecting the doctrine of literal meaning, we conform to (i) semantic underdetermination and (ii) contextualism (the idea that the mechanism of pragmatic interpretation is crucial both for determining what the speaker says and what he means). Taking the assumptions (i) and (ii) as essential requisites for a natural theory of pragmatic interpretation, section 2 introduces the three main views conforming to these assumptions: Relevance theory, Levinson’s theory of presumptive meanings, and the Neo-Gricean approach. In section 3 we explain the general paradigm of OT and the idea of bidirectional optimization. We show how the idea of optimal interpretation can be used to restructure the core ideas of these three different approaches. Further, we argue that bidirectional OT has the potential to account for both the synchronic and the diachronic perspective on pragmatic interpretation. Section 4 lists relevant examples of using the framework of bidirectional optimization in the domain of pragmatics. Section 5 provides some general conclusions. Modeling both the synchronic and the diachronic perspective on pragmatics opens the way for a deeper understanding of the idea of naturalization and (cultural) embodiment in the context of natural language interpretation.
To some, the relation between bidirectional optimality theory and game theory seems obvious: strong bidirectional optimality corresponds to Nash equilibrium in a strategic game (Dekker and van Rooij 2000). But in the domain of pragmatics this formally sound parallel is conceptually inadequate: the sequence of utterance and its interpretation cannot be modelled reasonably as a strategic game, because this would mean that speakers choose formulations independently of a meaning that they want to express, and that hearers choose an interpretation irrespective of an utterance that they have observed. Clearly, the sequence of utterance and interpretation requires a dynamic game model. One such model, and one that is widely studied and of manageable complexity, is a signaling game. This paper is therefore concerned with an epistemic interpretation of bidirectional optimality, both strong and weak, in terms of beliefs and strategies of players in a signaling game. In particular, I suggest that strong optimality may be regarded as a process of internal self-monitoring and that weak optimality corresponds to an iterated process of such self-monitoring. This latter process can be derived by assuming that agents act rationally to (possibly partial) beliefs in a self-monitoring opponent.
The present paper offers a summary of the results of two earlier experiments (Nawrocki and Gonet 2004; Nawrocki 2004), in which acoustic properties of the voiceless velar fricative phoneme /x/ in Southern Polish were investigated.
As found in both studies (Nawrocki and Gonet 2004; Nawrocki 2004), speakers of both genders favour glottal articulation, with partial or full voicing. Word-final contexts are decisively in favour of [x]. The word-initial, prevocalic positions seem to allow quite a number of allophonic variants of /x/. These are: [x], [ɦ], [ç] and, additionally, the voiceless glottal, the pharyngeal or the epiglottal [h]/[ħ]/[ʜ]. Another factor taken into account is the coarticulation effect of the vocalic context on the choice of articulation. Based on the results of the experiments, a reformulated allophonic composition is proposed for Polish /x/. It makes room for previously unconsidered pharyngeal and glottal allophones.
In order to inspect the acoustic properties of the allophones of Polish /x/ further, their static and dynamic spectral features are compared to those of phonetically similar sounds in other languages where they have the status of independent phonemes. Special attention is paid to the distribution of spectral peaks and their intensity. The fact that in Polish there are no 'back' fricative phonemes that would contrast with /x/ creates a wide range of acceptable allophonic articulations that cannot be challenged from either articulatory or perceptual points of view.
Horn's division of pragmatic labour (Horn, 1984) is a universal property of language, and amounts to the pairing of simple meanings to simple forms, and deviant meanings to complex forms. This division makes sense, but a community of language users that do not know it makes sense will still develop it after a while, because it gives optimal communication at minimal costs. This property of the division of pragmatic labour is shown by formalising it and applying it to a simple form of signalling games, which allows computer simulations to corroborate intuitions. The division of pragmatic labour is a stable communicative strategy that a population of communicating agents will converge on, and it cannot be replaced by alternative strategies once it is in place.
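The claim that this pairing gives optimal communication at minimal cost can be checked on a toy example. The sketch below is not the simulation reported in the paper; it assumes two meanings with hypothetical prior probabilities and two forms with hypothetical production costs, and compares the expected cost of every one-to-one pairing of meanings with forms.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical priors: the stereotypical meaning is the more frequent one.
PRIOR = {"stereotypical": Fraction(4, 5), "deviant": Fraction(1, 5)}
# Hypothetical production costs: the complex form is more expensive.
FORM_COST = {"simple": 1, "complex": 2}


def expected_cost(strategy):
    """Average production cost of a meaning-to-form mapping."""
    return sum(PRIOR[meaning] * FORM_COST[form] for meaning, form in strategy.items())


# Every one-to-one mapping communicates each meaning unambiguously;
# the mappings differ only in how much they cost on average.
MEANINGS = list(PRIOR)
STRATEGIES = [dict(zip(MEANINGS, forms)) for forms in permutations(FORM_COST)]
BEST = min(STRATEGIES, key=expected_cost)
```

With these assumed numbers the cheapest unambiguous mapping is the one that sends the frequent meaning to the simple form and the rare meaning to the complex form, which matches the division of labour described above.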
Contributing to NABE News - Guidelines for Writers; Letter from the President; Memories of Title VII - Dr. Josefina Villamil Tinajero; Asian and Pacific Islanders Developing a Chinese Dual Language Program in Elementary Schools: Be Responsive to Language Characteristics - Dr. Ping Liu; The State of ELLs Education NABE 2009 Conference Proceedings; Indigenous Bilingual Education BIE Leading Indian Students to Failure; Fact Sheet on Supreme Court’s Decision In HORNE V. FLORES - David Hinojosa; Reflections on a White House Visit - By Francisco Guajardo
The focus of this paper is the perspectivization of thematic roles generally and the recipient role specifically. Whereas perspective is defined here as the representation of something for someone from a given position (Sandig 1996: 37), perspectivization refers to the verbalization of a situation in the speech generation process (Storrer 1996: 233). In a prototypical act of giving, for example, the focus of perception (the attention of the external observer) may be on the person who gives (agent), the transferred object (patient) or the person who receives the transferred object (recipient). The languages of the world provide differing linguistic means to perspectivize such an act of giving, or better: to perspectivize the participants of such an action. In this article, the linguistic means of three selected continental West Germanic languages – German, Dutch and Luxembourgish – will be taken into consideration, with an emphasis on the perspectivization of the recipient role.
Nominalization in Rawang
(2009)
This paper discusses the types of relative clause and noun complement structures found in the Rawang language, a Tibeto-Burman language of northern Myanmar, as well as their origin and uses, with data taken mainly from naturally occurring texts. Two types are preposed relative clauses, but in one the relative clause is nominalized, and in the other it is not. The non-nominalized form with a general head led to the development of nominalizing suffixes and one type of nominalized relative clause structure. As the nominalized form is a nominal itself, it can be postposed to the head in an appositional structure. There is also discussion of the Rawang structures in the context of Tibeto-Burman and the development of relative clause structures in the language family.
Many linguists in China and the West have talked about Chinese as a topic-comment language, that is, a language in which the structure of the clause takes the form of a topic, about which something is to be said, and a comment, which is what is said about the topic, rather than being a language with a subject-predicate structure like that of English. Y. R. Chao (1968), for example, said that all Chinese clauses have topic-comment structure and there are no exceptions.
Language contact has become a major focus of inquiry in historical and typological linguistics in the last twenty years, spurred in a large part by the publication of Thomason & Kaufman (1988), which tried to make sense of a large amount of language contact data. They argued that there was a direct relationship between the degree or intensity of language contact and the amount and type of influence the contact would have on one or more of the languages involved. Essentially, the greater the degree of bilingualism, the greater the degree of contact influence (see also Thomason 2001); if the contact and bilingualism was minimal, then there might just be a few loanwords adapted to the borrowing language's phonology and grammatical system, but if the contact and bilingualism was of a greater degree there would be influence in the grammar and phonology of the affected language. As more linguists came to take language contact more seriously, they came to realize how common language contact phenomena are.
There is every reason to welcome the revised edition (2009) of Thomas Olander’s dissertation (2006), which I have criticized elsewhere (2006). The book is very well written and the author has a broad command of the scholarly literature. I have not found any mistakes in Olander’s rendering of other people’s views. This makes the book especially useful as an introduction to the subject. It must be hoped that the easy access to a complex set of problems which this book offers will have a stimulating effect on the study of Balto-Slavic accentology.
All's well that ends well
(2009)
A few years ago, Jasanoff adopted the central tenet of my accentological theory, viz. that the Balto-Slavic acute was a stød or glottal stop, not a rising tone (cf. Kortlandt 1975, 1977, 2004, Jasanoff 2004a). Of course, nobody will believe Jasanoff’s claim that he arrived at the same result independently thirty years after I published it and ten years after we discussed it when he came to Leiden to visit us. Though at the time he haughtily dismissed “the tangle of secondary hypotheses and “laws” that clutter the ground in the field of Balto-Slavic accentology” (Jasanoff 2004b: 171), he has now recognized the importance of Pedersen’s law, Hirt’s law, Winter’s law, Meillet’s law, Dolobko’s law, Dybo’s law and Stang’s law and largely accepted my relative chronology of these accent laws, including the loss of the acute shortly before Stang’s law (cf. Jasanoff 2008). He has also accepted my split of Pedersen’s law into a Balto-Slavic and a Slavic phase (to which a Lithuanian phase must be added), my thesis that the tonal contours of Baltic and Slavic languages are post-Balto-Slavic innovations (cf. Jasanoff 2008: 344, fn. 10), and the rise of a tonal distinction on non-acute initial syllables before Dybo’s law which I discussed at some length in my review (1978) of Garde’s monograph (1976). This is great progress.
West Slavic accentuation
(2009)
At the time of the earliest reconstructible dialectal divergences, which belong to the Late Middle Slavic period of my chronology (stages 7.0 - 8.0 of Kortlandt 1989a, 2003, 2008), the West Slavic languages represented the most conservative part of the Slavic dialects (cf. Kortlandt 1982b: 191 and 2003: 231).
It appears that the complexity of Slavic historical accentology is prohibitive for most non-specialists in the field. It may therefore be useful to approach the subject from a number of different angles in order to render it more accessible to a wider audience. In the following I shall discuss the separate accent paradigms and their development from the Late Balto-Slavic system, which is structurally similar to that of modern Lithuanian, up to the end of the Proto-Slavic period, when the system resembled what we find in modern Serbo-Croatian. The numbering of the stages 1.0 through 10.12 is the same as in my earlier publications (1989, 2003, 2005, 2006a, 2008b). For the rise and development of the accentual system up to the end of the Balto-Slavic period I may refer to my discussion (2006b, 2008a) of Olander’s dissertation (2006). It resulted in a system of four major and two minor accent types.
Word formation in Distributed Morphology (see Arad 2005, Marantz 2001, Embick 2008): 1. Language has atomic, non-decomposable, elements = roots. 2. Roots combine with the functional vocabulary and build larger elements. 3. Roots are category neutral. They are then categorized by combining with category defining functional heads.
This paper investigates the class of Tree-Tuple MCTAG with Shared Nodes, TT-MCTAG for short, an extension of Tree Adjoining Grammars that has been proposed for natural language processing, in particular for dealing with discontinuities and word order variation in languages such as German. It has been shown that the universal recognition problem for this formalism is NP-hard, but so far it was not known whether the class of languages generated by TT-MCTAG is included in PTIME. We provide a positive answer to this question, using a new characterization of TT-MCTAG.
We present a CYK and an Earley-style algorithm for parsing Range Concatenation Grammar (RCG), using the deductive parsing framework. The characteristic property of the Earley parser is that we use a technique of range boundary constraint propagation to compute the yields of non-terminals as late as possible. Experiments show that, compared to previous approaches, the constraint propagation helps to considerably decrease the number of items in the chart.
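For readers unfamiliar with chart parsing, the bottom-up CYK idea can be sketched on a much simpler formalism. The example below is a recognizer for a context-free grammar in Chomsky normal form, not the RCG parser of the paper and without range constraint propagation; the function name, the rule encoding, and the toy grammar are all assumptions made for illustration.

```python
def cyk_recognize(words, lexical, binary, start="S"):
    """Return True if `words` is derivable from `start`.

    lexical: list of (non-terminal, word) rules.
    binary:  list of (non-terminal, (left, right)) rules.
    """
    n = len(words)
    # chart[i][j] holds the non-terminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = {lhs for lhs, terminal in lexical if terminal == word}
    # Build longer spans from pairs of adjacent shorter spans.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, (left, right) in binary:
                    if left in chart[i][k] and right in chart[k][j]:
                        chart[i][j].add(lhs)
    return n > 0 and start in chart[0][n]


# Hypothetical toy grammar.
LEXICAL = [("NP", "she"), ("V", "eats"), ("NP", "fish")]
BINARY = [("S", ("NP", "VP")), ("VP", ("V", "NP"))]

ACCEPTED = cyk_recognize(["she", "eats", "fish"], LEXICAL, BINARY)
REJECTED = cyk_recognize(["eats", "she", "fish"], LEXICAL, BINARY)
```

Each chart cell plays the role of a set of parse items; reducing the number of such items is what the abstract reports for the constraint propagation technique.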
Multicomponent Tree Adjoining Grammars (MCTAGs) are a formalism that has been shown to be useful for many natural language applications. The definition of non-local MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. Looking only at the result of a derivation (i.e., the derived tree and the derivation tree), this simultaneity is no longer visible and therefore cannot be checked. That is, this way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) the MCTAG licenses. We provide similar characterizations for various types of MCTAG. These characterizations give a better understanding of the formalisms, they allow a more systematic comparison of different types of MCTAG, and, furthermore, they can be exploited for parsing.
This article discusses the divergent status of the two particles lé and lá in the grammar of Konkomba, a Gur language (Niger-Congo) of the Gurma subgroup. While previous studies claim that both particles are focus markers, this author argues that only the particle lá should be analyzed as a pure pragmatic device. Distributional studies suggest that the use of particle lé, on the other hand, is only required under specific focus conditions, and primarily represents a syntactic device.
The main tenet of the present paper is the thesis that nominalization – like other cases of derivational morphology – is an essentially lexical phenomenon with well defined syntactic (and semantic) conditions and consequences. More specifically, it will be argued that the relation between a verb and the noun derived from it is subject to both systematic and idiosyncratic conditions with respect to lexical as well as syntactic aspects.
Experimental data shows that adult learners of an artificial language with a phonotactic restriction learned this restriction better when being trained on word types (e.g. when they were presented with 80 different words twice each) than when being trained on word tokens (e.g. when presented with 40 different words four times each) (Hamann & Ernestus submitted). These findings support Pierrehumbert’s (2003) observation that phonotactic co-occurrence restrictions are formed across lexical entries, since only lexical levels of representation can be sensitive to type frequencies.
We show that loanword adaptation can be understood entirely in terms of phonological and phonetic comprehension and production mechanisms in the first language. We provide explicit accounts of several loanword adaptation phenomena (in Korean) in terms of an Optimality-Theoretic grammar model with the same three levels of representation that are needed to describe L1 phonology: the underlying form, the phonological surface form, and the auditory-phonetic form. The model is bidirectional, i.e., the same constraints and rankings are used by the listener and by the speaker. These constraints and rankings are the same for L1 processing and loanword adaptation.
The present study argues that variation across listeners in the perception of a non-native contrast is due to two factors: the listener-specific weighting of auditory dimensions and the listener-specific construction of new segmental representations. The interaction of both factors is shown to take place in the perception grammar, which can be modelled within an OT framework. These points are illustrated with the acquisition of the Dutch three-member labiodental contrast [V v f] by German learners of Dutch, focussing on four types of learners from the perception study by Hamann and Sennema (2005a).
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail. We then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement, and argue that a deeper understanding of the phenomena involved allows us to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) are available with highly different annotation schemes for the acquisition of (e.g.) PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TEPACOC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TEPACOC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (with 20 sentences per phenomenon) from both TiGer and TüBa-D/Z resources. This provides a 2 times 100 sentence comparable testsuite which allows us to evaluate TiGer-trained parsers against the TiGer part of TEPACOC, and TüBa-D/Z-trained parsers against the TüBa-D/Z part of TEPACOC for key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two different annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two different treebanks. In the remaining part of the paper we present the testsuite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our error classification of potential parser errors.
In the recent literature the phenomenon of long distance agreement has become the focus of several studies as it seems to violate certain locality conditions which require that agreeing elements in general stand in clause-mate relationships. In particular, it involves a verb agreeing with a constituent which is located in the verb's clausal complement and hence poses a challenge for theories that assume a strictly local relationship for agreement. In this paper we present empirical evidence from Greek and Romanian for the reality of long distance agreement. Specifically, we focus on raising constructions in these two languages and we show that they do not involve movement but rather instantiate long distance agreement. We further argue that subjunctives allowing long distance agreement lack both a CP layer and semantic Tense. However, since the embedded verb also bears phi-features, these constructions pose a further problem for assumptions that view the presence of phi-features as evidence for the presence of a C layer. Finally, we raise the question of the common properties that these languages have that lead to the presence of long distance agreement.
Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgarriff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (McCarthy et al., 2004) or semantic role labeling (Gordon and Swanson, 2007). We present a model that is built from Web-based corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space, which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
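To illustrate what a window-based approach involves, the following is a minimal sketch of the counting step only. It is not the model from the paper: the function name `window_cooccurrences`, the `window` parameter and the toy sentence are assumptions made for this example, and the pattern-based features and the singular value decomposition step are left out.

```python
from collections import Counter, defaultdict


def window_cooccurrences(tokens, window=2):
    """Count, for each target word, the words occurring within
    `window` positions to its left or right (hypothetical helper)."""
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                counts[target][tokens[j]] += 1
    return counts


# Toy input; a real model would be built from a large Web corpus.
tokens = "the cat chased the mouse".split()
vectors = window_cooccurrences(tokens, window=1)
```

Each entry of `vectors` is a sparse count vector for one target word. In a model of the kind described above, such vectors would form the rows of the matrix to which a dimensionality reduction such as singular value decomposition is then applied.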
Parsing coordinations
(2009)
The present paper is concerned with statistical parsing of constituent structures in German. The paper presents four experiments that aim at improving parsing performance for coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser with gold scopes for every conjunct, and 3) reranking the parser output for all possible scopes for conjuncts that are permissible with regard to clause structure. Experiment 4 reranks a combination of parses from experiments 1 and 3. The experiments presented show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses results in an increase in F-score from 69.76 for the baseline to 74.69. While the F-score is similar to the one of the first experiment (n-best parsing and reranking), the first experiment results in higher recall (75.48% vs. 73.69%) and the third one in higher precision (75.43% vs. 73.26%). Combining the two methods yields the best result, with an F-score of 76.69.
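Precision, recall and F-score are related by the standard harmonic-mean definition. As a minimal sketch (the function name `f_score` and the `beta` parameter are illustrative and not taken from the paper), the balanced F-score can be computed as follows. The abstract does not say how its scores were aggregated, so values computed this way from the reported precision and recall need not match the reported F-scores exactly.

```python
def f_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    With beta=1.0 this is the balanced F-score (F1). Inputs may be
    fractions or percentages, as long as both use the same scale.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Because the harmonic mean is pulled towards the lower of its two inputs, a system with balanced precision and recall scores higher than one with the same arithmetic mean but a larger gap between the two.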
The aim of this paper is to address two main counterarguments raised in Landau (2007) against the movement analysis of Control, and especially against the phenomenon of Backward Control. The paper shows that unlike the situation described in Tsez (Polinsky & Potsdam 2002), Landau's objections do not hold for Greek and Romanian, where all obligatory control verbs exhibit Backward Control. Our results thus provide stronger empirical support for a theoretical approach to Control in terms of Movement, as defended in Hornstein (1999 and subsequent work).
If we want to develop a semantic analysis for explicit performatives such as ‘I promise you to free Willy’, we are faced with the following puzzle: In order to account for the speech act expressed by the performative verb, one can assume that the so-called performative clause is purely performative and provides the illocutionary force of the speech act whose content is given by the semantic object denoted by the complement clause. Yet under this perspective, the performative clause (that is, next to the performative verb, the indexicals ‘I’ and ‘you’ that refer to the speaker and to the addressee of the utterance context) is semantically invisible and does not contribute its meaning compositionally to the meaning of the entire explicit performative sentence. Conversely, if we account for the truth-conditional contribution of the performative clause and deny that the meaning of the performative verb is purely performative, then we have to find a way to account for the speech act expressed by the performative verb. Of course, there is already the widely accepted and very appealing indirectness account for explicit performative utterances developed by Bach & Harnish (1979). Roughly, Bach and Harnish solve this puzzle by deriving the performativity by means of a pragmatic inference process. According to them, the important speech act performed by means of the utterance of the explicit performative sentence is a kind of conventionalized indirect speech act. However, the boundary between semantics and pragmatics can be drawn in many ways. Therefore, I think there could be other perspectives on the interface between the truth-functional treatment of declarative explicit performative sentences and the speech acts that are performed with their utterances and expressed by the performative verbs.
Hence, this thesis is an experiment in developing a further analysis and examining its consequences for the semantics and pragmatics of explicit performative utterances and for the new interface that emerges. Briefly, the experiment runs as follows: First, I develop an analysis for explicit performative sentences framed by parenthetical structures, as in (1)(a). In a second step, this parenthetical analysis is applied to proper Austinian explicit performative sentences, as in (1)(b). (1) a. Tomorrow, I promise you this, I will teach them Tyrolean songs. b. I promise you that I will teach them Tyrolean songs. Analyzing explicit performatives framed by parenthetical structures first has the advantage that we are faced with two utterances of two main clauses. In (1)(a) there is the utterance of the host sentence ‘Tomorrow I will teach them Tyrolean songs’ and the utterance of the explicit parenthetical ‘I promise you this’, where the demonstrative ‘this’ refers to the utterance of ‘Tomorrow I will teach them Tyrolean songs’. Since speakers perform speech acts with utterances of main clauses, I assume that the meaning of the explicit parenthetical ‘I promise you this’ specifies that the actual illocutionary force of the utterance of ‘Tomorrow I will teach them Tyrolean songs’ is the illocutionary force of a promise. Hence, instead of deriving an indirect illocutionary force by means of a pragmatic inference schema, we can deal with an ordinary direct speech act that is performed with the utterance of the host sentence. This kind of analysis stresses the particular discourse function of explicit performative utterances. Performative verbs are used whenever the contextual information is not sufficient to determine the illocutionary force of the corresponding implicit speech act. The resulting consequences of the parenthetical analysis are interesting since they cast a different light on performative verbs.
Surprisingly, the performative verbs are not performative at all. They do not constitute the execution of a speech act, but merely support it. Instead of constituting the particular illocutionary force, they specify the illocutionary force of the utterance of the host sentence. For instance, the speaker utters the explicit parenthetical ‘I promise you this’ to specify what he is simultaneously doing. Hence the speaker does not succeed in performing the promise simply because he is uttering ‘I promise you this’. Rather, by means of the information conveyed by the utterance of ‘I promise you this’, the potential illocutionary forces of the utterance of the host sentence are disambiguated. Thus, it is not the case that explicit parentheticals are trivially true when uttered. Their function is more complex. Their self-verifying property (‘saying so makes it so’) is explained by means of disambiguation. Furthermore, according to the parenthetical analysis, instead of being purely performative, the performative verbs contribute their meanings compositionally to the truth conditions of the entire explicit performative sentence. Together with its consequences, this analysis is applied to the proper Austinian performatives, which display subordination. I assume that regardless of their structure, explicit performatives always behave semantically and pragmatically as the parenthetical analysis predicts.