Linguistik
Refine
Year of publication
- 2009 (13) (remove)
Document Type
- Preprint (13) (remove)
Has Fulltext
- yes (13)
Is part of the Bibliography
- no (13)
Keywords
- Multicomponent Tree Adjoining Grammar (2)
- Syntaktische Analyse (2)
- Algorithmus (1)
- Analyse syntaxique déductive (1)
- Coreference annotation (1)
- Deutsch (1)
- Kommunikationsforschung (1)
- Lehnwort (1)
- MCTAG (1)
- Phonetik (1)
Institute
- Extern (8)
In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail, and we then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement and argue that a deeper understanding of the phenomena involved allows to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
Nous présentons ici différents algorithmes d’analyse pour grammaires à concaténation d’intervalles (Range Concatenation Grammar, RCG), dont un nouvel algorithme de type Earley, dans le paradigme de l’analyse déductive. Notre travail est motivé par l’intérêt porté récemment à ce type de grammaire, et comble un manque dans la littérature existante.
Die drei Bereiche, die hier verglichen werden sollen, entsprechen in etwa der überkommenen Trias von Literatur, Musik und bildender Kunst, einer Gliederung, die im Medienzeitalters mit Videos, CDs, Installationen oder Happenings eigentlich obsolet ist. Allerdings geht es hier nur um die Eigenart der Zeichensysteme, auf denen die verschiedenen Bereiche beruhen, nicht um die Werke, die dadurch möglich werden, obgleich natürlich auch die Kunstwerke im emphatischen Sinn, die bedeutenden und die banalen, die großen und die misslungenen Gestaltungen nur möglich und verstehbar sind aufgrund der Zeichen, auf denen sie beruhen.
Parsing coordinations
(2009)
The present paper is concerned with statistical parsing of constituent structures in German. The paper presents four experiments that aim at improving parsing performance of coordinate structure: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser by gold scopes for any conjunct, 3) reranking the parser output for all possible scopes for conjuncts that are permissible with regard to clause structure. Experiment 4 reranks a combination of parses from experiments 1 and 3. The experiments presented show that n- best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses results in an increase in F-score from 69.76 for the baseline to 74.69. While the F-score is similar to the one of the first experiment (n-best parsing and reranking), the first experiment results in higher recall (75.48% vs. 73.69%) and the third one in higher precision (75.43% vs. 73.26%). Combining the two methods results in the best result with an F-score of 76.69.
The aim of this paper is to address two main counterarguments raised in Landau (2007) against the movement analysis of Control, and especially against the phenomenon of Backward Control. The paper shows that unlike the situation described in Tsez (Polinsky & Potsdam 2002), Landau's objections do not hold for Greek and Romanian, where all obligatory control verbs exhibit Backward Control. Our results thus provide stronger empirical support for a theoretical approach to Control in terms of Movement, as defended in Hornstein (1999 and subsequent work).
We show that loanword adaptation can be understood entirely in terms of phonological and phonetic comprehension and production mechanisms in the first language. We provide explicit accounts of several loanword adaptation phenomena (in Korean) in terms of an Optimality-Theoretic grammar model with the same three levels of representation that are needed to describe L1 phonology: the underlying form, the phonological surface form, and the auditory-phonetic form. The model is bidirectional, i.e., the same constraints and rankings are used by the listener and by the speaker. These constraints and rankings are the same for L1 processing and loanword adaptation.
In the recent literature the phenomenon of long distance agreement has become the focus of several studies as it seems to violate certain locality conditions which require that agreeing elements in general stand in clause-mate relationships. In particular, it involves a verb agreeing with a constituent which is located in the verb's clausal complement and hence poses a challenge for theories that assume a strictly local relationship for agreement. In this paper we present empirical evidence from Greek and Romanian for the reality of long distance agreement. Specifically, we focus on raising constructions in these two languages and we show that they do not involve movement but rather instantiate long distance agreement. We further argue that subjunctives allowing long distance agreement lack both a CP layer and semantic Tense. However, since the embedded verb also bears phi-features, these constructions pose a further problem for assumptions that view the presence of phi-features as evidence for the presence of a C layer. Finally, we raise the question of the common properties that these languages have that lead to the presence of long distance agreement.
Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgariff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (Mc- Carthy et al., 2004) or semantical role labeling (Gordon and Swanson, 2007). We present a model that is built from Webbased corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
We present a CYK and an Earley-style algorithm for parsing Range Concatenation Grammar (RCG), using the deductive parsing framework. The characteristic property of the Earley parser is that we use a technique of range boundary constraint propagation to compute the yields of non-terminals as late as possible. Experiments show that, compared to previous approaches, the constraint propagation helps to considerably decrease the number of items in the chart.