74 search hits
-
A comunicação lingüística de uma perspectiva da Fenomenologia de E. Husserl
(2009)
-
Raquel Cardoso de Castro
Murilo Cardoso de Castro
João Cardoso de Castro
-
A situação da alfabetização dos falantes de línguas de imigração no contexto brasileiro
(2009)
-
Cristiane Horst
-
O ensino/aprendizagem da metafonia do português como língua estrangeira por aprendizes alemães
(2009)
-
Marcelo Jacó Krug
-
O Aprendizado do alemão-padrão por alunos bilíngües : pesquisas e ações
(2009)
-
Karen Pupp Spinassé
-
Deutsche Sprache, schwere Sprache? : Einsichten aus Spracherwerbsforschung und Sprachförderung
(2009)
-
Ulrich Labonté
Angela Grimm
Anja Kersten
Barbara Kleissendorf
Geeske Strecker
Petra Schulz
-
Parsing coordinations
(2009)
-
Sandra Kübler
Erhard Hinrichs
Wolfgang Maier
Eva Klett
- The present paper is concerned with statistical parsing of constituent structures in German. The paper presents four experiments that aim at improving parsing performance on coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser by gold scopes for any conjunct, 3) reranking the parser output for all possible scopes for conjuncts that are permissible with regard to clause structure. Experiment 4 reranks a combination of parses from experiments 1 and 3. The experiments presented show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses increases the F-score from 69.76 for the baseline to 74.69. While this F-score is similar to that of the first experiment (n-best parsing and reranking), the first experiment results in higher recall (75.48% vs. 73.69%) and the third one in higher precision (75.43% vs. 73.26%). Combining the two methods yields the best result, with an F-score of 76.69.
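For readers unfamiliar with the metric, the following is a minimal sketch of how a balanced F-score relates precision and recall, assuming the standard harmonic-mean definition. The function name `f_score` is illustrative and not taken from the paper. The abstract does not say how its scores were aggregated, so the harmonic mean of the rounded precision and recall figures quoted above need not reproduce the reported F-scores exactly.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as quoted in the abstract.
exp1 = f_score(precision=73.26, recall=75.48)  # n-best parsing and reranking
exp3 = f_score(precision=75.43, recall=73.69)  # scope possibilities and reranking

print(f"Experiment 1: {exp1:.2f}")
print(f"Experiment 3: {exp3:.2f}")
```

The harmonic mean penalizes an imbalance between the two values, which is why two systems with opposite precision/recall trade-offs can end up with similar F-scores.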
-
Decorrelation and shallow semantic patterns for distributional clustering of nouns and verbs
(2009)
-
Yannick Versley
- Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgarriff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (McCarthy et al., 2004) or semantic role labeling (Gordon and Swanson, 2007). We present a model that is built from Web-based corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space, which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
-
A Testsuite for Testing Parser Performance on Complex German Grammatical Constructions
(2009)
-
Sandra Kübler
Ines Rehbein
Josef van Genabith
- Traditionally, parsers are evaluated against gold standard test data. This can cause problems if there is a mismatch between the data structures and representations used by the parser and the gold standard. A particular case in point is German, for which two treebanks (TiGer and TüBa-D/Z) are available with highly different annotation schemes for the acquisition of (e.g.) PCFG parsers. The differences between the TiGer and TüBa-D/Z annotation schemes make fair and unbiased parser evaluation difficult [7, 9, 12]. The resource (TEPACOC) presented in this paper takes a different approach to parser evaluation: instead of providing evaluation data in a single annotation scheme, TEPACOC uses comparable sentences and their annotations for 5 selected key grammatical phenomena (with 20 sentences per phenomenon) from both TiGer and TüBa-D/Z resources. This provides a comparable testsuite of 2 times 100 sentences, which allows us to evaluate TiGer-trained parsers against the TiGer part of TEPACOC, and TüBa-D/Z-trained parsers against the TüBa-D/Z part of TEPACOC for key phenomena, instead of comparing them against a single (and potentially biased) gold standard. To overcome the problem of inconsistency in human evaluation and to bridge the gap between the two different annotation schemes, we provide an extensive error classification, which enables us to compare parser output across the two different treebanks. In the remaining part of the paper we present the testsuite and describe the grammatical phenomena covered in the data. We discuss the different annotation strategies used in the two treebanks to encode these phenomena and present our error classification of potential parser errors.
-
Modelling the formation of phonotactic restrictions across the mental lexicon
(2009)
-
Silke Hamann
Diana Apoussidou
Paul Boersma
- Experimental data shows that adult learners of an artificial language with a phonotactic restriction learned this restriction better when being trained on word types (e.g. when they were presented with 80 different words twice each) than when being trained on word tokens (e.g. when presented with 40 different words four times each) (Hamann & Ernestus submitted). These findings support Pierrehumbert’s (2003) observation that phonotactic co-occurrence restrictions are formed across lexical entries, since only lexical levels of representation can be sensitive to type frequencies.
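The type/token contrast in the example figures above can be made concrete with a small sketch. It assumes only the numbers given in the abstract (80 words twice each, 40 words four times each); the helper name `exposure` and the condition labels are illustrative and do not come from the study.

```python
def exposure(n_types: int, repetitions: int) -> dict:
    """Summarize a training condition: distinct words (types) and total presentations (tokens)."""
    return {"types": n_types, "tokens": n_types * repetitions}


# Example conditions as described in the abstract.
type_condition = exposure(n_types=80, repetitions=2)
token_condition = exposure(n_types=40, repetitions=4)

print(type_condition)
print(token_condition)
```

Both example conditions present the same total number of tokens, so the difference between them lies in the number of distinct word types, which is the variable the abstract ties to learning of the restriction.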
-
Nominalization – lexical and syntactic aspects
(2009)
-
Manfred Bierwisch
- The main tenet of the present paper is the thesis that nominalization – like other cases of derivational morphology – is an essentially lexical phenomenon with well-defined syntactic (and semantic) conditions and consequences. More specifically, it will be argued that the relation between a verb and the noun derived from it is subject to both systematic and idiosyncratic conditions with respect to lexical as well as syntactic aspects.