Linguistik
This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show a large difference in parsing performance between models trained on Negra and on TüBa-D/Z. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West Germanic neighbor English.
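The PCFG side of the factored model described above can be illustrated in miniature. The following is a minimal sketch of Viterbi CKY parsing over a hand-written toy grammar in Chomsky normal form; the rules, words and probabilities are invented for illustration and are not estimated from Negra or TüBa-D/Z.

```python
import math
from collections import defaultdict

# Toy PCFG in Chomsky normal form; rules and probabilities are invented
# for illustration, not estimated from either treebank.
binary = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 1.0,
    ("NP", ("DET", "N")): 0.6,
}
lexical = {
    ("NP", "Maria"): 0.4,
    ("DET", "die"): 1.0,
    ("N", "Zeitung"): 1.0,
    ("V", "liest"): 1.0,
}

def viterbi_cky(words):
    """Return (log-prob, bracketed tree) of the best parse rooted in S."""
    n = len(words)
    # best[(i, j)] maps a nonterminal to (log-prob, backpointer) for words[i:j]
    best = defaultdict(dict)
    for i, w in enumerate(words):
        for (lhs, word), p in lexical.items():
            if word == w:
                best[(i, i + 1)][lhs] = (math.log(p), w)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, (b, c)), p in binary.items():
                    if b in best[(i, k)] and c in best[(k, j)]:
                        lp = math.log(p) + best[(i, k)][b][0] + best[(k, j)][c][0]
                        if lhs not in best[(i, j)] or lp > best[(i, j)][lhs][0]:
                            best[(i, j)][lhs] = (lp, (b, (i, k), c, (k, j)))
    def tree(sym, span):
        _, back = best[span][sym]
        if isinstance(back, str):          # lexical cell: backpointer is the word
            return f"({sym} {back})"
        b, left, c, right = back
        return f"({sym} {tree(b, left)} {tree(c, right)})"
    lp, _ = best[(0, n)]["S"]
    return lp, tree("S", (0, n))

logprob, parse = viterbi_cky(["Maria", "liest", "die", "Zeitung"])
print(parse)  # (S (NP Maria) (VP (V liest) (NP (DET die) (N Zeitung))))
```

Lexicalization, the factor the paper finds beneficial, would replace the plain nonterminals above with head-annotated ones (e.g. VP[liest]), at the cost of a much sparser rule table.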
This report explores the question of compatibility between annotation projects, including translating annotation formalisms into each other or into common forms. Compatibility issues are crucial for systems that use the results of multiple annotation projects. We hope that this report will begin a concerted effort in the field to track the compatibility of annotation schemes for part-of-speech tagging, time annotation, treebanking, role labeling and other phenomena.
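One concrete instance of the compatibility problem is mapping a fine-grained part-of-speech tagset onto a coarser common one. The sketch below maps a small fragment of the German STTS tagset onto Universal POS categories; the fragment and the fallback tag are chosen for illustration and are far from a complete mapping.

```python
# A tiny fragment of a mapping from the fine-grained German STTS tagset
# to the coarser Universal POS categories; entries are illustrative, and
# the fallback tag "X" marks STTS tags this fragment does not cover.
STTS_TO_UPOS = {
    "NN": "NOUN",     # common noun
    "NE": "PROPN",    # proper noun
    "VVFIN": "VERB",  # finite full verb
    "ART": "DET",     # article
    "ADJA": "ADJ",    # attributive adjective
}

def convert(tagged_tokens):
    """Map (word, STTS tag) pairs to (word, Universal POS) pairs."""
    return [(w, STTS_TO_UPOS.get(t, "X")) for w, t in tagged_tokens]

print(convert([("die", "ART"), ("Zeitung", "NN"), ("erscheint", "VVFIN")]))
# [('die', 'DET'), ('Zeitung', 'NOUN'), ('erscheint', 'VERB')]
```

The hard cases the report is concerned with are exactly those where no such table exists, because the schemes draw category boundaries differently.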
Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, using a real-world example from newspaper text.
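A qualitative analysis of disagreements is usually preceded by a quantitative measure of them. The following is a minimal sketch of Cohen's kappa over two annotators' decisions; the yes/no coreference-link judgments are invented for illustration.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' equal-length label sequences:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[l] * cb[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical decisions: is each markable coreferent with a given antecedent?
ann1 = ["yes", "yes", "no", "no", "yes", "no"]
ann2 = ["yes", "no", "no", "no", "yes", "no"]
print(round(cohens_kappa(ann1, ann2), 3))  # 0.667
```

A low kappa only signals that disagreement exists; identifying vague referents as its source requires inspecting the disagreeing cases themselves, as the study above does.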
In the past, a divide could be seen between 'deep' parsers on the one hand, which construct a semantic representation out of their input but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of data-driven (statistical) models with more detailed linguistic interpretation, such that the output can be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures are of better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create representations that are underspecified with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be used directly for semantic interpretation. In this paper, we adopt an approach similar to Cahill et al. (2002), using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach on German data. A major focus of our discussion is the treatment of coordination and other potentially underspecified structures in the dependency input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001).
Independently of surface order, it encodes abstract syntactic functions that constitute predicate-argument structure and other dependency relations such as subject, predicate and adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is represented as a recursive attribute-value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms, as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline mapping text to interpretation. In this paper, we report on the conversion from dependency structures to f-structure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley's (2005) conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel's (2006) parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z, which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures.
Our evaluation is presented in Section 4, and finally, Section 5 summarises our results and gives an overview of problems remaining to be solved.
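The core conversion step can be illustrated in miniature: the sketch below folds dependency edges (head, relation, dependent) into a nested attribute-value structure of the kind an f-structure is. The sentence, the edge labels (SUBJ, OBJ, SPEC) and the dict representation are invented for illustration and are far simpler than the TiGer-DB-style scheme and WCDG relations used above.

```python
# Minimal sketch: fold dependency edges (head, relation, dependent) into a
# nested attribute-value matrix, represented as nested dicts. Labels are
# invented for illustration.
def to_fstructure(root, edges):
    avm = {"PRED": root}
    for head, rel, dep in edges:
        if head == root:
            # recurse on the remaining edges so each dependent builds its own AVM
            sub_edges = [e for e in edges if e[0] != root]
            avm[rel] = to_fstructure(dep, sub_edges)
    return avm

edges = [
    ("liest", "SUBJ", "Maria"),
    ("liest", "OBJ", "Zeitung"),
    ("Zeitung", "SPEC", "die"),
]
print(to_fstructure("liest", edges))
# {'PRED': 'liest', 'SUBJ': {'PRED': 'Maria'},
#  'OBJ': {'PRED': 'Zeitung', 'SPEC': {'PRED': 'die'}}}
```

The phenomena the paper focuses on, coordination, apposition and control, are precisely the cases where such a one-edge-one-attribute traversal is insufficient and additional principles are needed.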
Until well into the 1970s, the theory of language learning and teaching was a "Meisterlehre", a master's craft instruction (Müller-Michaels 1980). Great role models of a people (e.g. Moses), heads of philosophical schools (e.g. Plato) or abbots of monasteries (e.g. Augustine), and finally state-certified senior school directors (e.g. Ulshöfer) described to their younger colleagues what had proven itself in language teaching over decades: how best to conduct language instruction (Müller 1922, Seidemann 1973, Ulshöfer 1968, Essen 1968). With the establishment of language didactics at the universities, the concept of the "norm-setting action sciences" (Müller-Michaels 1980, Ivo 1975) was developed. The researcher (no longer credentialed as a master of practice) investigates the processes of language teaching and learning by collecting data in the practitioner's "field" and then subjecting the collected data to hypothesis testing. The school in particular is considered as the field of action. The research methods are predominantly "quasi-experimental". Following Chomsky's theory of language (Chomsky 1965), experimental approaches to the study of language acquisition, language acquisition disorders and the corresponding interventions were developed (de Villiers/de Villiers 1970, Hörmann 1978). The site of investigation is the laboratory. The design of this language didactics (or psycholinguistics, cognitive science, etc.) is experimental (e.g. Herrmann 2004). All three concepts stand in antagonistic opposition to one another in many respects. Keeping them apart, and, on the other hand, relating them to one another productively, is among the basic skills of the linguosomatic professions and their underlying theory (for example the language-teaching professions, phoniatrics, special education for speech and language impairment, psychosomatic speech therapies).
Therefore the significant oppositions between the three concepts must be worked out and their conflicting consequences related to one another.
This paper compares two approaches to computational semantics, namely semantic unification in Lexicalized Tree Adjoining Grammars (LTAG) and Lexical Resource Semantics (LRS) in HPSG. There are striking similarities between the frameworks that make them comparable in many respects. We will exemplify the differences and similarities by looking at several phenomena. We will show, first of all, that many intuitions about the mechanisms of semantic computations can be implemented in similar ways in both frameworks. Secondly, we will identify some aspects in which the frameworks intrinsically differ due to more general differences between the approaches to formal grammar adopted by LTAG and HPSG.
Relative quantifier scope in German, in contrast to English, depends largely on word order. The scope possibilities of a quantifier are determined by its surface position, its base position and the type of the quantifier. In this paper we propose a multicomponent analysis for German quantifiers that computes the scope of the quantifier, in particular its minimal nuclear scope, depending on the syntactic configuration it occurs in.
It has often been noticed that one syntactic argument position can be realized by elements which seem to realize different thematic roles. This is notably the case with the external argument position of verbs of change of state, which licenses volitional agents, instruments or natural forces/causers, showing the generality and abstractness of the external argument relation.

(1) a. John broke the window. (Agent)
    b. The hammer broke the window. (Instrument)
    c. The storm broke the window. (Causer)

In order to capture this generality, Van Valin & Wilkins (1996) and Ramchand (2003), among others, have proposed that the thematic role of the external argument position is in fact underspecified. The relevant notion is that of an effector (in Van Valin & Wilkins) or of an abstract causer/initiator (in Ramchand). In this paper we argue against a total underspecification of the external argument relation. While we agree that (1b) does not instantiate an instrument theta role in subject position, we argue that a complete underspecification of the external theta-position is not feasible, and that two types of external theta roles have to be distinguished, Agents and Causers. Our arguments are based on languages where Agents and Causers show morpho-syntactic independence (section 2.1) and on the behavior of instrument subjects in English, Dutch, German and Greek (sections 2.2 and 3). We show that instrument subjects are either Agent-like or Causer-like. In section 4 we give an analysis of how arguments realizing these thematic notions are introduced into the syntax.
In the recent literature there is growing interest in the morpho-syntactic encoding of hierarchical effects. The paper investigates one domain where such effects are attested: ergative splits conditioned by person. This type of split is then compared to hierarchical effects in direct-inverse alternations. On the basis of two case studies (Lummi, instantiating a language with a person-based ergative split, and Passamaquoddy, an inverse language) we offer an account that makes no use of hierarchies as a primitive. We propose that the two language types differ with respect to the location of person features. In inverse systems person features are located exclusively in T, while in ergative systems they are located in T and in a particular type of v. A consequence of our analysis is that Case checking in split and inverse systems is guided by the presence/absence of specific phi-features. This in turn provides evidence for a close connection between Case and phi-features, reminiscent of Chomsky's (2000, 2001) Agree.