OPUS 4 | Linguistik

The CoNLL 2007 shared task on dependency parsing (2007)

Nivre, Joakim ; Hall, Johan ; Kübler, Sandra ; McDonald, Ryan ; Nilsson, Jens ; Riedel, Sebastian ; Yuret, Deniz

The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

Sometimes less is more : Romanian word sense disambiguation revisited (2007)

Dinu, Georgiana ; Kübler, Sandra

Recent approaches to Word Sense Disambiguation (WSD) generally fall into two classes: (1) information-intensive approaches and (2) information-poor approaches. Our hypothesis is that for memory-based learning (MBL), a reduced amount of data is more beneficial than the full range of features used in the past. Our experiments show that MBL combined with a restricted set of features and a feature selection method that minimizes the feature set leads to competitive results, outperforming all systems that participated in the SENSEVAL-3 competition on the Romanian data. Thus, with this specific method, a tightly controlled feature set improves the accuracy of the classifier, reaching 74.0% in the fine-grained and 78.7% in the coarse-grained evaluation.

Combining dependency parsing with PP attachment (2007)

Kübler, Sandra ; Ivanova, Steliana ; Klett, Eva

Prepositional phrase (PP) attachment is one of the major sources for errors in traditional statistical parsers. The reason for that lies in the type of information necessary for resolving structural ambiguities. For parsing, it is assumed that distributional information of parts-of-speech and phrases is sufficient for disambiguation. For PP attachment, in contrast, lexical information is needed. The problem of PP attachment has sparked much interest ever since Hindle and Rooth (1993) formulated the problem in a way that can be easily handled by machine learning approaches: In their approach, PP attachment is reduced to the decision between noun and verb attachment; and the relevant information is reduced to the two possible attachment sites (the noun and the verb) and the preposition of the PP. Brill and Resnik (1994) extended the feature set to the now standard 4-tupel also containing the noun inside the PP. Among many publications on the problem of PP attachment, Volk (2001; 2002) describes the only system for German. He uses a combination of supervised and unsupervised methods. The supervised method is based on the back-off model by Collins and Brooks (1995), the unsupervised part consists of heuristics such as ”If there is a support verb construction present, choose verb attachment”. Volk trains his back-off model on the Negra treebank (Skut et al., 1998) and extracts frequencies for the heuristics from the ”Computerzeitung”. The latter also serves as test data set. Consequently, it is difficult to compare Volk’s results to other results for German, including the results presented here, since not only he uses a combination of supervised and unsupervised learning, but he also performs domain adaptation. Most of the researchers working on PP attachment seem to be satisfied with a PP attachment system; we have found hardly any work on integrating the results of such approaches into actual parsers. The only exceptions are Mehl et al. (1998) and Foth and Menzel (2006), both working with German data. Mehl et al. report a slight improvement of PP attachment from 475 correct PPs out of 681 PPs for the original parser to 481 PPs. Foth and Menzel report an improvement of overall accuracy from 90.7% to 92.2%. Both integrate statistical attachment preferences into a parser. First, we will investigate whether dependency parsing, which generally uses lexical information, shows the same performance on PP attachment as an independent PP attachment classifier does. Then we will investigate an approach that allows the integration of PP attachment information into the output of a parser without having to modify the parser: The results of an independent PP attachment classifier are integrated into the parse of a dependency parser for German in a postprocessing step.

Reflexives and reciprocals in LTAG (2007)

Kallmeyer, Laura ; Romero, Maribel

This paper presents an LTAG analysis of reflexives like himself and reciprocals like each other. These items need to find a c-commanding antecedent from which they retrieve (part of) their own denotation and with which they syntactically agree. The relation between anaphoric item and antecendent must satisfy the following important locality conditions (Chomsky (1981)).

Feature logic-based semantic composition : a comparison between LRS and LTAG (2007)

Richter, Frank ; Kallmeyer, Laura

In this paper we will explore the similarities and differences between two feature logic-based approaches to the composition of semantic representations. The first approach is formulated for Lexicalized Tree Adjoining Grammar (LTAG, Joshi and Schabes 1997), the second is Lexical Ressource Semantics (LRS, Richter and Sailer 2004) and was first defined in Head-driven Phrase Structure Grammar. The two frameworks have several common characteristics that make them easy to compare: 1 They use languages of two sorted type theory for semantic representations. 2. They allow underspecification. LTAG uses scope constraints while LRS provides component-of contraints. 3 They use feature logics for computing semantic representations. 4. they are designed for computational applications. By comparing the two frameworks we will also point outsome characteristics and advantages of feature logic-based semantic computation in genereal.

XMG : eXtending MetaGrammars to MCTAG (2007)

Parmentier, Yannick ; Kallmeyer, Laura ; Lichte, Timm ; Maier, Wolfgang

In this paper, we introduce an extension of the XMG system (eXtensibleMeta-Grammar) in order to allow for the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it possible to extend the system relatively easily to different target formalisms, thus opening the way towards multi-formalism.

A declarative characterization of different types of multicomponent tree adjoining grammars (2007)

Kallmeyer, Laura

Multicomponent Tree Adjoining Grammars (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG however is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) the MCTAG licences. This definition gives a better understanding of the formalism, it allows a more systematic comparison of different types of MCTAG, and, furthermore, it can be exploited for parsing.

Perfects, resultatives and auxiliaries in early English (2007)

McFadden, Thomas ; Alexiadou, Artemis

In this paper, we will argue for a novel analysis of the auxiliary alternation in Early English, its development and subsequent loss which has broader consequences for the way that auxiliary selection is looked at cross-linguistically. We will present evidence that the choice of auxiliaries accompanying past participles in Early English differed in several significant respects from that in the familiar modern European languages. Specifically, while the construction with have became a full-fledged perfect by some time in the ME period, that with be was actually a stative resultative, which it remained until it was lost. We will show that this accounts for some otherwise surprising restrictions on the distribution of BE in Early English and allows a better understanding of the spread of HAVE through late ME and EModE. Perhaps more importantly, the Early English facts also provide insight into the genesis of the kind of auxiliary selection found in German, Dutch and Italian. Our analysis of them furthermore suggests a promising strategy for explaining cross-linguistic variation in auxiliary selection in terms of variation in the syntactico-semantic structure of the perfect. In this introductory section, we will first provide some background on the historical situation we will be discussing, then we will lay out the main claims for which we will be arguing in the paper.

The subject-in-situ generalization revisited (2007)

Alexiadou, Artemis ; Anagnostopoulou, Elena

The goal of this paper is to re-examine the status of the condition in (1) proposed in Alexiadou and Anagnostopoulou (2001; henceforth A&A 2001), in view of recent developments in syntactic theory. (1) The subject-in-situ generalization (SSG) By Spell-Out, vP can contain only one argument with a structural Case feature. We argue that (1) is a more general condition than previously recognized, and that the domain of its application is parametrized. More specifically, based on a comparison between Indo-European (IE) and Khoisan languages, we argue that (1) supports an interpretation of the EPP as a general principle, and not as a property of T. Viewed this way, the SSG is a condition that forces dislocation of arguments as a consequence of a constraint on Case checking.

Typisch männlich? Eindeutig weiblich? : Über die Wechselwirkung zwischen Wirklichkeit und Klischee ; wie Werbesprache Stereotype fortschreibt (2007)

Motschenbacher, Heiko

Webliteralität : Lesen und Schreiben im World Wide Web (2007)

Dieter, Jörg

Worum geht es in dieser Arbeit? Dies ist eine Arbeit über Websites. Darüber, wie sie gelesen und geschrieben werden und wie man das lernen kann. Da es in dieser Arbeit um Lesen, Schreiben und Lernen geht, fließen in sie sowohl Aspekte der Sprachwissenschaft als auch der Sprachdidaktik ein. Was will diese Arbeit? Diese Arbeit hat zwei Ziele, ein sprachwissenschaftliches und ein sprachdidaktisches. In sprachwissenschaftlicher Hinsicht sollen, auf der Grundlage einer gründlichen Analyse seiner Eigenschaften, die Besonderheiten des Lesens und Schreibens im World Wide Web herausgearbeitet werden. Aufbauend auf dieser Analyse sollen im sprachdidaktischen Teil der Arbeit die Kompetenzen ermittelt und in Beziehung zueinander gesetzt werden, die zur Erstellung von Websites notwendig sind. Das so entstehende Kompetenzmodell bildet die Basis für eine zielgerichtete, effektive und evaluierbare Umsetzung der Gestaltung von Websites in der Schule und die Grundlage für weiterführende empirische Arbeiten. Wie ist die Arbeit aufgebaut? Im ersten Kapitel der Arbeit wird die Entwicklung der technischen und strukturellen Formate geschildert, welche die Grundlage des Websiteformats bilden. Darauf aufbauend werden seine wichtigsten Eigenschaften beschrieben. Im zweiten Kapitel wird das Websiteformat von anderen kommunikativen Formaten abgegrenzt und mit Hilfe der besonderen Charakteristika, die es besitzt, sein überwältigender Erfolg erklärt. Im dritten Kapitel wird unter Rückgriff auf Ergebnisse der Leseforschung und empirische Untersuchungen zum Lesen im World Wide Web erarbeitet, welchen Einfluss das Websiteformat auf das Lesen von Texten hat und welche Unterschiede es zum Lesen von Texten in anderen kommunikativen Formaten gibt. Auf dieser Grundlage wird ein Bewertungs- und Analyseraster für die Lesbarkeit von Texten im Websiteformat entwickelt. Im vierten Kapitel wird auf der Grundlage verschiedener Modelle des Schreibprozesses dargestellt, was das Schreiben für das Websiteformat vom Schreiben für andere Formate unterscheidet, was dabei besonders beachtet werden muss und welche Entwicklungen für die Zukunft zu erwarten sind. Dabei werden, unter Berücksichtigung des in Kapitel drei erarbeiteten Bewertungs- und Analyserasters, Hinweise für eine sinnvolle Vorgehensweise bei der Gestaltung von Websites gegeben. Im fünften Kapitel wird vor dem Hintergrund der aktuellen bildungspolitischen Diskussion ein Kompetenzmodell für die Gestaltung von Websites entwickelt, das als Basis für die Festlegung von Bildungsstandards und die Beschreibung der Rahmenbedingungen dient, unter denen diese in der Schule verwirklicht werden können. In einer abschließenden Diskussion werden die wichtigsten Ergebnisse nochmals herausgearbeitet und es wird auf Perspektiven für zukünftige sprachwissenschaftliche und sprachdidaktische Forschungsvorhaben hingewiesen.

Open Access

Linguistik

Refine

Author

Year of publication

Document Type

Language

Has Fulltext

Is part of the Bibliography

Keywords

Institute

211 search hits