OPUS 4 | Search

A constraint-based approach to noun phrase coreference resolution in German newspaper text (2006)

In this paper, we investigate the usefulness of a wide range of features in the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) and as soft constraints (where an accumulation of violations of soft constraints makes it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.

A Thousand and One Nights between Orient and Occident (2006)

Genz, Julia

No other country is as strongly influenced in its political, social and cultural structures by both Western and Eastern mentality as Lebanon, and hardly any other country has such a pivotal function. In this mediating function it can be compared with a literary work that, like hardly any other piece of literature, owes its role in world literature to the co-operation of Orient and Occident. I am thinking of the collection "A Thousand and One Nights", or, to give its original title, "Alf Laila wa-Laila".

Advantageous fragmentation? : reimagining metropolitan governance and spatial planning in Rhine-Main (2006)

Hoyler, Michael ; Freytag, Tim ; Mager, Christoph

This paper traces the latest round of debates about appropriate scales and scopes of government and governance in Rhine-Main - an economically highly integrated but politically, territorially and emotionally divided region. We identify a downscaling of political power from the regional to the municipal level, and an upscaling of informal networking and image building to an extended regional scale. These countertrends are signs of a more complex geographical rearrangement in municipal and institutional relations. The inherent contradictions in the rescaling and reimagining of Rhine-Main are evident in the Strategic Vision for Frankfurt/Rhein-Main 2020. Its new conceptualization of Rhine-Main postulates complementary polycentricity as a competitive asset but remains firmly grounded in an institutional territorial logic that contravenes its own economically-driven agenda.

Annotation compatibility working group report (2006)

This report explores the question of compatibility between annotation projects including translating annotation formalisms to each other or to common forms. Compatibility issues are crucial for systems that use the results of multiple annotation projects. We hope that this report will begin a concerted effort in the field to track the compatibility of annotation schemes for part of speech tagging, time annotation, treebanking, role labeling and other phenomena.

Auxiliary selection and counterfactuality in the history of English and Germanic (2006)

McFadden, Thomas ; Alexiadou, Artemis

The retreat of BE as perfect auxiliary in the history of English is examined. Corpus data are presented showing that the initial advance of HAVE was most closely connected to a restriction against BE in past counterfactuals. Other factors which have been reported to favor the spread of HAVE are either dependent on the counterfactual effect, or significantly weaker in comparison. It is argued that the effect can be traced to the semantics of the BE perfect, which denoted resultativity rather than anteriority proper. Related data from other older Germanic and Romance languages are presented, and finally implications for existing theories of auxiliary selection stemming from the findings presented are discussed.

Bearding Ritter von Köchel in his lair (2006)

Zaslaw, Neal

As editor of the next iteration of the Köchel Catalogue, I have to deal with the current (sixth) edition’s Appendix C, devoted to "Doubtful and Misattributed Works." My goal is to reduce the potentially vast dimensions of that appendix to only those works for which some connection to Mozart cannot be ruled out. In the decades since 1964, when the current edition of Köchel was published, many of the works listed in Appendix C have been convincingly attributed to other composers. Other works therein can confidently be dismissed as never having had any meaningful connection to Mozart. Yet even after removing the reattributed and trivially misattributed works from the appendix, we are left with a handful of works that may possibly have had something to do with Mozart, even if clear evidence one way or the other remains elusive. One must, of course, be cautious in removing questionable and doubtful works from the catalogue, as the present case-study will illustrate. The work under consideration, catalogued as K6 Anh. C 9.07, is an unaccompanied piece for three or four voices with the text "Venerabilis barba capucinorum." ...

Breeding success of Black-tailed Godwits Limosa limosa under 'mosaic management' : an experimental agrienvironment scheme in The Netherlands (2006)

Schekkerman, Hans ; Teunissen, Wolf ; Oosterveld, Ernst

Black-tailed Godwits (Limosa limosa) have been declining for decades in The Netherlands and so far this has not been slowed by conservation measures. A new form of agri-environment scheme was tried out in 2003-2005 at 6 sites where a ‘grassland mosaic’ (200-300 ha) was created by collectives of farmers through a diverse use of fields including postponed and staggered mowing, (early) grazing, creating ‘refuge strips’ during mowing, and active nest protection. We measured breeding success of godwits in each of the experimental sites and nearby, paired controls. Breeding success was higher (0.28 chicks fledged / pair) in mosaics than in controls, but due to lower agricultural nest losses only. Chick survival was 11 % in both mosaics and controls. The amount of late-mown and other grassland suitable for chicks hardly differed between treatments during the fledging period, mainly due to rainfall delaying postponed mowing in all sites. Chick survival was however positively correlated with site variation in the amount of high grass (>18 cm). Breeding success was high enough to compensate for adult mortality (ca. 0.6) in only one mosaic site. Chick survival was lower than in previous Godwit studies, indicating that additional loss factors have increased. Predation (50-80 % of chicks, mostly by birds) is a candidate, but changes in the suitability of late-mown grassland (insect abundance and sward density in grass monocultures) may also play a role. Consequently a higher management investment is needed to achieve a self-sustaining population.

Changes in the fledging success over time with increasing population size in the Northern Lapwing Vanellus vanellus on Wangerooge Island (Lower Saxony, Germany) (2006)

Schröder, Julia ; Heckroth, Mathias ; Clemens, Thomas

In this study, we report the results of a long-term investigation on changes in population size and fledging success of Northern Lapwing on Wangerooge, a German Wadden Sea island. This population has been increasing over a period of 34 years, in contrast to numerous populations in North-western Europe. Reproductive success, however, declines over time and also with population density. The two effects cannot be considered separately due to autocorrelation. However, it is noted that the population on Wangerooge is not sustained by local recruitment only. This outcome is even more alarming as coastal areas and islands are considered to be rare high-quality meadow bird habitats. According to the present results, Wangerooge cannot be considered a source habitat for Northern Lapwings in North-western Germany.

Comparing lexicalized grammar formalisms in an empirically adequate way : the notion of generative attachment capacity (2006)

Kallmeyer, Laura

The work presented here addresses the question of how to determine whether a grammar formalism is powerful enough to describe natural languages. The expressive power of a formalism can be characterized in terms of i) the string languages it generates (weak generative capacity (WGC)) or ii) the tree languages it generates (strong generative capacity (SGC)). The notion of WGC is not enough to determine whether a formalism is adequate for natural languages. We argue that even SGC is problematic, since the set of trees that a grammar formalism for natural languages should be able to generate is difficult to determine. The concrete syntactic structures assumed for natural languages depend very much on theoretical stipulations, and empirical evidence for syntactic structures is rather hard to obtain. Therefore, for lexicalized formalisms, we propose to consider the ability to generate certain strings together with specific predicate argument dependencies as a criterion for adequacy for natural languages.

Constraint-based computational semantics : a comparison between LTAG and LRS (2006)

Kallmeyer, Laura ; Richter, Frank

This paper compares two approaches to computational semantics, namely semantic unification in Lexicalized Tree Adjoining Grammars (LTAG) and Lexical Resource Semantics (LRS) in HPSG. There are striking similarities between the frameworks that make them comparable in many respects. We will exemplify the differences and similarities by looking at several phenomena. We will show, first of all, that many intuitions about the mechanisms of semantic computations can be implemented in similar ways in both frameworks. Secondly, we will identify some aspects in which the frameworks intrinsically differ due to more general differences between the approaches to formal grammar adopted by LTAG and HPSG.

Disagreement dissected : vagueness as a source of ambiguity in nominal (co-)reference (2006)

Versley, Yannick

Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text.

Effects of BPA in snails (2006)

Dietrich, Daniel R. ; O'Brien, Evelyn ; Hoffmann, Sebastian ; Balaguer, Patrique ; Nicolas, Jean-Claude ; Seinen, Willem ; Depledge, Michael H.

It is an ethical requirement that new findings be presented in light of and in conjunction with a balanced evaluation of the current knowledge and published literature. We believe that Oehlmann et al. (2006) violated this general principle in several ways. For example, the authors inferred that prosobranch snails have a functional estrogen receptor and therefore a much higher sensitivity to estrogens and endocrine-disrupting compounds (EDCs) than other species previously reported in the literature. We found several other problems in their article...

From surface dependencies towards deeper semantic representations [Semantic representations] (2006)

Versley, Yannick ; Zinsmeister, Heike

In the past, a divide could be seen between ‘deep’ parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of data-driven (statistical) models with more detailed linguistic interpretation such that the output could be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures have a better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create underspecified representations with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be directly used for semantic interpretation. In this paper, we adopt a similar approach to Cahill et al. (2002) using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach using German data. A major focus of our discussion is on the treatment of coordination and other potentially underspecified structures of the dependency data input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001).
Independently of surface order, it encodes abstract syntactic functions that constitute predicate argument structure and other dependency relations such as subject, predicate, adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is captured as a recursive attribute value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline of mapping text to interpretation. In this paper, we report on the conversion from dependency structures to f-structure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley's (2005) conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel's (2006) parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures.
Our evaluation is presented in section 4, and finally, section 5 summarises our results and gives an overview of problems remaining to be solved.

Generating and visualizing a soccer knowledge base (2006)

Buitelaar, Paul ; Eigner, Thomas ; Gulrajani, Greg ; Schutz, Alexander ; Siegel, Melanie ; Weber, Nicolas ; Cimiano, Philipp ; Ladwig, Günter ; Mantel, Matthias ; Zhu, Honggang

This demo abstract describes the SmartWeb Ontology-based Information Extraction System (SOBIE). A key feature of SOBIE is that all information is extracted and stored with respect to the SmartWeb ontology. In this way, other components of the systems, which use the same ontology, can access this information in a straightforward way. We will show how information extracted by SOBIE is visualized within its original context, thus enhancing the browsing experience of the end user.

Is it really that difficult to parse German? (2006)

Kübler, Sandra ; Hinrichs, Erhard ; Maier, Wolfgang

This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show that there is a big difference in parsing performance between models trained on the Negra treebank and models trained on the TüBa-D/Z treebank. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West-Germanic neighbor language English.

JACY - a grammar for annotating syntax, semantics and pragmatics of written and spoken Japanese for NLP application purposes (2006)

Siegel, Melanie

In this text, we describe the development of a broad coverage grammar for Japanese that has been built for and used in different application contexts. The grammar is based on work done in the Verbmobil project (Siegel 2000) on machine translation of spoken dialogues in the domain of travel planning. The second application for JACY was the automatic email response task. Grammar development was described in Oepen et al. (2002a). Third, it was applied to the task of understanding material on mobile phones available on the internet, while embedded in the project DeepThought (Callmeier et al. 2004, Uszkoreit et al. 2004). Currently, it is being used for treebanking and ontology extraction from dictionary definition sentences by the Japanese company NTT (Bond et al. 2004).

Language and mediality : on the medial status of "everyday language" (2006)

Schneider, Jan Georg

The medium of (oral) language is mostly disregarded (or overlooked) in contemporary media theories. This "ignoring of language" in media studies is often accompanied by an inadequate transport model of communication, and it converges with an "ignoring of mediality" in mentalistic theories of language. In the present article it will be argued that this misleading opposition of language and media can only be overcome if one already regards oral language, not just written language, as a medium of the human mind. In my argumentation I fall back on Wittgenstein’s conception of language games to try to show how Wittgenstein’s ideas can help us to clear up the problem of the mediality of language and also to show to what extent the mentalistic conception of Chomskyan provenance cannot be adequate to the phenomenon of language.

Lizards of Ethiopia (Reptilia Sauria) : an annotated checklist, bibliography, gazetteer and identification key (2006)

Largen, Malcolm ; Spawls, Stephen

This review lists Agama smithii Boulenger 1896 as a synonym of Agama agama (Linnaeus 1758), Agama trachypleura Peters 1982 as a synonym of Acanthocercus phillipsii (Boulenger 1895) and describes for the first time Acanthocercus guentherpetersi n. sp. Without more convincing evidence, Chamaeleon ruspolii Boettger 1893 cannot be accepted as specifically distinct from Chamaeleo dilepis Leach 1819, nor Chamaeleo calcaricarens Böhme 1985 from C. africanus Laurenti 1768. Consequently, 101 species of lizard are currently recognised in Ethiopia, of which some 40% appear to be denizens of the Somali-arid zone. This significant proportion is attributable in part to the importance of the Horn of Africa as a centre for reptilian diversification and endemicity, in part to the fact that this lowland fauna was rather extensively sampled during the 1930s, but also to the conspicuous neglect of lizards in other regions of the country. Mountain and forested habitats are widespread in Ethiopia, so it seems extraordinary to record only five saurian species which are believed to be endemic in such environments. The inference that there are many more still to be discovered has important implications for conservation, because montane forest is known to be among the most threatened of Ethiopian biomes and there is clearly an urgent need for its herpetofauna to be more thoroughly researched and documented.

New perspectives on Müller, Meyer, Schmidt : computer-based surname geography and the German Surname Atlas Project (2006)

Nübling, Damaris ; Kunze, Konrad

In order to understand the specific structures and features of German surnames, the most important facts about their emergence and history should be outlined and, at the same time, compared with Swedish surnames, because there are considerable differences (for further details cf. Nübling 1997a, b). First of all, surnames in Germany emerged rather early, with the first instances occurring in the 11th century in southern Germany; by the 16th century surnames were common all over Germany. Differences are related to geography (from south to north), social class (from the upper to the lower classes) and urban versus rural areas.

Ontology-based Information Extraction with SOBA (2006)

Buitelaar, Paul ; Cimiano, Philipp ; Racioppa, Stefania ; Siegel, Melanie

In this paper we describe SOBA, a sub-component of the SmartWeb multi-modal dialog system. SOBA is a component for ontology-based information extraction from soccer web pages for automatic population of a knowledge base that can be used for domain-specific question answering. SOBA realizes a tight connection between the ontology, knowledge base and the information extraction component. The originality of SOBA is in the fact that it extracts information from heterogeneous sources such as tabular structures, text and image captions in a semantically integrated way. In particular, it stores extracted information in a knowledge base, and in turn uses the knowledge base to interpret and link newly extracted information with respect to already existing entities.

Predation on meadowbirds in The Netherlands : results of a four-year study (2006)

Teunissen, Wolf ; Schekkerman, Hans ; Willems, Frank

Meadowbird populations in The Netherlands are under great pressure. Recently, predation has been named increasingly often as one of the key factors contributing to the declines. A four-year research project (2001-2005) aimed to collect (as yet mostly non-existent) data to provide a factual basis for this discussion. A country-wide inventory based on data for wader nests found by volunteers who mark nests for their protection from grazing/mowing indicated that above-average predation losses are found predominantly in the half-open landscapes of northern and eastern Netherlands, but also locally in the low-lying open grasslands which are the key areas for meadowbirds. Nest predation has increased in recent years, but the same is true for agricultural losses, at least in areas where no nest protection takes place. At a local scale, predation losses vary greatly from area to area and from year to year. Temperature loggers in nests showed that diurnal and nocturnal predators contribute equally to total predation losses of up to 50%, but higher predation losses are mainly caused by nocturnal predators. As many as 10 animal species were identified as nest predators on nests under surveillance with video cameras. Chick survival, investigated using radiotelemetry, was very low. About 60-80% were lost to predation, 5-15% to agricultural activities and 10-15% to all kinds of other losses. At least 15 predator species were implicated, with an apparently larger share taken by birds (notably Buzzard (16%) and Grey Heron (7-18%)) than mammals, with one exception: stoat (16%). Of the most-discussed predator species, Carrion Crows were remarkably rarely involved in both nest and chick predation, while Red Foxes take a large toll of clutches in some areas, but not in others. Of all losses during the reproductive cycle, about 75% and 60% were due to predation in Lapwing and Black-tailed Godwit respectively.
Predation on chicks by birds had the largest effect on total breeding success, but at the same time elimination of this loss factor (if at all possible) alone would not be sufficient to establish a self-sustaining population. Predation seems to have become a factor of importance in some areas, in combination with already existing other losses. Our findings suggest that solutions to predation problems probably have to be found in locally/regionally targeted, specific action on multiple fronts rather than countrywide measures. [W. Teunissen et al., Osnabrücker Naturwiss. Mitt. 32, 2006]

Quantifier scope in German : an MCTAG analysis (2006)

Kallmeyer, Laura ; Romero, Maribel

Relative quantifier scope in German depends, in contrast to English, very much on word order. The scope possibilities of a quantifier are determined by its surface position, its base position and the type of the quantifier. In this paper we propose a multicomponent analysis for German quantifiers computing the scope of the quantifier, in particular its minimal nuclear scope, depending on the syntactic configuration it occurs in.

Schweig und tanze! : Elektra, by Hugo von Hofmannsthal and Richard Strauss (2006)

Sheinberg, Esti

The last scene of Richard Strauss’ Elektra builds tension towards Elektra’s dance of victory and joy, leading to her ecstatic abdication in death. Indeed, the opera’s librettist Hugo von Hofmannsthal describes this scene in one of his sketches as Elektra’s "dissolution of self."

The importance of early breeding in Black-tailed Godwits (Limosa limosa) (2006)

Schröder, Julia ; Hooijmeijer, Jos ; Both, Christiaan ; Piersma, Theunis

Human impacts on the landscape have increased the penalties for Black-tailed Godwits laying their eggs too late, especially in the very intensive agricultural landscapes of The Netherlands. Thus, godwits have experienced a dramatic change in their fitness landscape, because the advance in mowing date has made late clutches worthless, destroying either eggs or chicks. To determine the driving forces of the recent population decline, we study the individual variation in timing of breeding with respect to reproductive success in a population unaffected by mowing. Our results show that even in a low-intensity agricultural area it is very important for godwits to breed early in the season.

The superstable marker as an indicator of categorial weakness? (2006)

Dammel, Antje ; Nübling, Damaris

In this article we examine and "exapt" Wurzel's concept of superstable markers in an innovative manner. We develop an extended view of superstability through a critical discussion of Wurzel's original definition and the status of marker-superstability versus allomorphy in Natural Morphology: As we understand it, superstability is - above and beyond a step towards uniformity - mainly a symptom of the weakening of the category affected (cf. 1., 2. and 4.). This view is exemplified in four short case studies on superstability in different grammatical categories of four Germanic languages: genitive case in Mainland Scandinavian and English (3.1), plural formation in Dutch (3.2), second person singular ending -st in German (3.3), and ablaut generalisation in Luxembourgish (3.4).

Towards case-based parsing : are chunks reliable indicators for syntax trees? (2006)

Kübler, Sandra

This paper presents an approach to the question of whether it is possible to construct a parser based on ideas from case-based reasoning. Such a parser would employ a partial analysis of the input sentence to select a (nearly) complete syntax tree and then adapt this tree to the input sentence. The experiments performed on German data from the TüBa-D/Z treebank and the KaRoPars partial parser show that a wide range of levels of generality can be reached, depending on which types of information are used to determine the similarity between input sentence and training sentences. The results are such that it is possible to construct a case-based parser. The optimal setting out of those presented here needs to be determined empirically.

Verb derivation in modern Greek inside alternation classes (2006)

Charitōnidēs, Charitōn Ch.

In this paper I present five alternations of the verb system of Modern Greek, which are recurrently mapped on the syntactic frame NPi__NP. The actual claim is that only the participation in alternations and/or the allocation to an alternation variant can reliably determine the relation between a verb derivative and its base. In the second part, the conceptual structures and semantic/situational fields of a large number of “-ízo” derivatives appearing inside alternation classes are presented. The restricted character of the conceptual and situational preferences inside alternation classes suggests the dominant character of the alternations component.

What linguists always wanted to know about German and did not know how to estimate (2006)

Hinrichs, Erhard ; Kübler, Sandra

This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Why is German dependency parsing more reliable than constituent parsing? (2006)

Kübler, Sandra ; Prokic, Jelena

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, thus lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches are especially successful (cf. e.g.). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percent points better than the best constituent analysis including grammatical functions.

29 search hits