We adopt Markert and Nissim's (2005) approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
Multicomponent Tree Adjoining Grammar (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) the MCTAG licenses. This definition gives a better understanding of the formalism, allows a more systematic comparison of different types of MCTAG, and, furthermore, can be exploited for parsing.
In this paper, we investigate the usefulness of a wide range of features for the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) and as soft constraints (where an accumulation of violations of soft constraints makes it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
The work presented here addresses the question of how to determine whether a grammar formalism is powerful enough to describe natural languages. The expressive power of a formalism can be characterized in terms of i) the string languages it generates (weak generative capacity (WGC)) or ii) the tree languages it generates (strong generative capacity (SGC)). The notion of WGC is not enough to determine whether a formalism is adequate for natural languages. We argue that even SGC is problematic since the set of trees a grammar formalism for natural languages should be able to generate is difficult to determine. The concrete syntactic structures assumed for natural languages depend very much on theoretical stipulations, and empirical evidence for syntactic structures is rather hard to obtain. Therefore, for lexicalized formalisms, we propose to consider the ability to generate certain strings together with specific predicate argument dependencies as a criterion for adequacy for natural languages.
This review lists Agama smithii Boulenger 1896 as a synonym of Agama agama (Linnaeus 1758), Agama trachypleura Peters 1982 as a synonym of Acanthocercus phillipsii (Boulenger 1895) and describes for the first time Acanthocercus guentherpetersi n. sp. Without more convincing evidence, Chamaeleon ruspolii Boettger 1893 cannot be accepted as specifically distinct from Chamaeleo dilepis Leach 1819, nor Chamaeleo calcaricarens Böhme 1985 from C. africanus Laurenti 1768. Consequently, 101 species of lizard are currently recognised in Ethiopia, of which some 40% appear to be denizens of the Somali-arid zone. This significant proportion is attributable in part to the importance of the Horn of Africa as a centre for reptilian diversification and endemicity, in part to the fact that this lowland fauna was rather extensively sampled during the 1930s, but also to the conspicuous neglect of lizards in other regions of the country. Mountain and forested habitats are widespread in Ethiopia, so it seems extraordinary to record only five saurian species which are believed to be endemic in such environments. The inference that there are many more still to be discovered has important implications for conservation, because montane forest is known to be among the most threatened of Ethiopian biomes and there is clearly an urgent need for its herpetofauna to be more thoroughly researched and documented.
The medium of (oral) language is mostly disregarded (or overlooked) in contemporary media theories. This "ignoring of language" in media studies is often accompanied by an inadequate transport model of communication, and it converges with an "ignoring of mediality" in mentalistic theories of language. In the present article it will be argued that this misleading opposition of language and media can only be overcome if one already regards oral language, not just written language, as a medium of the human mind. In my argument I draw on Wittgenstein’s conception of language games to show how Wittgenstein’s ideas can help us to clear up the problem of the mediality of language, and also to show to what extent the mentalistic conception of Chomskyan provenance cannot be adequate to the phenomenon of language.
This paper presents an approach to the question of whether it is possible to construct a parser based on ideas from case-based reasoning. Such a parser would employ a partial analysis of the input sentence to select a (nearly) complete syntax tree and then adapt this tree to the input sentence. The experiments performed on German data from the TüBa-D/Z treebank and the KaRoPars partial parser show that a wide range of levels of generality can be reached, depending on which types of information are used to determine the similarity between input sentence and training sentences. The results indicate that it is possible to construct a case-based parser. The optimal setting among those presented here needs to be determined empirically.
This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.
In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, so lexical information can be included in the parsing process in a much more natural way. Machine-learning-based approaches in particular are very successful (cf. e.g.). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percentage points better than the best constituent analysis including grammatical functions.
This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show that there is a big difference in parsing performance between models trained on the Negra treebank and models trained on the TüBa-D/Z treebank. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West-Germanic neighbor language English.
This report explores the question of compatibility between annotation projects including translating annotation formalisms to each other or to common forms. Compatibility issues are crucial for systems that use the results of multiple annotation projects. We hope that this report will begin a concerted effort in the field to track the compatibility of annotation schemes for part of speech tagging, time annotation, treebanking, role labeling and other phenomena.
Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text.
In the past, a divide could be seen between ‘deep’ parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of data-driven (statistical) models with more detailed linguistic interpretation such that the output could be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures have a better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create underspecified representations with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be directly used for semantic interpretation. In this paper, we adopt an approach similar to that of Cahill et al. (2002), using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach using German data. A major focus of our discussion is on the treatment of coordination and other potentially underspecified structures of the dependency data input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001).
Independently of surface order, it encodes abstract syntactic functions that constitute predicate argument structure and other dependency relations such as subject, predicate, adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is captured as a recursive attribute value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline of mapping text to interpretation. In this paper, we report on the conversion from dependency structures to f-structure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley's (2005) conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel's (2006) parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures.
Our evaluation is presented in section 4, and finally, section 5 summarises our results and gives an overview of problems remaining to be solved.
This paper compares two approaches to computational semantics, namely semantic unification in Lexicalized Tree Adjoining Grammars (LTAG) and Lexical Resource Semantics (LRS) in HPSG. There are striking similarities between the frameworks that make them comparable in many respects. We will exemplify the differences and similarities by looking at several phenomena. We will show, first of all, that many intuitions about the mechanisms of semantic computations can be implemented in similar ways in both frameworks. Secondly, we will identify some aspects in which the frameworks intrinsically differ due to more general differences between the approaches to formal grammar adopted by LTAG and HPSG.
Relative quantifier scope in German depends, in contrast to English, very much on word order. The scope possibilities of a quantifier are determined by its surface position, its base position and the type of the quantifier. In this paper we propose a multicomponent analysis for German quantifiers computing the scope of the quantifier, in particular its minimal nuclear scope, depending on the syntactic configuration it occurs in.
In order to understand the specific structures and features of German surnames, the most important facts about their emergence and history should be outlined and, at the same time, compared with Swedish surnames, because there are considerable differences (for further details cf. Nübling 1997a, b). First of all, surnames in Germany emerged rather early, with the first instances occurring in the 11th century in southern Germany; by the 16th century surnames were common all over Germany. Differences are related to geography (from south to north), social class (from the upper to the lower classes) and urban versus rural areas.
As editor of the next iteration of the Köchel Catalogue, I have to deal with the current (sixth) edition’s Appendix C, devoted to "Doubtful and Misattributed Works." My goal is to reduce the potentially vast dimensions of that appendix to only those works for which some connection to Mozart cannot be ruled out. In the decades since 1964, when the current edition of Köchel was published, many of the works listed in Appendix C have been convincingly attributed to other composers. Other works therein can confidently be dismissed as never having had any meaningful connection to Mozart. Yet even after removing the reattributed and trivially misattributed works from the appendix, we are left with a handful of works that may possibly have had something to do with Mozart, even if clear evidence one way or the other remains elusive. One must, of course, be cautious in removing questionable and doubtful works from the catalogue, as the present case-study will illustrate. The work under consideration, catalogued as K6 Anh. C 9.07, is an unaccompanied piece for three or four voices with the text "Venerabilis barba capucinorum." ...
Advantageous fragmentation? : reimagining metropolitan governance and spatial planning in Rhine-Main
(2006)
This paper traces the latest round of debates about appropriate scales and scopes of government and governance in Rhine-Main - an economically highly integrated but politically, territorially and emotionally divided region. We identify a downscaling of political power from the regional to the municipal level, and an upscaling of informal networking and image building to an extended regional scale. These countertrends are signs of a more complex geographical rearrangement in municipal and institutional relations. The inherent contradictions in the rescaling and reimagining of Rhine-Main are evident in the Strategic Vision for Frankfurt/Rhein-Main 2020. Its new conceptualization of Rhine-Main postulates complementary polycentricity as a competitive asset but remains firmly grounded in an institutional territorial logic that contravenes its own economically-driven agenda.
Meadowbird populations in The Netherlands are under great pressure. Recently, predation has been named increasingly often as one of the key factors contributing to the declines. A four-year research project (2001-2005) aimed to collect (as yet mostly nonexistent) data to provide a factual basis for this discussion. A country-wide inventory based on data for wader nests found by volunteers who mark nests to protect them from grazing/mowing indicated that above-average predation losses are found predominantly in the half-open landscapes of the northern and eastern Netherlands, but also locally in the low-lying open grasslands which are the key areas for meadowbirds. Nest predation has increased in recent years, but the same is true for agricultural losses, at least in areas where no nest protection takes place. At a local scale, predation losses vary greatly from area to area and from year to year. Temperature loggers in nests showed that diurnal and nocturnal predators contribute equally to total predation losses of up to 50%, but higher predation losses are mainly caused by nocturnal predators. As many as 10 animal species were identified as nest predators on nests under surveillance with video cameras. Chick survival, investigated using radiotelemetry, was very low. About 60-80% were lost to predation, 5-15% to agricultural activities and 10-15% to all kinds of other losses. At least 15 predator species were implicated, with an apparently larger share taken by birds (notably Buzzard (16%) and Grey Heron (7-18%)) than by mammals, with one exception: stoat (16%). Of the most-discussed predator species, Carrion Crows were remarkably rarely involved in both nest and chick predation, while Red Foxes take a large toll of clutches in some areas, but not in others. Of all losses during the reproductive cycle, about 75% and 60% were due to predation in Lapwing and Black-tailed Godwit, respectively. Predation on chicks by birds had the largest effect on total breeding success, but at the same time elimination of this loss factor alone (if at all possible) would not be sufficient to establish a self-sustaining population. Predation seems to have become a factor of importance in some areas, in combination with already existing other losses. Our findings suggest that solutions to predation problems probably have to be found in locally/regionally targeted, specific action on multiple fronts rather than countrywide measures.
W. Teunissen et al. Osnabrücker Naturwiss. Mitt. 32 2006