The taxonomy, diversity, and distribution of the aquatic insect order Trichoptera, caddisflies, are reviewed. The order is among the most important and diverse of all aquatic taxa. Larvae are vital participants in aquatic food webs and their presence and relative abundance are used in the biological assessment and monitoring of water quality. The species described by Linnaeus are listed. The morphology of all life history stages (adults, larvae, and pupae) is diagnosed and major features of the anatomy are illustrated. Major components of life history and biology are summarized. A discussion of phylogenetic studies within the order is presented, including higher classification of the suborders and superfamilies, based on recent literature. Synopses of each of 45 families are presented, including the taxonomic history of the family, a list of all known genera in each family, their general distribution and relative species diversity, and a short overview of family-level biological features. The order contains 600 genera, and approximately 13,000 species.
This study analyzes storyline structure in three Hausa home videos: Mai Kudi (The Rich Man), Sanafahna (with time truth shall dawn) and Albashi (Salary). The study measures storyline structure in these films against a Hollywood film industry model of story writing, “the Hero's Journey”. It uses narrative analysis as its analytical tool, and narrative theory as its framework. After analyzing these videos, the study found that the major elements of storyline structure in Vogler's model formed the framework of the storyline structure in the Hausa home videos analyzed. However, in spite of the preponderance of these elements within the storyline structure, there are significant variations from Vogler's model. Specifically, Vogler's model has some twelve stages spread across the universal structure of storytelling, i.e. beginning, middle and end. Few of these stages were found to exist in Hausa narrative structure, perhaps due to cultural differences between Western, Indian and Hausa cultures. The study therefore recommends that screenwriters and producers be aware of the existence of standard models of scriptwriting. It also recommends more training for script writers in the Hausa film industry.
Another accruing and evolving collection holding published university documents (documents made publicly available) and non-official institutional records, plus 'grey literature' and ephemera relating to UB and its forerunner institutions. It includes documents harvested from the UB website. This is an artificially created collection. Some of these records may also exist in the homogeneous institutional archive collections and in the BDSC.
The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.
Recent approaches to Word Sense Disambiguation (WSD) generally fall into two classes: (1) information-intensive approaches and (2) information-poor approaches. Our hypothesis is that for memory-based learning (MBL), a reduced amount of data is more beneficial than the full range of features used in the past. Our experiments show that MBL combined with a restricted set of features and a feature selection method that minimizes the feature set leads to competitive results, outperforming all systems that participated in the SENSEVAL-3 competition on the Romanian data. Thus, with this specific method, a tightly controlled feature set improves the accuracy of the classifier, reaching 74.0% in the fine-grained and 78.7% in the coarse-grained evaluation.
Prepositional phrase (PP) attachment is one of the major sources of errors in traditional statistical parsers. The reason for that lies in the type of information necessary for resolving structural ambiguities. For parsing, it is assumed that distributional information of parts-of-speech and phrases is sufficient for disambiguation. For PP attachment, in contrast, lexical information is needed. The problem of PP attachment has sparked much interest ever since Hindle and Rooth (1993) formulated the problem in a way that can be easily handled by machine learning approaches: In their approach, PP attachment is reduced to the decision between noun and verb attachment; and the relevant information is reduced to the two possible attachment sites (the noun and the verb) and the preposition of the PP. Brill and Resnik (1994) extended the feature set to the now standard 4-tuple also containing the noun inside the PP. Among many publications on the problem of PP attachment, Volk (2001; 2002) describes the only system for German. He uses a combination of supervised and unsupervised methods. The supervised method is based on the back-off model by Collins and Brooks (1995); the unsupervised part consists of heuristics such as “If there is a support verb construction present, choose verb attachment”. Volk trains his back-off model on the Negra treebank (Skut et al., 1998) and extracts frequencies for the heuristics from the “Computerzeitung”. The latter also serves as test data set. Consequently, it is difficult to compare Volk’s results to other results for German, including the results presented here, since he not only uses a combination of supervised and unsupervised learning but also performs domain adaptation. Most of the researchers working on PP attachment seem to be satisfied with a PP attachment system; we have found hardly any work on integrating the results of such approaches into actual parsers. The only exceptions are Mehl et al. (1998) and Foth and Menzel (2006), both working with German data. Mehl et al. report a slight improvement of PP attachment from 475 correct PPs out of 681 PPs for the original parser to 481 PPs. Foth and Menzel report an improvement of overall accuracy from 90.7% to 92.2%. Both integrate statistical attachment preferences into a parser. First, we will investigate whether dependency parsing, which generally uses lexical information, shows the same performance on PP attachment as an independent PP attachment classifier does. Then we will investigate an approach that allows the integration of PP attachment information into the output of a parser without having to modify the parser: The results of an independent PP attachment classifier are integrated into the parse of a dependency parser for German in a postprocessing step.
We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions which have the same lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.
This paper presents an LTAG analysis of reflexives like himself and reciprocals like each other. These items need to find a c-commanding antecedent from which they retrieve (part of) their own denotation and with which they syntactically agree. The relation between anaphoric item and antecedent must satisfy the following important locality conditions (Chomsky (1981)).
Intrinsic motivation, the causal mechanism for spontaneous exploration and curiosity, is a central concept in developmental psychology. It has been argued to be a crucial mechanism for open-ended cognitive development in humans, and as such has attracted growing interest from developmental roboticists in recent years. The goal of this paper is threefold. First, it provides a synthesis of the different approaches to intrinsic motivation in psychology. Second, by interpreting these approaches in a computational reinforcement learning framework, we argue that they are not operational and even sometimes inconsistent. Third, we set the ground for a systematic operational study of intrinsic motivation by presenting a formal typology of possible computational approaches. This typology is partly based on existing computational models, but also presents new ways of conceptualizing intrinsic motivation. We argue that this kind of computational typology might be useful for opening new avenues for research both in psychology and developmental robotics.
Children […] growing up with highly inflected languages such as Modern Greek will frequently hear different grammatical forms of a given lexeme used in different grammatical and semantic-pragmatic contexts. In spite of the fact that the Greek noun is not as highly inflected as the verb, acquisition of nominal inflection of this inflecting-fusional language is quite complex, comprising the three categories of case, number, and gender. As is usual in this type of language, the formation of case-number forms obeys different patterns that apply to largely arbitrary classes of nominal lexemes partially based on gender. Further, frequency of the occurrence of the three gender classes and case-number forms of nouns greatly differs in spoken Greek, regarding both the types and tokens. […] [A] child learning an inflecting-fusional language like Greek must construct different inflectional patterns depending not only on parts of speech but also on subclasses within a given part of speech, such as gender classes of nouns and inflectional classes within or (exceptionally) across genders. It is therefore to be expected that the early development of case and number distinctions will apply to specific nouns and subclasses of nouns rather than the totality of Greek nouns. The two main theoretical approaches of morphological development that will be discussed in the present paper are the usage-based approach and the pre- and protomorphology approach.
This report arises from research carried out in Iganga and Namutumba districts in late 2006/early 2007 by the Cultural Research Centre (CRC), based in Jinja. Our research focus was to gauge the impact of using Lusoga as a medium of instruction (since 2005 in "pilot" lower primary classes) within and outside the classroom. This initiative was in response to a new set of circumstances in the education sector in Uganda, especially the introduction by Government of teaching in local languages in lower primary countrywide from February 2007. This followed an experimental period, in selected pilot districts, including Iganga, where fifteen pilot schools had been chosen: all these became part of this study.
The special issue of The Linguistic Review on "The Role of Linguistics in Cognitive Science" presents a variety of viewpoints that complement or contrast with the perspective offered in Foundations of Language (Jackendoff 2002a). The present article is a response to the special issue. It discusses what it would mean to integrate linguistics into cognitive science, then shows how the parallel architecture proposed in Foundations seeks to accomplish this goal by altering certain fundamental assumptions of generative grammar. It defends this approach against criticisms both from mainstream generative grammar and from a variety of broader attacks on the generative enterprise, and it reflects on the nature of Universal Grammar. It then shows how the parallel architecture applies directly to processing and defends this construal against various critiques. Finally, it contrasts views in the special issue with that of Foundations with respect to what is unique about language among cognitive capacities, and it conjectures about the course of the evolution of the language faculty.
The impact of naval sonar on beaked whales is of increasing concern. In recent years the presence of gas and fat embolism consistent with decompression sickness (DCS) has been reported through postmortem analyses on beaked whales that stranded in connection with naval sonar exercises. In the present study, we use basic principles of diving physiology to model nitrogen tension and bubble growth in several tissue compartments during normal diving behavior and for several hypothetical dive profiles to assess the risk of DCS. Assuming that normal diving does not cause nitrogen tensions in excess of those shown to be safe for odontocetes, the modeling indicates that repetitive shallow dives, perhaps as a consequence of an extended avoidance reaction to sonar sound, can indeed pose a risk for DCS and that this risk should increase with the duration of the response. If the model is correct, then limiting the duration of sonar exposure to minimize the duration of any avoidance reaction has the potential to reduce the risk of DCS.
Pope Benedict XVI’s Regensburg lecture has been exposed by some learned voices of 'the Muslim world' as alluding, by the means of one particular quotation, to age-old stereotypes about Islam being an essentially violent creed in which moderation through reason has no legitimate place, and of representing Muhammad as an evil and inhuman man who preached that Islam should be spread by the sword. While none of these presumably 'Muslim' voices deny that the Pope has the right to express his opinions, even when they are plainly wrong in the face of historic facts that show how Islam and Christianity were spread (or were made to spread) across the world, he is criticised for a host of omissions in terms of intellectual honesty and factual accuracy. These omissions, it is argued here, cast an unfortunate light on the compatibility of scientific and religious rationality much advocated by the Pope in his 12 September 2006 lecture. This flagrant 'performative contradiction' (Habermas) leaves room for speculation about the true aim of the speech. Is Benedict XVI's appeal to theology as a legitimate academic discipline a credible attempt to explicate Roman Catholicism's rightful place in a modern world governed by liberal democracy and ethical-political pluralism, or is it a reflection of a move to restore the age-old, intolerant, anti-scientific, and anti-democratic legacy of the pre-Vatican II Catholic Church?
In this paper, we will argue for a novel analysis of the auxiliary alternation in Early English, its development and subsequent loss which has broader consequences for the way that auxiliary selection is looked at cross-linguistically. We will present evidence that the choice of auxiliaries accompanying past participles in Early English differed in several significant respects from that in the familiar modern European languages. Specifically, while the construction with have became a full-fledged perfect by some time in the ME period, that with be was actually a stative resultative, which it remained until it was lost. We will show that this accounts for some otherwise surprising restrictions on the distribution of BE in Early English and allows a better understanding of the spread of HAVE through late ME and EModE. Perhaps more importantly, the Early English facts also provide insight into the genesis of the kind of auxiliary selection found in German, Dutch and Italian. Our analysis of them furthermore suggests a promising strategy for explaining cross-linguistic variation in auxiliary selection in terms of variation in the syntactico-semantic structure of the perfect. In this introductory section, we will first provide some background on the historical situation we will be discussing, then we will lay out the main claims for which we will be arguing in the paper.
In this paper, we introduce an extension of the XMG system (eXtensible Meta-Grammar) in order to allow for the description of Multi-Component Tree Adjoining Grammars. In particular, we introduce the XMG formalism and its implementation, and show how the latter makes it possible to extend the system relatively easily to different target formalisms, thus opening the way towards multi-formalism.
The following new species are described from the Maghreb: Tapinocyba algirica n. sp. and Walckenaeria heimbergi n. sp. The unknown male of Minicia elegans and the unknown females of Alioranus pauper, Cherserigone gracilipes and Entelecara truncatifrons are described. Tmeticus hipponense is transferred to the genus Gongylidiellum and Hybocoptus ericicola is removed from synonymy with H. corrugis and revalidated. The Maghrebian species of the genera Alioranus, Brachycerasphora, Cherserigone, Didectoprocnemis, Entelecara, Eperigone, Erigone, Gnathonarium, Gonatium, Gongylidiellum, Hybocoptus, Lessertia, Maso, Micrargus, Microctenonyx, Minicia, Monocephalus, Nematogmus, Ostearius, Prinerigone, Styloctetor, Tapinocyba, Trichoncoides and Trichoncus are all revised. As a final paper in a series on the Linyphiidae of the Maghreb, all the remaining genera are reviewed. A total of 169 species of Linyphiidae has currently been recorded in the Maghreb.
In the area of the Modern Greek verb, phenomena which consistently appear are headmarking, many potential slots before and/or after the verb root, noun and adverb incorporation, addition of adverbial elements by means of affixes, a large inventory of bound morphemes, verbal words as minimal sentences, etc. These features relate Modern Greek to polysynthesis. The main bulk of this paper is dedicated to the comparison of affixal and incorporation patterns between Modern Greek and the polysynthetic languages Abkhaz, Cayuga, Chukchi, Mohawk, and Nahuatl. Ultimately, a typological outlook for Modern Greek is proposed.
We adopt Markert and Nissim's (2005) approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
Multicomponent Tree Adjoining Grammar (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. This way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. In this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees (in the underlying TAG) the MCTAG licenses. This definition gives a better understanding of the formalism, it allows a more systematic comparison of different types of MCTAG, and, furthermore, it can be exploited for parsing.
In this paper, we investigate the usefulness of a wide range of features in the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) and as soft constraints (where an accumulation of violations of soft constraints will make it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
The work presented here addresses the question of how to determine whether a grammar formalism is powerful enough to describe natural languages. The expressive power of a formalism can be characterized in terms of i) the string languages it generates (weak generative capacity (WGC)) or ii) the tree languages it generates (strong generative capacity (SGC)). The notion of WGC is not enough to determine whether a formalism is adequate for natural languages. We argue that even SGC is problematic since the set of trees a grammar formalism for natural languages should be able to generate is difficult to determine. The concrete syntactic structures assumed for natural languages depend very much on theoretical stipulations, and empirical evidence for syntactic structures is rather hard to obtain. Therefore, for lexicalized formalisms, we propose to consider the ability to generate certain strings together with specific predicate argument dependencies as a criterion for adequacy for natural languages.
This review lists Agama smithii Boulenger 1896 as a synonym of Agama agama (Linnaeus 1758), Agama trachypleura Peters 1982 as a synonym of Acanthocercus phillipsii (Boulenger 1895) and describes for the first time Acanthocercus guentherpetersi n. sp. Without more convincing evidence, Chamaeleon ruspolii Boettger 1893 cannot be accepted as specifically distinct from Chamaeleo dilepis Leach 1819, nor Chamaeleo calcaricarens Böhme 1985 from C. africanus Laurenti 1768. Consequently, 101 species of lizard are currently recognised in Ethiopia, of which some 40% appear to be denizens of the Somali-arid zone. This significant proportion is attributable in part to the importance of the Horn of Africa as a centre for reptilian diversification and endemicity, in part to the fact that this lowland fauna was rather extensively sampled during the 1930s, but also to the conspicuous neglect of lizards in other regions of the country. Mountain and forested habitats are widespread in Ethiopia, so it seems extraordinary to record only five saurian species which are believed to be endemic in such environments. The inference that there are many more still to be discovered has important implications for conservation, because montane forest is known to be among the most threatened of Ethiopian biomes and there is clearly an urgent need for its herpetofauna to be more thoroughly researched and documented.
The medium of (oral) language is mostly disregarded (or overlooked) in contemporary media theories. This "ignoring of language" in media studies is often accompanied by an inadequate transport model of communication, and it converges with an "ignoring of mediality" in mentalistic theories of language. In the present article it will be argued that this misleading opposition of language and media can only be overcome if one already regards oral language, not just written language, as a medium of the human mind. In my argumentation I fall back on Wittgenstein’s conception of language games to try to show how Wittgenstein’s ideas can help us to clear up the problem of the mediality of language and also to show to what extent the mentalistic conception of Chomskyan provenance cannot be adequate to the phenomenon of language.
This paper presents an approach to the question whether it is possible to construct a parser based on ideas from case-based reasoning. Such a parser would employ a partial analysis of the input sentence to select a (nearly) complete syntax tree and then adapt this tree to the input sentence. The experiments performed on German data from the TüBa-D/Z treebank and the KaRoPars partial parser show that a wide range of levels of generality can be reached, depending on which types of information are used to determine the similarity between input sentence and training sentences. The results are such that it is possible to construct a case-based parser. The optimal setting out of those presented here needs to be determined empirically.
This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.
In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, thus lexical information can be included in the parsing process in a much more natural way. Machine learning based approaches in particular are very successful (cf. e.g.). The results achieved by these dependency parsers are very competitive although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank has been converted to dependencies. For this version, Nivre et al. report an accuracy rate of 86.3%, as compared to an F-score of 92.1 for Charniak's parser. The Penn Chinese Treebank is also available in a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version and 79.8% accuracy for the dependency version. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 75.3, depending on the treebank, NEGRA or TüBa-D/Z. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 83.4%, i.e. 12 percentage points better than the best constituent analysis including grammatical functions.
This paper presents a comparative study of probabilistic treebank parsing of German, using the Negra and TüBa-D/Z treebanks. Experiments with the Stanford parser, which uses a factored PCFG and dependency model, show that, contrary to previous claims for other parsers, lexicalization of PCFG models boosts parsing performance for both treebanks. The experiments also show that there is a big difference in parsing performance between models trained on the Negra treebank and models trained on the TüBa-D/Z treebank. Parser performance for the models trained on TüBa-D/Z is comparable to parsing results for English with the Stanford parser when trained on the Penn treebank. This comparison at least suggests that German is not harder to parse than its West-Germanic neighbor language English.
This report explores the question of compatibility between annotation projects including translating annotation formalisms to each other or to common forms. Compatibility issues are crucial for systems that use the results of multiple annotation projects. We hope that this report will begin a concerted effort in the field to track the compatibility of annotation schemes for part of speech tagging, time annotation, treebanking, role labeling and other phenomena.
Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text.
In the past, a divide could be seen between 'deep' parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of data-driven (statistical) models with more detailed linguistic interpretation such that the output could be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures have a better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create underspecified representations with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be directly used for semantic interpretation. In this paper, we adopt a similar approach to Cahill et al. (2002) using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach using German data. A major focus of our discussion is on the treatment of coordination and other potentially underspecified structures of the dependency data input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001).
Independently of surface order, it encodes abstract syntactic functions that constitute predicate argument structure and other dependency relations such as subject, predicate, adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is captured as a recursive attribute value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline of mapping text to interpretation. In this paper, we report on the conversion from dependency structures to f-structure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley's (2005) conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel's (2006) parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures.
Our evaluation is presented in section 4, and finally, section 5 summarises our results and gives an overview of problems remaining to be solved.
This paper compares two approaches to computational semantics, namely semantic unification in Lexicalized Tree Adjoining Grammars (LTAG) and Lexical Resource Semantics (LRS) in HPSG. There are striking similarities between the frameworks that make them comparable in many respects. We will exemplify the differences and similarities by looking at several phenomena. We will show, first of all, that many intuitions about the mechanisms of semantic computations can be implemented in similar ways in both frameworks. Secondly, we will identify some aspects in which the frameworks intrinsically differ due to more general differences between the approaches to formal grammar adopted by LTAG and HPSG.
Relative quantifier scope in German depends, in contrast to English, very much on word order. The scope possibilities of a quantifier are determined by its surface position, its base position and the type of the quantifier. In this paper we propose a multicomponent analysis for German quantifiers computing the scope of the quantifier, in particular its minimal nuclear scope, depending on the syntactic configuration it occurs in.
In order to understand the specific structures and features of German surnames, the most important facts about their emergence and history should be outlined and, at the same time, compared with Swedish surnames, because there are considerable differences (for further details cf. Nübling 1997 a, b). First of all, surnames in Germany emerged rather early, with the first instances occurring in the 11th century in southern Germany; by the 16th century surnames were common all over Germany. Differences are related to geography (from south to north), social class (from the upper to the lower classes) and urban versus rural areas.
As editor of the next iteration of the Köchel Catalogue, I have to deal with the current (sixth) edition’s Appendix C, devoted to "Doubtful and Misattributed Works." My goal is to reduce the potentially vast dimensions of that appendix to only those works for which some connection to Mozart cannot be ruled out. In the decades since 1964, when the current edition of Köchel was published, many of the works listed in Appendix C have been convincingly attributed to other composers. Other works therein can confidently be dismissed as never having had any meaningful connection to Mozart. Yet even after removing the reattributed and trivially misattributed works from the appendix, we are left with a handful of works that may possibly have had something to do with Mozart, even if clear evidence one way or the other remains elusive. One must, of course, be cautious in removing questionable and doubtful works from the catalogue, as the present case-study will illustrate. The work under consideration, catalogued as K6 Anh. C 9.07, is an unaccompanied piece for three or four voices with the text "Venerabilis barba capucinorum." ...
Advantageous fragmentation? : reimagining metropolitan governance and spatial planning in Rhine-Main
(2006)
This paper traces the latest round of debates about appropriate scales and scopes of government and governance in Rhine-Main - an economically highly integrated but politically, territorially and emotionally divided region. We identify a downscaling of political power from the regional to the municipal level, and an upscaling of informal networking and image building to an extended regional scale. These countertrends are signs of a more complex geographical rearrangement in municipal and institutional relations. The inherent contradictions in the rescaling and reimagining of Rhine-Main are evident in the Strategic Vision for Frankfurt/Rhein-Main 2020. Its new conceptualization of Rhine-Main postulates complementary polycentricity as a competitive asset but remains firmly grounded in an institutional territorial logic that contravenes its own economically-driven agenda.
Meadowbird populations in The Netherlands are under great pressure. Recently, predation has been named increasingly often as one of the key factors contributing to the declines. A four-year research project (2001-2005) aimed to collect (as yet mostly nonexistent) data to provide a factual basis for this discussion. A country-wide inventory based on data for wader nests found by volunteers who mark nests for their protection from grazing/mowing indicated that above-average predation losses are found predominantly in the half-open landscapes of northern and eastern Netherlands, but also locally in the low-lying open grasslands which are the key areas for meadowbirds. Nest predation has increased in recent years, but the same is true for agricultural losses, at least in areas where no nest protection takes place. At a local scale, predation losses vary greatly from area to area and from year to year. Temperature loggers in nests showed that diurnal and nocturnal predators contribute equally to total predation losses up to 50%, but higher predation losses are mainly caused by nocturnal predators. As many as 10 animal species were identified as nest predators on nests under surveillance with video cameras. Chick survival, investigated using radiotelemetry, was very low. About 60-80% were lost to predation, 5-15% to agricultural activities and 10-15% to all kinds of other losses. At least 15 predator species were implicated, with an apparently larger share taken by birds (notably Buzzard (16%) and Grey Heron (7-18%)) than mammals, with one exception: stoat (16%). Of the most-discussed predator species, Carrion Crows were remarkably rarely involved in both nest and chick predation, while Red Foxes take a large toll of clutches in some areas, but not in others. Of all losses during the reproductive cycle about 75% and 60% was due to predation in Lapwing and Black-tailed Godwit respectively. Predation on chicks by birds had the largest effect on total breeding success, but at the same time elimination of this loss factor (if at all possible) alone would not be sufficient to establish a self-sustaining population. Predation seems to have become a factor of importance in some areas, in combination with already existing other losses. Our findings suggest that solutions to predation problems probably have to be found in locally/regionally targeted, specific action on multiple fronts rather than countrywide measures.
W. Teunissen et al. Osnabrücker Naturwiss. Mitt. 32 2006
Black-tailed Godwits (Limosa limosa) have been declining for decades in The Netherlands and so far this has not been slowed by conservation measures. A new form of agri-environment scheme was tried out in 2003-2005 at 6 sites where a ‘grassland mosaic’ (200-300 ha) was created by collectives of farmers through a diverse use of fields including postponed and staggered mowing, (early) grazing, creating ‘refuge strips’ during mowing, and active nest protection. We measured breeding success of godwits in each of the experimental sites and nearby, paired controls. Breeding success was higher (0.28 chicks fledged / pair) in mosaics than in controls, but due to lower agricultural nest losses only. Chick survival was 11 % in both mosaics and controls. The amount of late-mown and other grassland suitable for chicks hardly differed between treatments during the fledging period, mainly due to rainfall delaying postponed mowing in all sites. Chick survival was however positively correlated with site variation in the amount of high grass (>18 cm). Breeding success was high enough to compensate for adult mortality (ca. 0.6) in only one mosaic site. Chick survival was lower than in previous Godwit studies, indicating that additional loss factors have increased. Predation (50-80 % of chicks, mostly by birds) is a candidate, but changes in the suitability of late-mown grassland (insect abundance and sward density in grass monocultures) may also play a role. Consequently a higher management investment is needed to achieve a self-sustaining population.
In this study, we report the results of a long-term investigation on changes in population size and fledging success of Northern Lapwing on Wangerooge, a German Wadden Sea island. This population has been increasing over a period of 34 years, in contrast to numerous populations in North-western Europe. Reproductive success, however, declines over time and also with population density. The two effects cannot be considered separately due to autocorrelation. However, it is noted that the population on Wangerooge is not sustained by local recruitment only. This outcome is even more alarming as coastal areas and islands are considered rare high-quality meadow bird habitats. According to the present results, Wangerooge cannot be considered a source habitat for Northern Lapwings in North-western Germany.
Human impacts on the landscape have increased the penalties for Black-tailed Godwits laying their eggs too late, especially in the very intensive agricultural landscapes of The Netherlands. Thus, godwits have experienced a dramatic change of their fitness landscape, because the advance in mowing date has made late clutches worthless, destroying either eggs or chicks. To determine the driving forces of the recent population decline, we study the individual variation in timing of breeding with respect to reproductive success in a population unaffected by mowing. Our results show that even in a low-intensity agricultural area it is very important for godwits to breed early in the season.
The retreat of BE as perfect auxiliary in the history of English is examined. Corpus data are presented showing that the initial advance of HAVE was most closely connected to a restriction against BE in past counterfactuals. Other factors which have been reported to favor the spread of HAVE are either dependent on the counterfactual effect, or significantly weaker in comparison. It is argued that the effect can be traced to the semantics of the BE perfect, which denoted resultativity rather than anteriority proper. Related data from other older Germanic and Romance languages are presented, and finally implications for existing theories of auxiliary selection stemming from the findings presented are discussed.
In this article we examine and "exapt" Wurzel's concept of superstable markers in an innovative manner. We develop an extended view of superstability through a critical discussion of Wurzel's original definition and the status of marker-superstability versus allomorphy in Natural Morphology: As we understand it, superstability is - above and beyond a step towards uniformity - mainly a symptom for the weakening of the category affected (cf. 1.,2. and 4.). This view is exemplified in four short case studies on superstability in different grammatical categories of four Germanic languages: genitive case in Mainland Scandinavian and English (3.1), plural formation in Dutch (3.2), second person singular ending -st in German (3.3), and ablaut generalisation in Luxembourgish (3.4).
In this text, we describe the development of a broad coverage grammar for Japanese that has been built for and used in different application contexts. The grammar is based on work done in the Verbmobil project (Siegel 2000) on machine translation of spoken dialogues in the domain of travel planning. The second application for JACY was the automatic email response task. Grammar development was described in Oepen et al. (2002a). Third, it was applied to the task of understanding material on mobile phones available on the internet, while embedded in the project DeepThought (Callmeier et al. 2004, Uszkoreit et al. 2004). Currently, it is being used for treebanking and ontology extraction from dictionary definition sentences by the Japanese company NTT (Bond et al. 2004).
In this paper we describe SOBA, a sub-component of the SmartWeb multi-modal dialog system. SOBA is a component for ontology-based information extraction from soccer web pages for automatic population of a knowledge base that can be used for domain-specific question answering. SOBA realizes a tight connection between the ontology, knowledge base and the information extraction component. The originality of SOBA is in the fact that it extracts information from heterogeneous sources such as tabular structures, text and image captions in a semantically integrated way. In particular, it stores extracted information in a knowledge base, and in turn uses the knowledge base to interpret and link newly extracted information with respect to already existing entities.
This demo abstract describes the SmartWeb Ontology-based Information Extraction System (SOBIE). A key feature of SOBIE is that all information is extracted and stored with respect to the SmartWeb ontology. In this way, other components of the systems, which use the same ontology, can access this information in a straightforward way. We will show how information extracted by SOBIE is visualized within its original context, thus enhancing the browsing experience of the end user.
In this paper I present five alternations of the verb system of Modern Greek, which are recurrently mapped on the syntactic frame NPi__NP. The actual claim is that only the participation in alternations and/or the allocation to an alternation variant can reliably determine the relation between a verb derivative and its base. In the second part, the conceptual structures and semantic/situational fields of a large number of “-ízo” derivatives appearing inside alternation classes are presented. The restricted character of the conceptual and situational preferences inside alternation classes suggests the dominant character of the alternations component.
Effects of BPA in snails
(2006)
It is an ethical requirement that new findings be presented in light of and in conjunction with a balanced evaluation of the current knowledge and published literature. We believe that Oehlmann et al. (2006) violated this general principle in several ways. For example, the authors inferred that prosobranch snails have a functional estrogen receptor and therefore a much higher sensitivity to estrogens and endocrine-disrupting compounds (EDCs) than other species previously reported in the literature. We found several other problems in their article...
In the last decade, the Penn treebank has become the standard data set for evaluating parsers. The fact that most parsers are solely evaluated on this specific data set leaves the question unanswered how much these results depend on the annotation scheme of the treebank. In this paper, we will investigate the influence which different decisions in the annotation schemes of treebanks have on parsing. The investigation uses the comparison of similar treebanks of German, NEGRA and TüBa-D/Z, which are subsequently modified to allow a comparison of the differences. The results show that deleted unary nodes and a flat phrase structure have a negative influence on parsing quality while a flat clause structure has a positive influence.
This paper develops a framework for TAG (Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches. Then, within this framework, an analysis of scope is proposed that accounts for the different scopal properties of quantifiers, adverbs, raising verbs and attitude verbs. Finally, including situation variables in the semantics, different situation binding possibilities are derived for different types of quantificational elements.
This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogs, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper "die tageszeitung" (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.
Multicomponent Tree Adjoining Grammar (MCTAG) is a formalism that has been shown to be useful for many natural language applications. The definition of MCTAG, however, is problematic since it refers to the process of the derivation itself: a simultaneity constraint must be respected concerning the way the members of the elementary tree sets are added. Looking only at the result of a derivation (i.e., the derived tree and the derivation tree), this simultaneity is no longer visible and therefore cannot be checked. That is, this way of characterizing MCTAG does not allow one to abstract away from the concrete order of derivation. Therefore, in this paper, we propose an alternative definition of MCTAG that characterizes the trees in the tree language of an MCTAG via the properties of the derivation trees the MCTAG licenses.
When a statistical parser is trained on one treebank, one usually tests it on another portion of the same treebank, partly due to the fact that a comparable annotation format is needed for testing. But the user of a parser may not be interested in parsing sentences from the same newspaper all over, or even wants syntactic annotations for a slightly different text type. Gildea (2001) for instance found that a parser trained on the WSJ portion of the Penn Treebank performs less well on the Brown corpus (the subset that is available in the PTB bracketing format) than a parser that has been trained only on the Brown corpus, although the latter one has only half as many sentences as the former. Additionally, a parser trained on both the WSJ and Brown corpora performs less well on the Brown corpus than on the WSJ one. This leads us to the following questions that we would like to address in this paper: - Is there a difference in usefulness of techniques that are used to improve parser performance between the same-corpus and the different-corpus case? - Are different types of parsers (rule-based and statistical) equally sensitive to corpus variation? To achieve this, we compared the quality of the parses of a hand-crafted constraint-based parser and a statistical PCFG-based parser that was trained on a treebank of German newspaper text.
This paper describes the creation and preparation of TUSNELDA, a collection of corpus data built for linguistic research. This collection contains a number of linguistically annotated corpora which differ in various aspects such as language, text sorts / data types, encoded annotation levels, and linguistic theories underlying the annotation. The paper focuses on this variation on the one hand and on the way these heterogeneous data are integrated into one resource on the other hand.
The current study was part of a series of environment related studies of the Jabal Akhdar sponsored by the Sultan Qaboos University, Al Khoud, Sultanate of Oman. The present study aimed to establish the range, habitat, status and population of breeding species in the area, review the historical perspective and list migrant and visitor species noted during the survey.
Trubetzkoy's recognition of a delimitative function of phonology, serving to signal boundaries between morphological units, is expressed in terms of alignment constraints in Optimality Theory, where the relevant constraints require specific morphological boundaries to coincide with phonological structure (Trubetzkoy 1936, 1939, McCarthy & Prince 1993). The approach pursued in the present article is to investigate the distribution of phonological boundary signals to gain insight into the criteria underlying morphological analysis. The evidence from English and Swedish suggests that necessary and sufficient conditions for word-internal morphological analysis concern the recognizability of head constituents, which include the rightmost members of compounds and head affixes. The claim is that the stability of word-internal boundary effects in historical perspective cannot in general be sufficiently explained in terms of memorization and imitation of phonological word form. Rather, these effects indicate a morphological parsing mechanism based on the recognition of word-internal head constituents. Head affixes can be shown to contrast systematically with modifying affixes with respect to syntactic function, semantic content, and prosodic properties. That is, head affixes, which cannot be omitted, often lack inherent meaning and have relatively unmarked boundaries, which can be obscured entirely under specific phonological conditions. By contrast, modifying affixes, which can be omitted, consistently have inherent meaning and have stronger boundaries, which resist prosodic fusion in all phonological contexts. While these correlations are hardly specific to English and Swedish it remains to be investigated to which extent they hold cross-linguistically. The observation that some of the constituents identified on the basis of prosodic evidence lack inherent meaning raises the issue of compositionality. 
I will argue that certain systematic aspects of word meaning cannot be captured with reference to the syntagmatic level, but require reference to the paradigmatic level instead. The assumption is then that there are two dimensions of morphological analysis: syntagmatic analysis, which centers on the criteria for decomposing words in terms of labelled constituents, and paradigmatic analysis, which centers on the criteria for establishing relations among (whole) words in the mental lexicon. While meaning is intrinsically connected with paradigmatic analysis (e.g. base relations, oppositeness) it is not essential to syntagmatic analysis.
This paper proposes an annotation scheme that encodes honorifics (respectful words). Honorifics are used extensively in Japanese, reflecting the social relationship (e.g. social ranks and age) of the referents. This referential information is vital for resolving zero pronouns and improving machine translation outputs. Annotating honorifics is a complex task that involves identifying a predicate with honorifics, assigning ranks to referents of the predicate, calibrating the ranks, and connecting referents with their predicates.
While the sortal constraints associated with Japanese numeral classifiers are well-studied, less attention has been paid to the details of their syntax. We describe an analysis implemented within a broad-coverage HPSG that handles an intricate set of numeral classifier construction types and compositionally relates each to an appropriate semantic representation, using Minimal Recursion Semantics.
The Deep Linguistic Processing with HPSG Initiative (DELPH-IN) provides the infrastructure needed to produce open-source semantic transfer-based machine translation systems. We have made available a prototype Japanese-English machine translation system built from existing resources, including parsers, generators, bidirectional grammars and a transfer engine.
Articulatory token-to-token variability depends not only on linguistic aspects like the phoneme inventory of a given language but also on speaker-specific morphological and motor constraints. As has been noted previously (Perkell (1997), Mooshammer et al. (2004)), speakers with coronally high "domeshaped" palates exhibit more articulatory variability than speakers with coronally low "flat" palates. One explanation for this is based on perception-oriented control by the speaker. The influence of articulatory variation on the cross-sectional area and consequently on the acoustics should be greater for flat palates than for domeshaped ones. This should force speakers with flat palates to place their tongue very precisely, whereas speakers with domeshaped palates might tolerate a greater variability. A second explanation could be a greater amount of lateral linguo-palatal contact for flat palates holding the tongue in position. In this study both hypotheses were tested.
LTAG semantics for questions
(2004)
This paper presents a compositional semantic analysis of interrogative clauses in LTAG (Lexicalized Tree Adjoining Grammar) that captures the scopal properties of wh- and non-wh quantificational elements. It is shown that the present approach derives the correct semantics for examples claimed to be problematic for LTAG semantic approaches based on the derivation tree. The paper further provides an LTAG semantics for embedded interrogatives.
Weak function word shift
(2004)
The fact that object shift only affects weak pronouns in mainland Scandinavian is seen as an instance of a more general observation that can be made in all Germanic languages: weak function words tend to avoid the edges of larger prosodic domains. This generalisation has been formulated within Optimality Theory in terms of alignment constraints on prosodic structure by Selkirk (1996) in explaining the distribution of prosodically strong and weak forms of English function words, especially modal verbs, prepositions and pronouns. But a purely phonological account fails to integrate the syntactic licensing conditions for object shift in an appropriate way. The standard semantico-syntactic accounts of object shift, on the other hand, fail to explain why it is only weak pronouns that undergo object shift. This paper develops an Optimality theoretic model of the syntax-phonology interface which is based on the interaction of syntactic and prosodic factors. The account can successfully be applied to further related phenomena in English and German.
German dialects vary in which of the possible orders of the verbs in a 3-verb cluster they allow. In a still ongoing empirical investigation that I am undertaking together with Tanja Schmid, University of Stuttgart (Schmid and Vogel (2004)) we already found that each of the six logically possible permutations of the 3-verb cluster in (1) can be found in German dialects.
This paper reports the results of a corpus investigation on case conflicts in German argument free relative constructions. We investigate how corpus frequencies reflect the relative markedness of free relative and correlative constructions, the relative markedness of different case conflict configurations, and the relative markedness of different conflict resolution strategies. Section 1 introduces the conception of markedness as used in Optimality Theory. Section 2 introduces the facts about German free relative clauses, and section 3 presents the results of the corpus study. By and large, markedness and frequency go hand in hand. However, configurations at the highest end of the markedness scale rarely show up in corpus data, and for the configuration at the lowest end we found an unexpected outcome: the more marked structure is preferred.
The purpose of this paper is to describe the TüBa-D/Z treebank of written German and to compare it to the independently developed TIGER treebank (Brants et al., 2002). Both treebanks, TIGER and TüBa-D/Z, use an annotation framework that is based on phrase structure grammar and that is enhanced by a level of predicate-argument structure. The comparison between the annotation schemes of the two treebanks focuses on the different treatments of free word order and discontinuous constituents in German as well as on differences in phrase-internal annotation.
Tree-local MCTAG with shared nodes : an analysis of word order variation in German and Korean
(2004)
Tree Adjoining Grammars (TAG) are known not to be powerful enough to deal with scrambling in free word order languages. The TAG-variants proposed so far in order to account for scrambling are not entirely satisfying. Therefore, an alternative extension of TAG is introduced based on the notion of node sharing. Considering data from German and Korean, it is shown that this TAG-extension can adequately analyse scrambling data, also in combination with extraposition and topicalization.
Transforming constituent-based annotation into dependency-based annotation has been shown to work for different treebanks and annotation schemes (e.g. Lin (1995) has transformed the Penn treebank, and Kübler and Telljohann (2002) the Tübinger Baumbank des Deutschen (TüBa-D/Z)). These ventures are usually triggered by the conflict between theory-neutral annotation, that targets most needs of a wider audience, and theory-specific annotation, that provides more fine-grained information for a smaller audience. As a compromise, it has been pointed out that treebanks can be designed to support more than one theory from the start (Nivre, 2003). We argue that information can also be added to an existing annotation scheme so that it supports additional theory-specific annotations. We also argue that such a transformation is useful for improving and extending the original annotation scheme with respect to both ambiguous annotation and annotation errors. We show this by analysing problems that arise when generating dependency information from the constituent-based TüBa-D/Z.
This paper reports on the SYN-RA (SYNtax-based Reference Annotation) project, an on-going project of annotating German newspaper texts with referential relations. The project has developed an inventory of anaphoric and coreference relations for German in the context of a unified, XML-based annotation scheme for combining morphological, syntactic, semantic, and anaphoric information. The paper discusses how this unified annotation scheme relates to other formats currently discussed in the literature, in particular the annotation graph model of Bird and Liberman (2001) and the pie-in-the-sky scheme for semantic annotation.
The purpose of this paper is to describe recent developments in the morphological, syntactic, and semantic annotation of the TüBa-D/Z treebank of German. The TüBa-D/Z annotation scheme is derived from the Verbmobil treebank of spoken German [4, 10], but has been extended along various dimensions to accommodate the characteristics of written texts. TüBa-D/Z uses as its data source the "die tageszeitung" (taz) newspaper corpus. The Verbmobil treebank annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The primary ordering principle of a clause is the inventory of topological fields, which characterize the word order regularities among different clause types of German, and which are widely accepted among descriptive linguists of German [3, 6]. The TüBa-D/Z annotation relies on a context-free backbone (i.e. proper trees without crossing branches) of phrase structure combined with edge labels that specify the grammatical function of the phrase in question. The syntactic annotation scheme of the TüBa-D/Z is described in more detail in [12, 11]. TüBa-D/Z currently comprises approximately 15 000 sentences, with approximately 7 000 sentences being in the correction phase. The latter will be released along with an updated version of the existing treebank before the end of this year. The treebank is available in an XML format, in the NEGRA export format [1] and in the Penn treebank bracketing format. The XML format contains all types of information as described above, the NEGRA export format contains all sentence-internal information, while the Penn treebank format includes only those layers of information that can be expressed as pure tree structures. Over the course of the last year, more fine-grained linguistic annotations have been added along the following dimensions: 1. the basic Stuttgart-Tübingen tagset (STTS) [9] labels have been enriched by relevant features of inflectional morphology, 2. named entity information has been encoded as part of the syntactic annotation, and 3. a set of anaphoric and coreference relations has been added to link referentially dependent noun phrases. In the following sections, we will describe each of these innovations in turn and will demonstrate how the additional annotations can be incorporated into one comprehensive annotation scheme.
This paper is concerned with the tagging of spatial expressions in German newspaper articles: assigning a meaning to the expression, classifying the usages of the spatial expression, and linking the derived referent to an event description. In our system, we implemented the activation of concepts in a very simple fashion: a concept is activated once (with a cost depending on the item that activated it) and is left activated thereafter. As an example, a city also activates the nodes for the region and the country it is part of, so that cities from one country are chosen over cities from different countries. Several disambiguation strategies were tested on a corpus of 12 German newspaper articles. Disambiguation was carried out via a beam search to find an approximately cost-optimal solution for the conflict set of potential grounding candidates for the tagged spatial expression. Tests showed that the disambiguation strategies improved accuracy significantly.
This paper sets up a framework for LTAG (Lexicalized Tree Adjoining Grammar) semantics that brings together ideas from different recent approaches addressing some shortcomings of TAG semantics based on the derivation tree. Within this framework, several sample analyses are proposed, and it is shown that the framework allows the analysis of data that have been claimed to be problematic for derivation-tree-based LTAG semantics approaches.
Dialectal variation in German 3-verb clusters : a surface-oriented optimality theoretic account
(2004)
We present data from an empirical investigation on the dialectal variation in the syntax of German 3-verb clusters, consisting of a temporal auxiliary, a modal verb, and a predicative verb. The ordering possibilities vary greatly among the dialects. Some of the orders that we found occur only under particular stress assignments. We assume that these orders fulfil an information-structural purpose and that the reordering processes are changes only in the linear order of the elements, which is represented exclusively at the surface syntactic level, PF (Phonetic Form). Our optimality-theoretic account offers a multifactorial perspective on the phenomenon.
CONTENTS: WHITHER THE SOUTH AFRICAN PUBLISHING INDUSTRY ? 4;
APNET MESSAGE TO AFRICAN PUBLISHERS ON WORLD BOOK DAY 11 ;
FUNDING OPPORTUNITIES FOR OPERATORS IN CULTURE-RELATED INDUSTRIES 13;
4TH SALON INTERNATIONAL DU LIVRE D’ABIDJAN (SILA) 2004 16;
THE NIGERIA INTERNATIONAL BOOK FAIR (NIBF) 2004 20;
THE NOMA AWARD 2003 PRESENTATION 22;
A NEW CONSULTANCY FIRM IS FORMED 27;
EDILIS HOLD DEDICATION CEREMONY 30;
LETTERS TO THE EDITOR 34;
NEWS FROM PARTNER ORGANISATIONS 41;
NOTICE 44;
PROMOTIONS 50
When the concept of the auteur was coined in the 1950s and 1960s, it was an initiative to clarify the obscure matters of authorship in cinema. Because a film must necessarily be a collective work, understood as the result of a large number of creative contributions, it was often unclear who the decisive power behind a certain film was, who contributed the "distinctive quality". The control will usually belong to the director, the producer or the star (or all three in combination), but what singles out a given film could also come from the cinematographer, the scriptwriter, from the author of an adapted literary work, or from traditions in the studio or in the genre. Nothing can be taken for granted about a film's authorship; it can only be decided through a thorough analysis of each film's production process, an analysis that, in most cases, will be impossible to make. ...
The aim of this paper is the exploration of an optimality theoretic architecture for syntax that is guided by the concept of "correspondence": syntax is understood as the mechanism of "translating" underlying representations into a surface form. In minimalism, this surface form is called "Phonological Form" (PF). Both semantic and abstract syntactic information are reflected by the surface form. The empirical domain in which this architecture is tested is that of minimal link effects, especially in the case of "wh"-movement. The OT constraints require the surface form to reflect the underlying semantic and syntactic representations as fully as possible. The means by which underlying relations and properties are encoded are precedence, adjacency, surface morphology and prosodic structure. Information that is not encoded in one of these ways remains unexpressed, and gets lost unless it is recoverable via the context. Different kinds of information are often expressed by the same means. The resulting conflicts are resolved by the relative ranking of the relevant correspondence constraints.
The argument that I tried to elaborate on in this paper is that the conceptual problem behind the traditional competence/performance distinction does not go away, even if we abandon its original Chomskyan formulation. It returns as the question about the relation between the model of the grammar and the results of empirical investigations – the question of empirical verification. The theoretical concept of markedness is argued to be an ideal correlate of gradience. Optimality Theory, being based on markedness, is a promising framework for the task of bridging the gap between model and empirical world. However, this task not only requires a model of grammar, but also a theory of the methods that are chosen in empirical investigations and how their results are interpreted, and a theory of how to derive predictions for these particular empirical investigations from the model. Stochastic Optimality Theory is one possible formulation of a proposal that derives empirical predictions from an OT model. However, I hope to have shown that it is not enough to take frequency distributions and relative acceptabilities at face value, and simply construe some Stochastic OT model that fits the facts. These facts first of all need to be interpreted, and those factors that the grammar has to account for must be sorted out from those about which grammar should have nothing to say. This task, to my mind, is more complicated than the picture that a simplistic application of (not only) Stochastic OT might draw.
Japanese is often taken to be strictly head-final in its syntax. In our work on a broad-coverage, precision implemented HPSG for Japanese, we have found that while this is generally true, there are nonetheless a few minor exceptions to the broad trend. In this paper, we describe the grammar engineering project, present the exceptions we have found, and conclude that this kind of phenomenon motivates on the one hand the HPSG type hierarchical approach which allows for the statement of both broad generalizations and exceptions to those generalizations and on the other hand the usefulness of grammar engineering as a means of testing linguistic hypotheses.
While the sortal constraints associated with Japanese numeral classifiers are well-studied, less attention has been paid to the details of their syntax. We describe an analysis implemented within a broad-coverage HPSG that handles an intricate set of numeral classifier construction types and compositionally relates each to an appropriate semantic representation, using Minimal Recursion Semantics.
Hybrid robust deep and shallow semantic processing for creativity support in document production
(2004)
The research performed in the DeepThought project (http://www.project-deepthought.net) aims at demonstrating the potential of deep linguistic processing if added to existing shallow methods that ensure robustness. Classical information retrieval is extended by high precision concept indexing and relation detection. We use this approach to demonstrate the feasibility of three ambitious applications, one of which is a tool for creativity support in document production and collective brainstorming. This application is described in detail in this paper. Common to all three applications, and the basis for their development, is a platform for integrated linguistic processing. This platform is based on a generic software architecture that combines multiple NLP components and on Robust Minimal Recursion Semantics (RMRS) as a uniform representation language.
The research performed in the DeepThought project aims at demonstrating the potential of deep linguistic processing if combined with shallow methods for robustness. Classical information retrieval is extended by high precision concept indexing and relation detection. On the basis of this approach, the feasibility of three ambitious applications will be demonstrated, namely: precise information extraction for business intelligence; email response management for customer relationship management; creativity support for document production and collective brainstorming. Common to these applications, and the basis for their development, is the XML-based, RMRS-enabled core architecture framework that will be described in detail in this paper. The framework is not limited to the applications envisaged in the DeepThought project, but can also be employed, e.g., to generate and make use of XML standoff annotation of documents and linguistic corpora, and in general for a wide range of NLP-based applications and research purposes.
The problematic economic situation in most parts of Russia today is nevertheless the ideal climate for the flourishing of the arts. Especially in St. Petersburg there grows a fascinating new experimental music scene, from Moscow we receive new impulses in literature such as the poet Alina Vituchnovskaja... Russian cinema always had a good reputation, and the new generation of Russian filmmakers clearly tries to keep up with it.
The definition of similarity between sentences is formulated on the levels of words, POS tags, and chunks (Abney 91; Abney 96). The evaluation of this approach shows that while precision and recall based on the PARSEVAL measures (Black et al. 91) do not reach state-of-the-art parsers yet (F1=87.19 on syntactic constituents, F1=77.78 including function-argument structure), the parser shows a very reliable performance where function-argument structure is concerned (F1=96.52). The lower F-scores are very often due to unattached constituents.
"[...] In 1639, Martin Opitz rescued for us the only complete surviving text of the Annolied (circa 1083), and now Graeme Dunphy has made available a reprint of the Opitz edition and with it Opitz’s prologue and notes, a new English translation, and the translator’s informative notes on the translation and on Opitz’s commentary. In his prologue Opitz expresses the purpose of the edition, which is to demonstrate that the German language was inherited by his contemporaries in an unbroken line from earliest times. This is a strikingly early formulation of the romantic thesis the Grimm brothers developed later. Thus by including Opitz’s prologue and notes on his sources and philological explanations, Dunphy gives us the essential tools to re-invigorate research in three areas: Opitz, who is too frequently thought of as a narrowly focused poeticist, the serious study of philology and history in the sixteenth century, and most importantly, the Annolied itself. [...]" Quelle: Maria Dobozy : http://www.iaslonline.de/index.php?vorgang_id=751
In this paper we propose a compositional semantics for lexicalized tree-adjoining grammar (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate argument structure and a second component contributing to scope semantics. Based on this idea a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure (and indirectly the locality of derivations) allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints.
This paper addresses the problem of constraints for relative quantifier scope, in particular in inverse linking readings where certain scope orders are excluded. We show how to account for such restrictions in the Tree Adjoining Grammar (TAG) framework by adopting a notion of flexible composition. In the semantics we use for TAG we introduce quantifier sets that group quantifiers that are "glued" together in the sense that no other quantifier can scopally intervene between them. The flexible composition approach allows us to obtain the desired quantifier sets and thereby the desired constraints for quantifier scope.
This paper argues for a particular architecture of OT syntax. This architecture has three core features: i) it is bidirectional, the usual production-oriented optimisation (called ‘first optimisation’ here) is accompanied by a second step that checks the recoverability of an underlying form; ii) this underlying form already contains a full-fledged syntactic specification; iii) especially the procedure checking for recoverability makes crucial use of semantic and pragmatic factors. The first section motivates the basic architecture. The second section shows, with two examples, how contextual factors are integrated. The third section examines its implications for learning theory, and the fourth section concludes with a broader discussion of the advantages and disadvantages of the proposed model.
Most systematic discussion of dyad morphemes has focussed on Australian languages, owing to a combination of their relative prevalence there, and the development of a descriptive tradition that investigates them in some depth. In the course of researching this paper, however, I became aware of functionally and semantically similar morphemes in many other parts of the world, almost invariably described in isolation from any typological reference point. I have incorporated such data as far as I am aware of it, in the hope that a systematic study will encourage other investigators to identify, and investigate in detail, similar constructions in a range of languages. The current state of our research, however, as well as some interesting geographical skewings that I discuss below, such that outside Australia dyad constructions almost exclusively employ reciprocal morphology, means that most of this paper will focus on Australian languages.
CONTENTS
NEPAD AND AFRICAN PUBLISHING 2
HISTORY AND CULTURES IN AFRICA : THE MOVEMENT OF BOOKS 4
CURRENT OPPORTUNITIES AND CHALLENGES FACING AFRICAN PUBLISHERS 8
SAFEGUARDS AUTHORS’ WORKS 10
THE INTERNATIONAL CONFERENCE ON PUBLISHING IN THE CARIBBEAN 11
2002 NOMA AWARD WINNER 14
A REPORT OF THE ZIMBABWE INTERNATIONAL BOOK FAIR (ZIBF) 16
THE UNIVERSITY TRAINING COURSE 18
APNET AT THE 2003 NAIROBI INTERNATIONAL BOOK FAIR 21
THE JOMO KENYATTA PRIZE 24
BUSINESS OPPORTUNITIES 25
REPORT OF THE 4TH FOIRE INTERNATIONALE DU LIVRE DE OUAGADOUGOU 30
APNET’S SECOND STRATEGIC PLAN 32
FIFTH PAN AFRICAN BOOKSELLERS ASSOCIATION CONVENTION 35
NOTICES 37
CHALLENGES AND OPPORTUNITIES OF INTRA-AFRICAN TRADE IN EAST AFRICA 38
PROMOTIONS 42
Marcus Stiglegger revives a lost Gothic treasure in this brief discussion of Robert Sigl's Laurin—a rare case of German genre film-making and the heir to FW Murnau's legacy. Phantastic genre cinema is very rare in contemporary Germany—especially in the 1980s, the time when Italian horror reached another peak with Dario Argento's Opera (1985). The cliché of the German "easy comedy" ruled mainstream film production at the time, and so it appeared a kind of miracle when 27-year-old writer/director Robert Sigl was awarded the Bavarian Film Prize in 1988 for his debut feature: the Gothic horror fairytale Laurin.
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats.
Quantitative evaluation of parsers has traditionally centered around the PARSEVAL measures of crossing brackets, (labeled) precision, and (labeled) recall. However, it is well known that these measures do not give an accurate picture of the quality of the parser's output. Furthermore, we will show that they are especially unsuited for partial parsers. In recent years, research has concentrated on dependency-based evaluation measures. We will show in this paper that such a dependency-based evaluation scheme is particularly suitable for partial parsers. TüBa-D, the treebank used here for evaluation, contains all the necessary dependency information so that the conversion of trees into a dependency structure does not have to rely on heuristics. Therefore, the dependency representations are not only reliable, they are also linguistically motivated and can be used for linguistic purposes.
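For readers unfamiliar with the bracket-based measures mentioned in this abstract, the following is a minimal sketch of labeled precision, recall, and F1 over constituents. The representation of a constituent as a `(label, start, end)` triple and the function name `labeled_prf` are assumptions for illustration; the sketch does not reproduce the paper's evaluation setup or its dependency-based alternative.

```python
from collections import Counter
from typing import List, Tuple

# Assumed representation: a constituent is (label, start index, end index).
Constituent = Tuple[str, int, int]


def labeled_prf(gold: List[Constituent],
                predicted: List[Constituent]) -> Tuple[float, float, float]:
    """Labeled precision, recall and F1 over constituent multisets."""
    # Multiset intersection counts each matching constituent at most once.
    matched = sum((Counter(gold) & Counter(predicted)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: the parser output omits one noun phrase of the gold tree.
gold_tree = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
parser_output = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5)]
precision, recall, f1 = labeled_prf(gold_tree, parser_output)
```

In the toy example every predicted bracket is correct, so precision is perfect while recall drops. The score alone does not say which grammatical relation was lost, which is the kind of information a dependency-based measure is meant to expose.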
This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in an increased accuracy of annotation.