OPUS 4 | Linguistik

MAIN: Multilingual Assessment Instrument for Narratives (2016)

Gagarina, Natalʹja Vladimirovna ; Klop, Daleen ; Kunnari, Sari ; Tantele, Koula ; Välimaa, Taina ; Balčiūnienė, Ingrida ; Bohnacker, Ute ; Walters, Joel

The Multilingual Assessment Instrument for Narratives (MAIN) was designed to assess narrative skills in children who acquire one or more languages from birth or from an early age. MAIN is suitable for children from 3 to 10 years and evaluates both comprehension and production of narratives. Its design allows for the assessment of several languages in the same child, as well as for different elicitation modes: Model Story, Retelling, and Telling. MAIN contains four parallel stories, each with a carefully designed six-picture sequence. The stories are controlled for cognitive and linguistic complexity, parallelism in macrostructure and microstructure, as well as for cultural appropriateness and robustness. The instrument has been developed on the basis of extensive piloting with more than 550 monolingual and bilingual children aged 3 to 10, for 15 different languages and language combinations. Even though MAIN has not been norm-referenced yet, its standardized procedures can be used for evaluation, intervention and research purposes. MAIN is currently available in the following languages: English, Afrikaans, Albanian, Basque, Bulgarian, Croatian, Cypriot Greek, Danish, Dutch, Estonian, Finnish, French, German, Greek, Hebrew, Icelandic, Italian, Lithuanian, Norwegian, Polish, Russian, Spanish, Standard Arabic, Swedish, Turkish, Vietnamese, and Welsh.

IGGSA shared task on source and target extraction from political speeches (2016)

The Shared Task on Source and Target Extraction from Political Speeches (STEPS) first ran in 2014 and is organized by the Interest Group on German Sentiment Analysis (IGGSA). This volume presents the proceedings of the workshop of the second iteration of the shared task. The workshop was held at KONVENS 2016 at Ruhr-University Bochum on September 22, 2016. As in the first edition of the shared task, the main focus of STEPS was on fine-grained sentiment analysis; it offered a full task as well as two subtasks for the extraction of Subjective Expressions and/or their respective Sources and Targets. In order to make the task more accessible, the annotation schema was revised for this year’s edition and an adjudicated gold standard was used for the evaluation. In contrast to the pilot task, this iteration provided training data for the participants, opening the Shared Task to systems based on machine learning approaches. The gold standard as well as the evaluation tool have been made publicly available to the research community via the STEPS website. We would like to thank the GSCL for their financial support in annotating the 2014 test data, which were available as training data in this iteration. Special thanks also go to Stephanie Köser for her support in preparing and carrying out the annotation of this year’s test data. Finally, we would like to thank all the participants for their contributions and discussions at the workshop.

Handbuch zum Referenzkorpus Mittelhochdeutsch (2016)

Klein, Thomas ; Dipper, Stefanie

NLP4CMC III : 3rd workshop on natural language processing for computer-mediated communication (2016)

The present paper reports the first results of the compilation and annotation of a blog corpus for German. The main aim of the project is the representation of the blog discourse structure and relations between its elements (blog posts, comments) and participants (bloggers, commentators). The data included in the corpus were manually collected from the scientific blog portal SciLogs. The feature catalogue for the corpus annotation includes three types of information, which are directly or indirectly provided in the blog or can be construed by means of statistical analysis or computational tools. At this point, only directly available information (e.g., title of the blog post, name of the blogger, etc.) has been annotated. We believe our blog corpus can be of interest for the general study of blog structure or related research questions as well as for the development of NLP methods and techniques (e.g., for authorship detection).

Compounds in early Greek first language acquisition – including an onomasiological approach to lexical typology of Greek and German (2016)

Stephany, Ursula ; Thomadaki, Evangelia

The early acquisition of Greek compounds by two monolingual Greek girls aged between 1;8 and 3;0 years is studied in a usage-based theoretical framework. Special importance is attached to the morphological structure of Greek compound types occurring in child speech and child-directed speech. Greek nominal compound formation does not consist in the mere juxtaposition of words or roots, but involves stems as well as a compound marker. Major questions addressed are the transparency of compounds and productive nominal compound formation. Evidence for productivity of nominal compound formation has been found with only one of the two girls. In contrast to other languages, neoclassical nominal compounds by far exceed endocentric subordinative ones tokenwise in Greek child speech and child-directed speech, providing evidence of entrenchment rather than productivity. In a cross-linguistic comparison it is shown that, in spite of the fact that both Standard Modern Greek and German are rich in nominal compounds, their number is much more limited in Greek than in German child speech. An explanation for this apparent paradox is provided by an onomasiological approach to lexical typology based on a sample list of nominal compounds occurring in German child language and their Greek translational equivalents. It has been found that while the use of nominal compounds is common in colloquial German, including child-centered situations, it is more typical of Greek formal than colloquial registers.

Byproducts and side effects : Nebenprodukte und Nebeneffekte (2015)

The papers collected in this volume have very diverse topics – such as prosodic peculiarities (Meinunger and Hamlaoui & Roussarie), morphological items (McFadden and Steriopolo), or phenomena concerning syntax and its interfaces, such as syntax-morphology (Kamali), syntax-parsing (Winkler), or syntax-pragmatics (Bittner & Dery). The languages considered range from quite prominent German and French via Turkish to very exotic Nuuchahnulth or no longer spoken Old and Middle English. However, all contributions center around structural phenomena and provide analyses in terms of grammatical theory.

Guidelines für die Normalisierung historischer deutscher Texte (2015)

Krasselt, Julia ; Bollmann, Marcel ; Dipper, Stefanie ; Petran, Florian

Das hethitische Phonem /xw/ (2014)

Suter, Edgar

In the Hittite phonological system there was a labialized velar fricative /xw/ beside the plain velar fricative /x/ parallel to the opposition between the velar stops /kw/ and /k/. The frequent syllable /xwa/ was spelled either hu-(u) or hu-wa. Evidence from the frequency of words with initial hu in the lexicon, from spelling variations and from ablaut alternations is presented to demonstrate the existence of /xw/. It is suggested that Hittite /xw/ regularly corresponds to the reflexes of *w in the non-Anatolian Indo-European languages.

Proceedings of the Workshop BantuSynPhonIS : Preverbal Domain(s) (2014)

The papers in this volume take up some aspects of the preverbal domain(s) in Bantu languages. They were originally presented at the Workshop BantuSynPhonIS: Preverbal Domain(s), held at the Center for General Linguistics (ZAS) in Berlin on 14-15 November 2014. This workshop was co-organized by ZAS (Fatima Hamlaoui & Tonjes Veenstra) and the Humboldt University (Tom Güldemann, Yukiko Morimoto and Ines Fiedler).

Zur Bestimmung der Zählbarkeit deutscher Substantive (2013)

Stadtfeld, Tobias

The overarching and ultimate aim of this thesis is to enable a systematic, efficient, and reproducible determination of the lexicalized countability of German nouns. To my knowledge, this has not yet been undertaken on a larger scale for nouns of either German or English. Although some lexicons already contain entries for nouns that occur only in the singular or only in the plural, I am not aware of any resource that offers a classification of the lexical countability of English or German nouns that is of high quality and broad coverage. An indication that a noun is used exclusively in one number is by no means a reliable indicator of that noun's countability; it is merely one of many features that together make up what is commonly subsumed under the term countability. The literature on countability itself is also limited almost throughout to a handful of nouns which, as has already been done in this introduction, are discussed over and over again. The interpretation of the countability of dogs, cats, and rabbits, as well as of wine, rice, furniture, and jewelry, will also be relevant time and again in the examples of this thesis. It is obvious, however, that German and English contain far more words than those just mentioned, so looking beyond these standard examples is worthwhile. I therefore aim to develop tests and guidelines for determining the lexical countability of nouns, to apply them to more than 1,000 German lemmas, and thereby to establish, for the first time, a gold standard that, alongside qualitative considerations, also permits a quantitative study of the countability of words in a major German-language daily newspaper.
