Unity in diversity : integrating differing linguistic data in TUSNELDA

  • This paper describes the creation and preparation of TUSNELDA, a collection of corpus data built for linguistic research. The collection contains a number of linguistically annotated corpora that differ in language, text type / data type, encoded annotation levels, and the linguistic theories underlying the annotation. The paper focuses on this variation and on how these heterogeneous data are integrated into one resource.

Author:Andreas Wagner
Parent Title (English):Heterogeneity in focus: creating and using linguistic databases / Dipper, Stefanie, M. Götze and M. Stede (eds.) ; Working Papers of the SFB 632, Interdisciplinary studies on information structure ; Vol. 2
Place of publication:Potsdam
Editor:Stefanie Dipper, Michael Götze, Manfred Stede
Document Type:Part of a Book
Year of Completion:2005
Year of first Publication:2005
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Release Date:2008/11/10
GND Keyword:Jean / Siebenkäs; Erzählperspektive; Ehe <Motiv>; Johann Wolfgang von Goethe; Datenbanksystem; Sprachdaten; Mehrsprachigkeit; Heterogenität; Kongress; Potsdam <2004>
Page Number:20
First Page:1
Last Page:20
Source:http://www.sfb632.uni-potsdam.de/publications/isis02_1wagner.pdf ; (in:) S. Dipper / M. Götze / M. Stede : Heterogeneity in focus : creating and using linguistic databases, Interdisciplinary Studies on Information Structure (ISIS), 2, 2005, pp. 1-20
Institutes:Department not specified / External
Dewey Decimal Classification:4 Language / 40 Language / 400 Language
Licence (German):Deutsches Urheberrecht (German copyright law)