Treebank profiling of spoken and written German
- This paper profiles significant differences in syntactic distribution and word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogs, and the TüBa-D/Z, a treebank of articles published in the German daily newspaper 'die tageszeitung' (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.
Author: | Erhard Hinrichs, Sandra Kübler |
---|---|
URN: | urn:nbn:de:hebis:30-1111304 |
URL: | http://cl.indiana.edu/~skuebler/papers/GermanEstimation.pdf |
Document Type: | Preprint |
Language: | English |
Year of Completion: | 2005 |
Year of first Publication: | 2005 |
Publishing Institution: | Universitätsbibliothek Johann Christian Senckenberg |
Release Date: | 2008/11/03 |
Page Number: | 12 |
Note: | Published in: Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT). Barcelona, Spain, December 2005, pp. 65-76 |
Source: | http://jones.ling.indiana.edu/~skuebler/papers/GermanEstimation.pdf ; Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories - Barcelona, Spain. |
HeBIS-PPN: | 206937660 |
Institutes: | No department specified / External |
Dewey Decimal Classification: | 4 Language / 40 Language / 400 Language |
Collections: | Linguistics |
Linguistics Classification: | Computational linguistics |
Licence (German): | German copyright law |