What linguists always wanted to know about German and did not know how to estimate

This paper profiles significant differences in syntactic distribution and word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z, a treebank of articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres.

Metadata
Author: Erhard W. Hinrichs, Sandra Kübler
URN: urn:nbn:de:hebis:30-1111319
Document Type: Article
Language: English
Date of Publication (online): 03.11.2008
Year of first Publication: 2006
Publishing Institution: Univ.-Bibliothek Frankfurt am Main
Source: http://jones.ling.indiana.edu/~skuebler/papers/karlsson.pdf ; Special Supplement to SKY Journal of Linguistics 19.
HeBIS PPN: 206938268
Dewey Decimal Classification: 400 Language
Collections: Linguistics
Linguistics Classification: Computerlinguistik / Computational linguistics
Licence (German): Publication agreement for publications without Print on Demand
