Linguistik
The author presents MASSY, the MODULAR AUDIOVISUAL SPEECH SYNTHESIZER. The system combines two approaches to visual speech synthesis. Two control models are implemented: a data-based di-viseme model and a rule-based dominance model, both of which produce control commands in a parameterized articulation space. Analogously, two visualization methods are implemented: an image-based (video-realistic) face model and a 3D synthetic head. Both face models can be driven by both the data-based and the rule-based articulation model.
The high-level visual speech synthesis generates a sequence of control commands for the visible articulation. For every virtual articulator (articulation parameter) the 3D synthetic face model defines a set of displacement vectors for the vertices of the 3D objects of the head. The vertices of the 3D synthetic head are then moved by linear combinations of these displacement vectors to visualize articulation movements. For the image-based video synthesis a single reference image is deformed to fit the facial properties derived from the control commands. Facial feature points and facial displacements have to be defined for the reference image. The algorithm can also use an image database with appropriately annotated facial properties. An example database was built automatically from video recordings. Both the 3D synthetic face and the image-based face generate visual speech that can increase the intelligibility of audible speech.
Other well-known image-based audiovisual speech synthesis systems like MIKETALK and VIDEO REWRITE concatenate pre-recorded single images or video sequences, respectively. Parametric talking heads like BALDI control a parametric face with a parametric articulation model. The presented system demonstrates the compatibility of parametric and data-based visual speech synthesis approaches.
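The displacement-vector scheme described for the 3D synthetic head can be illustrated with a minimal sketch. It is not MASSY's implementation: the function name `deform_vertices`, the parameter names `jaw_opening` and `lip_rounding`, and all numeric values are hypothetical and chosen only to show a linear combination of per-parameter displacement vectors applied to rest-pose vertices.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def deform_vertices(
    rest: List[Vec3],
    displacements: Dict[str, List[Vec3]],
    params: Dict[str, float],
) -> List[Vec3]:
    """Move each rest-pose vertex by a weighted sum of displacement vectors.

    For every articulation parameter there is one displacement vector per
    vertex; the parameter value is the weight of that vector.
    """
    moved = []
    for i, (x, y, z) in enumerate(rest):
        for name, weight in params.items():
            dx, dy, dz = displacements[name][i]
            x += weight * dx
            y += weight * dy
            z += weight * dz
        moved.append((x, y, z))
    return moved


# Hypothetical toy data: two vertices and two articulation parameters.
rest_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displacement_vectors = {
    "jaw_opening": [(0.0, -1.0, 0.0), (0.0, -0.5, 0.0)],
    "lip_rounding": [(0.0, 0.0, 0.2), (-0.25, 0.0, 0.25)],
}
# One control command (one animation frame) in the articulation space.
frame = {"jaw_opening": 0.5, "lip_rounding": 1.0}
deformed = deform_vertices(rest_pose, displacement_vectors, frame)
```

Under these assumptions, a sequence of control commands would simply be a sequence of such `frame` dictionaries, each producing one deformed mesh.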
The goal of our current project is to build a system that can learn to imitate a version of a spoken utterance using an articulatory speech synthesiser. The approach is informed and inspired by knowledge of early infant speech development. Thus we expect our system to reproduce and exploit the utility of infant behaviours such as listening, vocal play, babbling and word imitation. We expect our system to develop a relationship between the sound-making capabilities of its vocal tract and the phonetic/phonological structure of imitated utterances. At the heart of our approach is the learning of an inverse model that relates acoustic and motor representations of speech. The acoustic-to-auditory mapping uses an auditory filter bank and a self-organizing phase of learning. The inverse model from auditory to vocal tract control parameters is estimated using a babbling phase, in which the vocal tract is essentially driven in a random manner, much like the babbling phase of speech acquisition in infants. The complete system can be used to imitate simple utterances through a direct mapping from sound to control parameters. Our initial results show that this procedure works well for sounds generated by its own voice. Further work is needed to build a phonological control level and achieve better performance with real speech.
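The idea of estimating an inverse model from random babbling can be sketched in a few lines. This is an illustration under strong assumptions, not the authors' system: `toy_vocal_tract` is a hypothetical stand-in for the articulatory synthesiser and auditory front end, and a plain nearest-neighbour lookup is swapped in for the learned inverse model described in the abstract.

```python
import random
from typing import List, Tuple

Motor = Tuple[float, float]
Sound = Tuple[float, float]


def toy_vocal_tract(motor: Motor) -> Sound:
    """Hypothetical forward model: motor parameters -> auditory features."""
    a, b = motor
    return (a + 0.5 * b, a * b)


def babble(n: int, seed: int = 0) -> List[Tuple[Motor, Sound]]:
    """Drive the vocal tract randomly and remember (motor, sound) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        motor = (rng.random(), rng.random())
        pairs.append((motor, toy_vocal_tract(motor)))
    return pairs


def imitate(target: Sound, memory: List[Tuple[Motor, Sound]]) -> Motor:
    """Nearest-neighbour inverse model (an assumption, not the paper's method):
    return the babbled motor command whose sound was closest to the target."""
    def squared_distance(sound: Sound) -> float:
        return sum((s - t) ** 2 for s, t in zip(sound, target))

    motor, _ = min(memory, key=lambda pair: squared_distance(pair[1]))
    return motor


memory = babble(2000)
# A target produced by the system's "own voice".
target_sound = toy_vocal_tract((0.3, 0.7))
motor_guess = imitate(target_sound, memory)
reproduced = toy_vocal_tract(motor_guess)
error = sum((s - t) ** 2 for s, t in zip(reproduced, target_sound)) ** 0.5
```

In this toy setting the imitation error shrinks as the number of babbled samples grows, which mirrors why such a procedure is easiest for sounds the system can itself produce.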
Whether loanword adaptations are phonologically or phonetically driven is one of the most highly debated issues in loanword phonology. This paper addresses this issue and aims to demonstrate that only an account accepting both phonological and phonetic approximation can adequately explain the data found in Japanese. This point will be exemplified with the adaptation of German and French mid front rounded vowels in Japanese. It will be argued that the adaptation of German /oe/ and /ø/ as Japanese /e/ is phonologically grounded, whereas the adaptation of French /oe/ and /ø/ as Japanese /u/ is phonetically grounded. This asymmetry in the adaptation of German and French mid front rounded vowels, together with further examples of loans in Japanese, leads to the conclusion that both strategies of loanword adaptation occur in languages. It will be shown that not only perception but also orthography, conventions and knowledge of the source language play a role in the adaptation process.
The purpose of this dissertation is to defend the idea that the empirical responsibilities of binding theory can be handled in a more psychologically and historically realistic way when assigned to the field of pragmatics. In particular, I wish to show that Optimality Theory (OT) (Prince & Smolensky, 1993), the stochastic OT and Gradual Learning Algorithm of Boersma (1998), the Recoverability of OT of Wilson (2001) and Buchwald et al. (2002), and the bidirectional OT of Blutner (2000b) and Bidirectional Gradual Learning Algorithm of Jäger (2003a) can all participate in a formal framework in which one can formally spell out and justify the idea that the distributional behavior of bound pronouns and reflexives is a pragmatic phenomenon.
This contribution starts from the assumption that phraseologisms can be regarded, on the one hand, as a prototypical embodiment of the "cultural memory" of a discourse community and, on the other hand, as a universal cultural phenomenon inherent in every language culture. In this context, it aims to work out analytically, and to question from multiple perspectives, the tension arising from the interweaving of "culture" and "language", with its manifestations and consequences, using phraseological material from German and Hungarian. Cultural history and phraseology, both highlighted in the title of the conference, constitute an extremely multifaceted topic that raises a range of cultural-philosophical, cultural-semiotic, intercultural, cognitive-linguistic and other questions and requires both a synchronic and a diachronic frame of reference. The present paper can, however, concentrate on only a few current theoretical, methodological and empirical aspects. In disciplinary terms it proceeds contrastively and from a contact-linguistic perspective; methodologically it is oriented towards phenomena and attested examples and is problem-driven.
This study investigates supralaryngeal mechanisms of the two-way voicing contrast among German velar stops and the three-way contrast among Korean velar stops, both in intervocalic position. Articulatory data obtained via electromagnetic articulography from three Korean speakers and acoustic recordings of three Korean and three German speakers are analysed. It was found that in both languages the voicing contrast is created by more than one mechanism. However, for Korean velar stops in intervocalic position, stop closure duration is the most important parameter, whereas for German it is closure voicing. The results support the phonological description proposed by Kohler (1984).
Articulatory token-to-token variability depends not only on linguistic aspects like the phoneme inventory of a given language but also on speaker-specific morphological and motor constraints. As has been noted previously (Perkell (1997), Mooshammer et al. (2004)), speakers with coronally high "domeshaped" palates exhibit more articulatory variability than speakers with coronally low "flat" palates. One explanation for this is based on perception-oriented control by the speaker. The influence of articulatory variation on the cross-sectional area, and consequently on the acoustics, should be greater for flat palates than for domeshaped ones. This should force speakers with flat palates to place their tongue very precisely, whereas speakers with domeshaped palates might tolerate greater variability. A second explanation could be a greater amount of lateral linguo-palatal contact for flat palates, holding the tongue in position. In this study both hypotheses were tested.
This article combines a brief introduction to a particular philosophical theory of "time" with a demonstration of how this theory has been implemented in a Humanities Computing project oriented towards Literary Studies. The aim of the project was to create a model of text-based time cognition and to design customized markup and text analysis tools that help to understand "how time works": more precisely, how narratively organised and communicated information motivates readers to generate the mental image of a chronologically organized world. The approach presented is based on the unitary model of time originally proposed by McTaggart, who distinguished between two perspectives on time, the so-called A- and B-series. The first step towards a functional Humanities Computing implementation of this theoretical approach was the development of TempusMarker, a software tool providing automatic and semi-automatic markup routines for the tagging of temporal expressions in natural language texts. In the second step we discuss the principles underlying TempusParser, an analytical tool that can reconstruct the temporal order of events by way of an algorithm-driven process of analysis and recombination of textual segments, during which the "time stamp" of each segment as indicated by the temporal tags is interpreted.
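The recombination of tagged segments into chronological order can be illustrated with a small sketch. It does not reproduce TempusParser: the `Segment` class, the integer `time_stamp` field and the example sentences are hypothetical, and the real tool interprets temporal tags rather than ready-made numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    text: str
    time_stamp: int  # hypothetical: position on a story timeline, e.g. a day number


def chronological_order(segments: List[Segment]) -> List[Segment]:
    """Reorder narrative segments by their time stamps.

    sorted() is stable, so segments sharing a time stamp keep the order
    in which the narrative presents them.
    """
    return sorted(segments, key=lambda s: s.time_stamp)


# Segments in the order the text narrates them.
narrative = [
    Segment("She opened the letter.", 3),
    Segment("Two days earlier it had been posted.", 1),
    Segment("The next morning she replied.", 4),
]
story = [s.text for s in chronological_order(narrative)]
```

Here `narrative` holds the order of telling and `story` the reconstructed order of events, which is the distinction the markup is meant to make explicit.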
Heterogeneity and standardization in data, use, and annotation: a diachronic corpus of German
(2005)
This paper describes the standardization problems that come up in a diachronic corpus: it has to cope with differing standards with regard to diplomaticity, annotation, and header information. Such highly heterogeneous texts must be standardized to allow for comparative research without (too much) loss of information.