ZASPiL 40 = Speech production and perception : Experimental analyses and models
Four speakers produced eight repetitions of 15 sentences containing 'pVp' syllables (V being /a/, /i/ and /u/). The 'pVp' syllables were located in final, penultimate and antepenultimate position relative to the Intonational Phrase (IP) boundary. They were embedded in lexical words of 1-3 syllables and were either word-initial or word-final. Results show that the closer the vowel in word-final position is to the IP boundary, the longer its duration and the higher its fundamental frequency; it is also characterised by larger lip opening gestures. The potential reduction or coarticulation of vowels in word-initial position compared to their counterparts in word-final position is discussed.
This paper describes the processing of MRI and CT images needed for developing a 3D linear articulatory model of the velum. The 3D surface that defines each organ making up the vocal and nasal tracts is extracted from MRI and CT images recorded on a subject uttering a corpus of artificially sustained French vowels and consonants. First, the 2D contours of the organs were manually extracted from the corresponding images, expanded into 3D contours, and aligned in a common 3D coordinate system. Then, for each organ, a generic mesh was chosen and fitted by elastic deformation to each of the 46 3D shapes of the corpus. This finally resulted in a set of organ surfaces sampled with the same number of 3D vertices for each articulation, which is appropriate for Principal Component Analysis or linear decomposition. The analysis of these data uncovered two main uncorrelated articulatory degrees of freedom for the velum's movement. The associated parameters are used to control the model. We have in particular investigated the question of a possible correlation between jaw / tongue movement and velum movement and have not found more correlation than that found in the corpus.
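The decomposition step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each articulation is flattened into one row of vertex coordinates and that plain Principal Component Analysis is applied. The data below are synthetic, and every name (`shapes`, `mode1`, `mode2`, `pca`) is hypothetical. The abstract does not specify the authors' actual procedure, so this is not a reproduction of it. Power iteration is used here only to keep the example dependency-free.

```python
import math
import random

def center(rows):
    """Subtract the mean shape (column means) from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dominant_eigenpair(cov, iters=500):
    """Power iteration on a symmetric positive semi-definite matrix."""
    v = [1.0] * len(cov)
    for _ in range(iters):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return v, eigenvalue

def pca(rows, n_components):
    """Return components, their variances, total variance, and per-shape scores."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, variances = [], []
    for _ in range(n_components):
        v, lam = dominant_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found before extracting the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    scores = [[sum(a * b for a, b in zip(row, v)) for v in components]
              for row in centered]
    return components, variances, total_variance, scores

# Synthetic stand-in for the corpus: 46 shapes, 5 vertices x 3 coordinates each,
# generated from two hidden control parameters plus small noise (all assumed).
random.seed(0)
N_SHAPES, N_COORDS = 46, 15
mean_shape = [float(k) for k in range(N_COORDS)]
mode1 = [math.sin(k + 1) for k in range(N_COORDS)]
mode2 = [math.cos(2 * k + 1) for k in range(N_COORDS)]
shapes = []
for _ in range(N_SHAPES):
    p1, p2 = random.gauss(0, 2.0), random.gauss(0, 1.0)
    shapes.append([m + p1 * a + p2 * b + random.gauss(0, 0.01)
                   for m, a, b in zip(mean_shape, mode1, mode2)])

components, variances, total_variance, scores = pca(shapes, 2)
print("explained variance ratio:", sum(variances) / total_variance)
```

Sampling every articulation with the same number of vertices is what makes the rows of `shapes` comparable. Each row of `scores` then holds two uncorrelated parameter values for one articulation, analogous to the control parameters described above.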
The contribution of von Kempelen's "Mechanism of Speech" to the 'phonetic sciences' will be analyzed with respect to his theoretical reasoning on speech and speech production on the one hand and, on the other, his practical insights gained during his struggle to construct a speaking machine. Whereas in his theoretical considerations von Kempelen's view is focussed on the natural functioning of the speech organs – cf. his membranous glottis model – in constructing his speaking machine he clearly orients himself towards the auditory result – cf. the bagpipe model used instead for the sound generator of the speaking machine. Concerning vowel production his theoretical description remains questionable, but his practical insight that vowels and speech sounds in general are only perceived correctly in connection with their surrounding sounds – i.e. the discovery of coarticulation – is clearly a milestone in the development of the phonetic sciences: He therefore dispenses with the Kratzenstein tubes, although they might have been based on more thorough acoustic modelling.
Finally, von Kempelen's model of speech production will be discussed in relation to the subsequent debate on the acoustic nature of vowels [Willis and Wheatstone as well as von Helmholtz and Hermann in the 19th century and Stumpf, Chiba & Kajiyama as well as Fant and Ungeheuer in the 20th century].
A visual articulatory model and its application to therapy of speech disorders : a pilot study
(2005)
A visual articulatory model based on static MRI data of isolated sounds and its application in the therapy of speech disorders is described. The model is capable of generating video sequences of articulatory movements or still images of articulatory target positions within the midsagittal plane. On the basis of this model, (1) a visual stimulation technique for the therapy of patients suffering from speech disorders and (2) a rating test for visual recognition of speech movements were developed. Results indicate that patients achieve recognition rates above chance level even without any training and that they are capable of significantly increasing their recognition rate over the course of therapy.
A fundamental question in the study of speech concerns the invariance of the ultimate percepts, or features. The present paper gives an overview of the noninvariance problem and offers some hints towards a solution. Examination of various data on place and voicing perception suggests the following points. Features correspond to natural boundaries between sounds, which are included in the infant's predispositions for speech perception. Adult percepts arise from couplings and contextual interactions between features. Both couplings and interactions contribute to invariance. But this comes at the expense of profound qualitative changes in perceptual boundaries, implying that features are neither independently nor invariantly perceived. The question then is to understand the principles which guide feature couplings and interactions during perceptual development. The answer might reside in the fact that: (1) adult boundaries converge to a single point of the perceptual space, suggesting a context-free central reference; (2) this point corresponds to the neutral vocoïd, suggesting the reference is related to production; (3) at this point perceptual boundaries correspond to the natural ones, suggesting the reference is anchored in predispositions for feature perception. In sum, perceptual invariance seems to be grounded in a radial representation of the vocal tract around a singular point at which boundaries are context-free, natural and coincide with the neutral vocoïd.