Linguistik
This paper describes the processing of MRI and CT images needed for developing a 3D linear articulatory model of the velum. The 3D surface that defines each organ constitutive of the vocal and nasal tracts is extracted from MRI and CT images recorded on a subject uttering a corpus of artificially sustained French vowels and consonants. First, the 2D contours of the organs have been manually extracted from the corresponding images, expanded into 3D contours, and aligned in a common 3D coordinate system. Then, for each organ, a generic mesh has been chosen and fitted by elastic deformation to each of the 46 3D shapes of the corpus. This has finally resulted in a set of organ surfaces sampled with the same number of 3D vertices for each articulation, which is appropriate for Principal Component Analysis or linear decomposition. The analysis of these data has uncovered two main uncorrelated articulatory degrees of freedom for the velum's movement. The associated parameters are used to control the model. We have in particular investigated the question of a possible correlation between jaw/tongue movement and velum movement, and have not found more correlation than that found in the corpus.
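The decomposition step described in this abstract can be illustrated with a minimal sketch. It assumes synthetic data in place of the MRI/CT meshes: the vertex count, the random "modes" and the noise level are hypothetical, and plain PCA via SVD stands in for whatever linear decomposition the authors actually used. Only the number of shapes (46) and the idea that every articulation is sampled with the same vertices come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_shapes, n_vertices = 46, 200  # 46 articulations as in the text; vertex count is hypothetical

# Synthetic stand-in for the fitted meshes: each row is one articulation,
# flattened to (x, y, z) for every vertex. Two latent degrees of freedom
# plus small noise are assumed purely for illustration.
mean_shape = rng.normal(size=n_vertices * 3)
modes = rng.normal(size=(2, n_vertices * 3))
scores = rng.normal(size=(n_shapes, 2)) * np.array([3.0, 1.5])
X = mean_shape + scores @ modes + 0.01 * rng.normal(size=(n_shapes, n_vertices * 3))

# PCA via SVD of the centred data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)  # fraction of variance per component

# Control parameters of the linear model: component scores for each shape.
params = U[:, :2] * s[:2]

# Rebuild every shape from the mean and the first two components only.
recon = X.mean(axis=0) + params @ Vt[:2]

print("variance explained by 2 components:", explained[:2].sum())
```

Because every mesh has the same vertices in the same order, each articulation becomes one row of a fixed-width matrix, which is what makes this kind of analysis applicable. The component scores are uncorrelated by construction.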
Articulatory token-to-token variability not only depends on linguistic aspects like the phoneme inventory of a given language but also on speaker specific morphological and motor constraints. As has been noted previously (Perkell (1997), Mooshammer et al. (2004)), speakers with coronally high "domeshaped" palates exhibit more articulatory variability than speakers with coronally low "flat" palates. One explanation for that is based on perception oriented control by the speaker. The influence of articulatory variation on the cross sectional area and consequently on the acoustics should be greater for flat palates than for domeshaped ones. This should force speakers with flat palates to place their tongue very precisely whereas speakers with domeshaped palates might tolerate a greater variability. A second explanation could be a greater amount of lateral linguo-palatal contact for flat palates holding the tongue in position. In this study both hypotheses were tested.
In order to investigate the influence of the palate shape on the variability of the acoustic output, a modelling study was carried out. In parallel, an EPG experiment was conducted in order to investigate the relationship between palate shape, articulatory variability and linguo-palatal contact.
Results from the modelling study suggest that the acoustic variability resulting from a certain amount of articulatory variability is higher for flat palates than for domeshaped ones. Results from the EPG experiment with 20 speakers show that (1) speakers with a flat palate exhibit a very low articulatory variability whereas speakers with a domeshaped palate vary, (2) there is less articulatory variability if there is a lot of linguo-palatal contact and (3) there is no relationship between the amount of lateral linguo-palatal contact and palate shape. The results suggest that there is a relationship between token-to-token variability and palate shape. However, it is not that the two parameters correlate, but that speakers with a flat palate always have a low variability because of constraints on the variability range of the acoustic output, whereas speakers with a domeshaped palate may choose the degree of variability. Since linguo-palatal contact and variability correlate, it is assumed that linguo-palatal contact is a means of reducing articulatory variability.
Low-dimensional and speaker-independent linear vocal tract parametrizations can be obtained using the 3-mode PARAFAC factor analysis procedure first introduced by Harshman et al. (1977) and discussed in a series of subsequent papers in the Journal of the Acoustical Society of America (Jackson (1988), Nix et al. (1996), Hoole (1999), Zheng et al. (2003)). Nevertheless, some questions of importance have been left unanswered, e.g. none of the papers using this method has provided a consistent interpretation of the terms usually referred to as "speaker weights". This study attempts an exploration of what influences their reliability as a first step towards their consistent interpretation. With this in mind, we undertook a systematic comparison of the classical PARAFAC1 algorithm with a relaxed version of it, PARAFAC2. This comparison was carried out on two different corpora acquired by the articulograph, which varied in vowel qualities, consonantal contexts, and the paralinguistic features accent and speech rate. The difference between these statistical approaches can roughly be described as follows: In PARAFAC1, observation units pertain to the same set of variables and the observation units are comparable. In PARAFAC2, observations pertain to the same set of variables, but observation units are not comparable. Such a situation is easily conceivable in the case we are describing: The operationalization we took relies on the comparability of fleshpoint data acquired from different speakers, which need not be a good assumption due to influences like sensor placement and morphological conditions.
In particular, the comparison between the two different approaches is carried out by means of so-called "leverages" on different component matrices originating in regression analysis, calculated as $v = \mathrm{diag}(A(A^{T}A)^{-1}A^{T})$ and delivering information on how "influential" a particular loading matrix is for the model. This analysis could potentially be carried out component by component, but we confined ourselves to effects on the global factor structure. For vowels, the most influential loadings are those for the tense cognates of non-palatal vowels. For speakers, the most prominent result is the relative absence of effects of the paralinguistic variables. Results generally indicate that there is quite little influence of the model specification (i.e. PARAFAC1 or PARAFAC2) on vowel and subject components. The patterns for the articulators indicate that there are strong differences between speakers with respect to the most influential measurement as revealed by PARAFAC2: In particular, the most influential y-contribution is the tongue-back for some talkers and the tongue-dorsum for other speakers. With respect to the speaker weights, again, the leverage patterns are very similar for both PARAFAC-versions. These patterns converge with the results of the loading plots, where the articulator profiles seem to be most altered by the use of PARAFAC2. These findings, in general, are interpreted as evidence for the reliability of the PARAFAC1 speaker weights.
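As a hedged illustration of the leverage measure mentioned in this abstract, the sketch below computes the diagonal of the projection ("hat") matrix of a loading matrix A. The function name `leverages` and the 5 x 2 example matrix are hypothetical and are not taken from the study; the rows stand for levels of one mode (e.g. vowels or speakers) and the columns for components.

```python
import numpy as np

def leverages(A):
    """Return diag(A (A^T A)^{-1} A^T) for a loading matrix A.

    Rows are levels of one mode, columns are components.
    """
    A = np.asarray(A, dtype=float)
    # Solve (A^T A) X = A^T rather than forming the inverse explicitly.
    X = np.linalg.solve(A.T @ A, A.T)
    # Entry i is row i of A times column i of X, i.e. the i-th diagonal element.
    return np.einsum("ij,ji->i", A, X)

# Hypothetical loading matrix: 5 levels, 2 components.
A = np.array([
    [1.0, 0.2],
    [0.9, -0.1],
    [0.1, 1.1],
    [0.0, 0.8],
    [0.5, 0.5],
])
v = leverages(A)
print(v)
```

Each leverage lies between 0 and 1, and the values sum to the number of components when A has full column rank, so a level with a value close to 1 dominates the fitted factor structure.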
Rate effects on aerodynamics of intervocalic stops: evidence from real speech data and model data
(2008)
This paper is a first attempt towards a better understanding of the aerodynamic properties during speech production and their potential control. In recent years, studies on intraoral pressure in speech have been rather rare, and more studies concern the air flow development. However, the intraoral pressure is a crucial factor for analysing the production of various sounds.
In this paper, we focus on the intraoral pressure development during the production of intervocalic stops.
Two experimental methodologies are presented and compared with each other: real speech data recorded for four native speakers of German, and model data obtained from a mechanical replica that reproduces the main physical mechanisms occurring during phonation. The two methods are presented and applied to a study on the influence of speech rate on aerodynamic properties.
We focus in this paper on two prosodic phenomena in Chimwiini: vowel length and accent (or High tone). Vowel length is determined in part by a lexical distinction between long and short vowels, and also by various morphophonemic processes that derive long vowels. Accent is penult in the default case, but final under certain morphosyntactic conditions. In order to account for the distribution of vowel length and the location of accents in a Chimwiini sentence, it is necessary to segment sentences into a sequence of phonological phrases. This paper examines the phonological phrasing of both canonical relative clauses and what we refer to as "pseudo-relative" clauses. An account of relative clause phrasing is of critical importance in Chimwiini due to the extensive use of pseudo-relatives in the language. Close examination of the pseudo-relatives reveals that their phrasing is not exactly the same as the phrasing of canonical relative clauses.
This paper examines how questions, both Wh-questions and yes-no questions, are phrased in Chimwiini, a Bantu language spoken in southern Somalia. Questions do not require any special phrasing principles, but Wh-questions do provide much evidence in support of the principle Align-Foc R, which requires that focused or emphasized words/constituents be located at the end of a phonological phrase. Question words and enclitics are always focused and thus appear at the end of a phrase. Although questions do not require any new phrasing principles, they do display complex accentual (tonal) behavior. This paper attempts to provide an account of these accentual phenomena.
Studying kinematic behavior in speech production is an indispensable and fruitful methodology for describing, for instance, phonemic contrasts, allophonic variations, and prosodic effects in articulatory movements. More intriguingly, it is also interpreted with respect to its underlying control mechanisms. Several interpretations have been borrowed from motor control studies of arm, eye, and limb movements. They either explain kinematics with respect to a fine-tuned control by the Central Nervous System (CNS) or they take into account a combination of influences arising from motor control strategies at the CNS level and from the complex physical properties of the peripheral speech apparatus. We assume that the latter is more realistic and ecological. The aims of this article are: first, to show, via a literature review related to the so-called '1/3 power law' in human arm motor control, that this debate is of first importance in human motor control research in general. Second, to study a number of speech specific examples offering a fruitful framework to address this issue. However, it is also suggested that speech motor control differs from general motor control principles in the sense that it uses specific physical properties such as vocal tract limitations, aerodynamics and biomechanics in order to produce the relevant sounds. Third, experimental and modelling results are described supporting the idea that the three properties are crucial in shaping speech kinematics for selected speech phenomena. Hence, caution should be taken when interpreting kinematic results based on experimental data alone.
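The '1/3 power law' referred to here is commonly stated as tangential speed being proportional to path curvature raised to the power -1/3. The sketch below is an illustration under that assumption, not an analysis from the article: it traces a hypothetical ellipse with simple harmonic motion, a purely geometric case that satisfies the law exactly with no controller involved, and recovers the exponent by a log-log fit.

```python
import numpy as np

# Hypothetical ellipse x = a*cos(t), y = b*sin(t) traced at a uniform phase rate.
a, b = 2.0, 1.0
t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)

# Analytic first and second time derivatives of the trajectory.
dx, dy = -a * np.sin(t), b * np.cos(t)
ddx, ddy = -a * np.cos(t), -b * np.sin(t)

speed = np.hypot(dx, dy)                                # tangential speed
curvature = np.abs(dx * ddy - dy * ddx) / speed**3      # planar curvature

# Fit log(speed) = slope * log(curvature) + intercept.
slope, intercept = np.polyfit(np.log(curvature), np.log(speed), 1)
print("fitted exponent:", slope)
```

That such a trajectory obeys the law by construction shows why the exponent alone cannot separate central control strategies from properties of the movement itself, which is the kind of caution the abstract calls for.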
The present paper offers a summary of the results of two earlier experiments (Nawrocki and Gonet 2004; Nawrocki 2004), in which acoustic properties of the voiceless velar fricative phoneme /x/ in Southern Polish were investigated.
As found in both studies (Nawrocki and Gonet 2004; Nawrocki 2004), speakers of both genders favour glottal articulation, with partial or full voicing. Word-final contexts are decisively in favour of [x]. The word-initial, prevocalic positions seem to allow quite a number of allophonic variants of /x/. These are: [x], [ɦ], [ç] and, additionally, the voiceless glottal, the pharyngeal or the epiglottal [h]/[ħ]/[ʜ]. Another factor taken into account is the coarticulation effect of the vocalic context on the choice of articulation. Based on the results of the experiments, a reformulated allophonic composition is proposed for Polish /x/. It makes room for previously unconsidered pharyngeal and glottal allophones.
In order to inspect the acoustic properties of the allophones of Polish /x/ further, their static and dynamic spectral features are compared to those of phonetically similar sounds in other languages where they have the status of independent phonemes. Special attention is paid to the distribution of spectral peaks and their intensity. The fact that in Polish there are no 'back' fricative phonemes that would contrast with /x/ creates a wide range of acceptable allophonic articulations that cannot be challenged from either articulatory or perceptual points of view.
The present article illustrates that the specific articulatory and aerodynamic requirements for voiced but not voiceless alveolar or dental stops can cause tongue tip retraction and tongue mid lowering and thus retroflexion of front coronals. This retroflexion is shown to have occurred diachronically in the three typologically unrelated languages Dhao (Malayo-Polynesian), Thulung (Sino-Tibetan), and Afar (East-Cushitic). In addition to the diachronic cases, we provide synchronic data for retroflexion from an articulatory study with four speakers of German, a language usually described as having alveolar stops. With these combined data we supply evidence that voiced retroflex stops (as the only retroflex segments in a language) did not necessarily emerge from implosives, as argued by Haudricourt (1950), Greenberg (1970), Bhat (1973), and Ohala (1983). Instead, we propose that the voiced front coronal plosive /d/ is generally articulated in a way that favours retroflexion, that is, with a smaller and more retracted place of articulation and a lower tongue and jaw position than /t/.
We measure face deformations during speech production using a motion capture system, which provides 3D coordinate data of about 60 markers glued on the speaker's face. An arbitrary orthogonal factor analysis followed by a principal component analysis (together called a guided PCA) of the data has shown that the first 6 factors explain about 90% of the variance for each of our 3 speakers. The 6 derived factors therefore allow us to efficiently analyze or to reconstruct with reasonable accuracy the observed face deformations. Since these factors can be interpreted in articulatory terms, they can reveal underlying articulatory organizations. The comparison of lip gestures in terms of data-derived factors suggests that these speakers maneuver the lips differently to achieve contrast between /s/ and /R/. Such inter-speaker variability can occur because the acoustic contrast of these fricatives is shaped not only by the lip tube but also by cavities inside the mouth such as the sublingual cavity. In other words, the tube and the cavities can acoustically compensate for each other to produce the required acoustic properties.