Linguistik-Klassifikation
Saying and shaking 'No'
(2021)
In many instances, the head shake can be used instead of, or in addition to, verbal 'No'. Based on previous work on negation in dialogue, we observe head shakes functioning as answer particles and as responses to an implicit or an exophoric (i.e., real-world situation) antecedent. The exophoric head shake, however, seems to come in two flavours: with positive and with negative emotional valuation of the antecedent situation. We provide semantic analyses for all three uses (and for a head nod) within a version of HPSG implemented in Type Theory with Records and the dialogue framework KoS. In particular, we extend previous work by grounding 'exophoric negation' in positive or negative appraisal. Finally, we briefly speculate about differences between verbal 'No' and head shaking due to (the lack of) simultaneity.
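As a purely illustrative aid (not the paper's TTR/KoS formalisation), the sketch below dispatches over the three head-shake uses named above; the antecedent labels and meaning glosses are assumptions.

```python
# Illustrative only: antecedent labels and glosses are assumptions,
# not the paper's TTR/KoS analysis.
def head_shake_use(antecedent: str, appraisal: str | None = None) -> str:
    """Classify a head shake by its antecedent and, for exophoric uses,
    by the emotional valuation of the antecedent situation."""
    if antecedent == "polar_question":
        return "answer particle: negative answer to the question"
    if antecedent == "exophoric" and appraisal == "negative":
        return "exophoric negation grounded in negative appraisal of the situation"
    if antecedent == "exophoric" and appraisal == "positive":
        return "exophoric negation grounded in positive appraisal of the situation"
    return "implicit antecedent: negates a proposition recovered from the dialogue context"

print(head_shake_use("exophoric", "positive"))
```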
The paper addresses verbal agreement in German Sign Language from a constraint-based perspective. Based on Meir's Agreement Morphology Principles, it presents an HPSG analysis of plain, regular and backwards agreement verbs that models the interaction between phonological (manual) features and syntactico-semantic relationships within a verbal sign by means of well-defined lexical restrictions. We argue that a sign-based declarative analysis can provide an elegant approach to agreement in sign language, since it allows cross-modular constraints to be exploited within the grammar and hence permits direct manipulation of all relevant phonological features of a verb depending on its syntactic and semantic properties.
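A minimal, purely illustrative sketch of the kind of lexical restriction at issue, assuming the common description of Meir-style agreement: a regular agreement verb's manual path runs from the subject's locus to the object's locus, a backwards verb reverses this direction, and a plain verb carries no manual agreement marking. The function and feature names are assumptions, not the paper's HPSG formalisation.

```python
def path_constraint(verb_class: str, subj_locus: str, obj_locus: str) -> dict:
    """Relate the verb's syntactico-semantic arguments to the begin/end
    points of its manual path, depending on its lexical class."""
    if verb_class == "regular":        # path from subject locus to object locus
        return {"path_begin": subj_locus, "path_end": obj_locus}
    if verb_class == "backwards":      # reversed path direction
        return {"path_begin": obj_locus, "path_end": subj_locus}
    if verb_class == "plain":          # no manual agreement marking
        return {}
    raise ValueError(f"unknown verb class: {verb_class}")

print(path_constraint("regular", "locus_1", "locus_3a"))
```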
The use of hand gestures to point at objects and individuals, or to navigate through landmarks on a virtually created map, is ubiquitous in face-to-face conversation. We take this observation as a starting point and demonstrate that deictic gestures can be analysed on a par with speech by using standard methods from constraint-based grammars such as HPSG. In particular, we use the form of the deictic signal, the form of the speech signal (including its prosodic marking) and their relative temporal performance to derive an integrated multimodal tree that maps to an integrated multimodal meaning. The integration process is constrained via construction rules that rule out ill-formed input. These rules are derived from an empirical corpus study which sheds light on the interaction between speech and deictic gesture.
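The sketch below is a rough illustration of such a construction rule, under the assumption that integration requires the gesture stroke to overlap a prosodically marked word; the data structures and the licensing condition are simplifications, not the paper's actual grammar.

```python
from dataclasses import dataclass

@dataclass
class SpeechToken:
    word: str
    start: float            # onset in seconds
    end: float              # offset in seconds
    pitch_accented: bool    # prosodic marking

@dataclass
class DeicticGesture:
    referent: str           # entity or landmark pointed at
    stroke_start: float
    stroke_end: float

def overlaps(tok: SpeechToken, g: DeicticGesture) -> bool:
    """True if the gesture stroke temporally overlaps the spoken word."""
    return g.stroke_start < tok.end and tok.start < g.stroke_end

def integrate(tok: SpeechToken, g: DeicticGesture) -> dict | None:
    """License a multimodal node only if the (assumed) construction rule
    holds: the stroke overlaps a pitch-accented word.  Ill-formed input
    yields None instead of a combined meaning."""
    if tok.pitch_accented and overlaps(tok, g):
        return {"pred": tok.word, "deictic_referent": g.referent}
    return None

tok = SpeechToken("this", 0.40, 0.62, pitch_accented=True)
gst = DeicticGesture("chair_1", 0.35, 0.70)
print(integrate(tok, gst))
```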
This paper addresses the form-meaning relation of multimodal communicative actions by means of a grammar that combines verbal input with hand gestures. Unlike speech, gesture signals are interpretable only through their semantic relation to the synchronous speech content. This relation serves to resolve the incomplete meaning conveyed by gestural form alone. We demonstrate that, by using standard linguistic methods, speech and gesture can be integrated in a constrained way into a single derivation tree that maps to a uniform meaning representation.
We measure face deformations during speech production using a motion capture system, which provides 3D coordinate data for about 60 markers glued to the speaker's face. An arbitrary orthogonal factor analysis followed by a principal component analysis (together called a guided PCA) of the data showed that the first 6 factors explain about 90% of the variance for each of our 3 speakers. The 6 derived factors therefore allow us to efficiently analyze the observed face deformations, or to reconstruct them with reasonable accuracy. Since these factors can be interpreted in articulatory terms, they can reveal underlying articulatory organizations. A comparison of lip gestures in terms of the data-derived factors suggests that these speakers maneuver the lips differently to achieve the contrast between /s/ and /R/. Such inter-speaker variability can occur because the acoustic contrast of these fricatives is shaped not only by the lip tube but also by cavities inside the mouth, such as the sublingual cavity. In other words, the lip tube and these cavities can acoustically compensate for each other to produce the required acoustic properties.
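For readers unfamiliar with the reduction step, here is a minimal sketch of extracting a small number of factors from marker data with an ordinary PCA (scikit-learn); the guided-PCA step with articulatory interpretation is not reproduced, and the data below are random stand-ins, so the variance figure will not match the paper's.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for motion-capture frames: n_frames x (60 markers * 3 coordinates)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 60 * 3))

mean_face = X.mean(axis=0)                 # neutral face (per-marker mean)
pca = PCA(n_components=6)
scores = pca.fit_transform(X - mean_face)  # 6 factor scores per frame

print(f"variance explained by 6 components: "
      f"{pca.explained_variance_ratio_.sum():.1%}")

# Approximate reconstruction of one frame from its 6 scores
frame0_approx = scores[0] @ pca.components_ + mean_face
```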
The author presents MASSY, the MODULAR AUDIOVISUAL SPEECH SYNTHESIZER. The system combines two approaches to visual speech synthesis. Two control models are implemented: a (data-based) di-viseme model and a (rule-based) dominance model, both of which produce control commands in a parameterized articulation space. Analogously, two visualization methods are implemented: an image-based (video-realistic) face model and a 3D synthetic head. Both face models can be driven by both the data-based and the rule-based articulation model.
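The modularity can be pictured roughly as follows; this is an illustrative sketch only, and all class names, the parameter name and the placeholder trajectories are assumptions rather than MASSY's actual interfaces.

```python
from abc import ABC, abstractmethod

Frame = dict[str, float]  # articulation parameters -> values at one time step

class ControlModel(ABC):
    @abstractmethod
    def commands(self, phones: list[str]) -> list[Frame]:
        """Map a phone sequence to a trajectory in the articulation space."""

class DiVisemeModel(ControlModel):      # data-based control
    def commands(self, phones):
        return [{"lip_opening": 0.5} for _ in phones]   # placeholder values

class DominanceModel(ControlModel):     # rule-based control
    def commands(self, phones):
        return [{"lip_opening": 0.4} for _ in phones]   # placeholder values

class FaceModel(ABC):
    @abstractmethod
    def render(self, frame: Frame) -> None: ...

class Synthetic3DHead(FaceModel):
    def render(self, frame):
        print("3D head renders", frame)

class ImageBasedFace(FaceModel):
    def render(self, frame):
        print("image-based face renders", frame)

def synthesize(control: ControlModel, face: FaceModel, phones: list[str]) -> None:
    """Any control model can drive any face model, because both sides
    communicate only via the shared parameterized articulation space."""
    for frame in control.commands(phones):
        face.render(frame)

synthesize(DiVisemeModel(), ImageBasedFace(), ["m", "a"])
```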
The high-level visual speech synthesis generates a sequence of control commands for the visible articulation. For every virtual articulator (articulation parameter), the 3D synthetic face model defines a set of displacement vectors for the vertices of the 3D objects of the head. The vertices of the 3D synthetic head are then moved by linear combinations of these displacement vectors to visualize articulation movements. For the image-based video synthesis, a single reference image is deformed to fit the facial properties derived from the control commands. Facial feature points and facial displacements have to be defined for the reference image. The algorithm can also use an image database with appropriately annotated facial properties. An example database was built automatically from video recordings. Both the 3D synthetic face and the image-based face generate visual speech that can increase the intelligibility of the audible speech.
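The linear combination of displacement vectors can be sketched as below; array shapes, parameter names and values are assumptions chosen for illustration, not the system's actual data.

```python
import numpy as np

def deform_head(base_vertices: np.ndarray,
                displacements: dict[str, np.ndarray],
                commands: dict[str, float]) -> np.ndarray:
    """Move the head's vertices by a linear combination of per-articulator
    displacement vectors, weighted by the current control commands.

    base_vertices: (n_vertices, 3) neutral head
    displacements: articulation parameter -> (n_vertices, 3) offsets
    commands:      articulation parameter -> current command value
    """
    deformed = base_vertices.copy()
    for name, value in commands.items():
        deformed += value * displacements[name]
    return deformed

# Toy example with 4 vertices and two articulation parameters
base = np.zeros((4, 3))
disp = {"jaw_opening":  np.tile([0.0, -1.0, 0.0], (4, 1)),
        "lip_rounding": np.tile([0.0,  0.0, 0.5], (4, 1))}
print(deform_head(base, disp, {"jaw_opening": 0.8, "lip_rounding": 0.3}))
```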
Other well-known image-based audiovisual speech synthesis systems such as MIKETALK and VIDEO REWRITE concatenate pre-recorded single images or video sequences, respectively. Parametric talking heads like BALDI control a parametric face with a parametric articulation model. The presented system demonstrates the compatibility of parametric and data-based visual speech synthesis approaches.