Lecture: Keynote: Multimodal marking of prominence in communication

Frank Kügler

Following a multimodal conception of language [1], in this talk I will discuss the contribution of speech prosody and co-speech gestures to the expression of prominence in communication. While speaking, interlocutors usually transfer information to their conversation partners to update the speaker-hearer common ground. The organisation of this information transfer is subsumed under the notion of ‘information structure’ [2]. ‘Focus’, a cognitive category of information structure that represents the most important information in a sentence, usually carries high prominence [3]. In Germanic languages, a focus is expressed prosodically by means of a pitch accent that is acoustically enhanced compared to the prosodic marking of less prominent information [e.g., 4].
From a visual perspective, co-speech gestures have been shown to coordinate with prosodic events [e.g., 5, 6]. Co-speech gestures are classified into different types according to differences in function [7]. In the present study, I will focus on iconic (referential) and beat (non-referential) gestures. An iconic gesture represents or supports a part of speech visually, e.g., a rounded window depicted by forming a circle with both hands. A beat gesture may signal linguistic structure, such as focused information, by a single stroke of a stretched finger. Recent research, however, has shown that gestural category boundaries may overlap in multiple dimensions such as form, function, and meaning [e.g., 8, 9]. To investigate the gesture-prosody link in more detail, the multidimensional hypothesis predicts that gestures behave similarly in their coordination with prosody, independent of gesture type.
Speech data are taken from the Bielefeld Speech and Gesture Alignment (SaGA) corpus [10]. The results show that focused constituents are always marked with a pitch accent, and that between 25% and 35% of those are additionally accompanied by a co-speech gesture. Regarding the temporal synchronisation of gestures with the closest pitch accent, an interesting pattern arises: gestures synchronise more closely with the pitch accent in focused than in non-focused constituents. This pattern points to two interesting facts. First, focus goes hand in hand with greater articulatory effort not only at the segmental level [11] and in prosody [e.g., 12] but also at the visual level, in terms of gestures and their coordination with speech. This holds for both iconic and beat gestures. Second, a pragmatic function of highlighting seems to be added to iconic gestures, beyond their otherwise ascribed dimension of expressing a semantic relation to speech concepts. Based on the data, I conclude that co-speech gestures relate to phonological structure and thus serve important functions of signalling linguistic structure and highlighting information.
References
[1] P. Perniss, “Why We Should Study Multimodal Language,” Front. Psychology, vol. 9, p. 1109, 2018, doi: 10.3389/fpsyg.2018.01109.
[2] M. Krifka, “Basic notions of information structure,” Acta Linguistica Hungarica, vol. 55, no. 3, pp. 243–276, 2008.
[3] F. Kügler and S. Calhoun, “Prosodic Encoding of Information Structure: A typological perspective,” in The Oxford Handbook of Language Prosody, C. Gussenhoven and A. Chen, Eds., Oxford: Oxford University Press, 2020, pp. 453–467.
[4] C. Féry and F. Kügler, “Pitch accent scaling on given, new and focused constituents in German,” Journal of Phonetics, vol. 36, no. 4, pp. 680–703, 2008.
[5] S. Shattuck-Hufnagel, Y. Yasinnik, N. Veilleux, and M. Renwick, “A method for studying the time-alignment of gestures and prosody in American English: 'Hits' and pitch accents in academic-lecture-style speech,” in Fundamentals of verbal and nonverbal communication and the biometric issue, A. Esposito, M. Bratanic, E. Keller, and M. Marinaro, Eds., Amsterdam: IOS Press, 2007, pp. 34–44.
[6] D. P. Loehr, “Temporal, structural, and pragmatic synchrony between intonation and gesture,” Laboratory Phonology: Journal of the Association for Laboratory Phonology, vol. 3, no. 1, pp. 71–89, 2012, doi: 10.1515/lp-2012-0006.
[7] D. McNeill, Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press, 1992.
[8] D. McNeill, “Gesture: a psycholinguistic approach,” in Encyclopedia of language and linguistics, E. K. Brown, Ed., 2nd ed., Amsterdam: Elsevier, 2006, pp. 58–66.
[9] P. L. Rohrer et al., The MultiModal MultiDimensional (M3D) labeling system for the annotation of audiovisual corpora: Gesture Labeling Manual. UPF Barcelona, 2020.
[10] A. Lücking, K. Bergmann, F. Hahn, S. Kopp, and H. Rieser, “The Bielefeld Speech and Gesture Alignment Corpus (SaGA),” in LREC 2010 Workshop: Multimodal Corpora–Advances in Capturing, Coding and Analyzing Multimodality, 2010, pp. 92–98.
[11] B. Lindblom, “Explaining phonetic variation: A sketch of the H&H theory,” in Speech Production and Speech Modelling, W. J. Hardcastle and A. Marchal, Eds., Dordrecht: Kluwer, 1990, pp. 403–439.
[12] J. Hanssen, J. Peters, and C. Gussenhoven, “Prosodic Effects of Focus in Dutch Declaratives,” in Proceedings of Speech Prosody 2008 Conference, 2008, pp. 609–612.

Info

Day: 2023-05-27
Start time: 10:00
Duration: 01:00
Room: SKW B
