Read my lips

When George H.W. Bush, in his now-famous 1988 speech, said, “Read my lips: no new taxes,” I don’t think he had the benefits of lip-reading in mind. Jokes aside, have you ever wondered how much of someone’s speech you can understand by looking at (‘reading’) their lips alone? Most people are pretty poor at this task. What is really interesting, though, is what happens when you can both ‘hear’ and ‘see’ the talker. Performance is much better with access to both auditory and visual information than with either alone. More surprisingly, this improvement is greater than what you would get by simply adding up the performance in each condition separately. Access to both makes a huge difference in speech intelligibility. Notably, the benefit from visual information is especially pronounced in noisy environments (Sumby and Pollack, 1954).

The McGurk Effect is a powerful example of the influence of vision on our ability to understand speech (McGurk and MacDonald, 1976). In their classic experiment, McGurk and MacDonald took a recording of a woman saying the syllable /ba/ and dubbed it onto a video of lip movements corresponding to /ga/. When this dubbed movie was presented to adults, they reported hearing neither /ba/ nor /ga/, but /da/. This experiment and its variants have been repeated numerous times since, with the same result. If you close your eyes and just listen to the movie, you hear /ba/. Conversely, if you just watch the video with no audio, you see /ga/. This suggests that when we are both seeing and hearing, we cannot ignore the visual input and process the auditory information alone. What is even more amazing about the McGurk Effect is that it is involuntary: even when we know that the video and audio are mismatched, we still hear /da/. Intrigued? See/hear for yourself by checking out this video.

Another example of the effect of vision on audition is the well-known ‘ventriloquism effect’: irrespective of where the auditory stimulus is actually located, the listener perceives it as coming from the location of the visual stimulus. If you want to see a ventriloquist in action, here is a short video.

When both auditory and visual speech information is available, listeners with hearing loss understand speech much better than when they have access to auditory information alone (Kaiser et al., 2003). To assess the ability to benefit from visual speech information, audiologists routinely administer the CUNY (City University of New York) sentence test to listeners with hearing loss. CUNY sentences are presented in the auditory (hearing only), visual (seeing only), and audiovisual (hearing and seeing) conditions. Audiovisual (AV) benefit is then typically calculated as the difference in performance between the audiovisual and auditory-only (A) conditions (i.e., AV benefit = AV − A). Many cochlear implant centers around the US routinely include this test in their pre- and post-implant assessments. When assessing cochlear implant candidacy, good audiovisual benefit typically indicates a good prognosis for cochlear implant outcomes.
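If you like to see the arithmetic spelled out, here is a minimal Python sketch of that calculation. The scores are hypothetical percent-correct values, made up purely for illustration; they are not from any actual CUNY test administration:

```python
# Minimal sketch of the AV-benefit calculation described above.
# The scores are hypothetical percent-correct values chosen for
# illustration, not data from a real CUNY test session.

def av_benefit(av_score: float, a_score: float) -> float:
    """Audiovisual benefit as defined above: AV benefit = AV - A."""
    return av_score - a_score

# Example: a listener scores 45% correct with auditory information alone
# and 82% correct when auditory and visual information are combined.
a_score = 45.0   # auditory-only (A) condition, percent correct
av_score = 82.0  # audiovisual (AV) condition, percent correct

print(f"AV benefit: {av_benefit(av_score, a_score):.1f} percentage points")
# -> AV benefit: 37.0 percentage points
```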

Usually, we think of hearing and vision as the important senses for understanding speech. This is true for most of us, but touch can also convey meaningful speech information. For instance, people who are deaf-blind may rely on touch as a method of communication. Hearing and vision, along with our other senses (smell, taste, and touch), together help us perceive our surroundings. Imagine eating without being able to see what you are eating, or smell it. It wouldn’t be the same, would it? Taste, sight, and smell together contribute to the eating experience. Similarly, hearing, vision, and sometimes even touch together contribute to the experience of speech communication. Hopefully this post has convinced you of that. Hope you enjoyed it, and as always, your feedback is welcome.

Note: As an aside, when you can see the talker’s whole face, you are ‘speech-reading’, as opposed to ‘lip-reading’, which refers to seeing just the mouth and lips.

REFERENCES:

  1. Kaiser, A. R., Kirk, K. I., Lachs, L., and Pisoni, D. B. (2003). “Talker and lexical effects on audiovisual word recognition by adults with cochlear implants,” Journal of Speech, Language, and Hearing Research 46, 390-404.
  2. McGurk, H., and MacDonald, J. (1976). “Hearing lips and seeing voices,” Nature 264, 746-748.
  3. Sumby, W. H., and Pollack, I. (1954). “Visual contribution to speech intelligibility in noise,” Journal of the Acoustical Society of America 26, 212-215.
 Copyright © 2023 Vidya Krull. All Rights Reserved.