Purpose: This study investigated the effects of visually presented speech envelope information with various modulation rates and depths on audiovisual speech perception in noise.
Method: Forty adults (21.25 ± 1.45 years) participated in audiovisual sentence recognition measurements in noise. Target sentences were presented auditorily in multitalker babble noise at a -3 dB SNR. Acoustic amplitude envelopes of the target signals were extracted either through low-pass filters with different cutoff frequencies (4, 10, and 30 Hz) at a fixed modulation depth of 100% (Experiment 1) or with various modulation depths (0%, 25%, 50%, 75%, and 100%) at a fixed 10-Hz modulation rate (Experiment 2). The extracted target envelopes were synchronized with the amplitude of a sphere and presented as visual stimuli. Subjects were instructed to attend to both the auditory and visual stimuli of the target sentences and to type their answers. Sentence recognition accuracy was compared between the audio-only and audiovisual conditions.
Results: In Experiment 1, speech intelligibility improved significantly relative to the audio-only condition when the visual analog (a sphere) was synchronized with the acoustic amplitude envelope at a 10-Hz modulation rate. In Experiment 2, the visual analog with 75% modulation depth resulted in better audiovisual speech perception in noise than the other modulation depth conditions.
Conclusion: An abstract visual analog of acoustic amplitude envelopes can be efficiently delivered by the visual system and integrated online with auditory signals to enhance speech perception in noise, independent of particular articulation movements.
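The envelope manipulation described in the Method can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the filter design or how modulation depth was scaled, so the one-pole low-pass filter, the mean-centered depth scaling, the toy input signal, and the function names `extract_envelope` and `apply_modulation_depth` are all assumptions made for illustration.

```python
import math

def extract_envelope(signal, fs, cutoff_hz):
    """Rectify the waveform, then smooth it with a one-pole low-pass filter.

    Assumption: a first-order filter stands in for whatever low-pass
    filter the study actually used.
    """
    # Smoothing coefficient of a first-order low-pass with the given cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    envelope, state = [], 0.0
    for x in signal:
        # Full-wave rectification followed by exponential smoothing.
        state += alpha * (abs(x) - state)
        envelope.append(state)
    return envelope

def apply_modulation_depth(envelope, depth):
    """Scale envelope fluctuations around their mean; depth is in [0, 1].

    Assumption: depth 0 yields a constant (unmodulated) envelope and
    depth 1 leaves the extracted envelope unchanged.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    mean = sum(envelope) / len(envelope)
    return [mean + depth * (e - mean) for e in envelope]

# Toy input (hypothetical): a 200-Hz carrier amplitude-modulated at 4 Hz,
# sampled at 8 kHz for 1 second.
fs = 8000
tone = [
    (1.0 + math.sin(2 * math.pi * 4 * n / fs)) * math.sin(2 * math.pi * 200 * n / fs)
    for n in range(fs)
]

# Experiment 1 style: vary the cutoff (4, 10, 30 Hz) at full depth.
env_10hz = extract_envelope(tone, fs, cutoff_hz=10)

# Experiment 2 style: fix the 10-Hz cutoff and vary the depth.
flat = apply_modulation_depth(env_10hz, 0.0)
half = apply_modulation_depth(env_10hz, 0.5)
full = apply_modulation_depth(env_10hz, 1.0)
```

In this sketch, the scaled envelope would then drive the visual stimulus frame by frame; how the study mapped envelope values onto the sphere is not specified in the abstract.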
Previous investigations of the contributions that acoustic cues (fundamental frequency, amplitude envelope, etc.) make to speech perception in individuals with profound hearing loss considered speechreading the most important source of speech information (Breeuwer & Plomp, 1984; Grant et al., 1985). However, it is well documented in the literature that speechreading alone is an inadequate means of transmitting speech information successfully and efficiently. Researchers have therefore studied other facilitating sensory modalities and the capacity of each modality to complement speechreading (Hill et al., 1968). To transmit optimal speech information successfully via a substitute sensory channel, these studies explored two significant issues: (a) the necessity to extract optimal speech signals from alternate sensory modalities for adequate information delivery and (b) the necessity that alternative transmission channels should be stimulated...