The ability to comprehend speech through listening may at first appear to be a simple task, but when we consider the complex nature of speech perception we find it is not so easy. Acoustic cues must be extracted from the signal, held in sensory memory, and mapped onto linguistic structure. To understand this process we need to consider the stimulus presented and the factors that shape how we perceive it. Given the complexity and variation in the acoustic patterns produced by speech, our seemingly effortless recognition of speech is the product of a complex speech perception system. While we can describe speech patterns in terms of frequency, we also need to consider the meanings these sounds convey when strung together into sentences, and the influence this meaning has on perception. Speech perception can be understood as the interaction between top-down, knowledge-based processing and bottom-up processing driven by the incoming acoustic signal.

The sounds produced in speech are patterns of pressure changes in the air, called the acoustic signal. The signal is created when the vocal apparatus modifies the pressure of the air released from the lungs; the vocal tract changes shape according to the movement of the articulators, including the lips, teeth and tongue. Understanding the act of speech production is therefore important to our understanding of speech perception. Speech is made up of phonemes, syllables and words that are combined to form a continuous stream of units. A phoneme is the smallest unit of speech that, when changed, changes the meaning of a word. On their own phonemes do not convey meaning, but when combined they form words.
Our perception of phonemes is affected by context: how phonemes are arranged in words changes the way they sound, yet we perceive the sound as the same due to perceptual constancy, a phenomenon found not only in hearing but also in vision. There is also wide variation in the acoustic signal due to the variety of speakers, accents and pronunciations, and the listener must translate all these differences into understandable words. Distinguishing phonemes and perceiving words depends on our ability to perceive breaks between words, yet the acoustic signal is continuous, with breaks that do not line up with the breaks between words. This continuous stream of speech is apparent when hearing a person speak a foreign language: without knowledge of the sound meanings, it is difficult to determine where one word ends and another begins. Speech segmentation is influenced by our knowledge of the language and by context. The way phonemes are ordered in a language can be described in terms of transitional probabilities, the likelihood that one sound will follow another; these transitional probabilities are learnt through statistical learning that begins in infancy (Goldstein, 2007).

Speech is also multi-modal, meaning that our perception of the acoustic signal is influenced by the other senses. The McGurk effect occurs when a person hears a phoneme (such as /ba-ba/) but sees a face seemingly voicing another phoneme (such as /ga-ga/). The listener will report hearing the sound the mouth movements suggest (/ga-ga/), even though the acoustic signal has not changed (Goldstein, 2007). This may appear to support the motor theory of speech perception.
This theory suggests that the way a phoneme is articulated by the mouth and the way it is perceived have more in common than the way the phoneme is represented as an acoustic signal and the way it is perceived. One way to appreciate the complex nature of speech perception is to consider computer speech recognition programs: how they mimic human ability and where they fall short. While...
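The transitional probabilities mentioned earlier can be estimated by simple counting: for each adjacent pair of sounds, divide how often the pair occurs by how often its first sound occurs. The sketch below is illustrative only; the syllable stream and function name are invented for this example and are not from the text. It mirrors the statistical-learning idea that within-word syllable pairs (e.g. "pre"-"ty") are more predictable than pairs spanning a word boundary.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair."""
    # Count each adjacent pair (a, b) in the stream.
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count how often each syllable appears as the first element of a pair.
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }

# Hypothetical continuous stream: "pretty baby pretty doggy baby doggy"
# spoken without pauses, broken into syllables.
stream = ["pre", "ty", "ba", "by", "pre", "ty", "do", "gy",
          "ba", "by", "do", "gy"]
tps = transitional_probabilities(stream)

# Within-word pairs are fully predictable in this toy stream...
print(tps[("pre", "ty")])  # 1.0
print(tps[("ba", "by")])   # 1.0
# ...while a pair crossing a word boundary is not.
print(tps[("ty", "ba")])   # 0.5
```

An infant-like learner (in this simplified picture) could posit word boundaries wherever the transitional probability dips, segmenting the continuous stream without any pauses in the signal.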