Speech Processing Strategies
Introduction
The speech processing strategy is the detailed way in which the information in sound is converted into electrical stimulation at the implant. All modern multi-channel implants work on the basis that the sound is split into different channels depending on its pitch or frequency, and the sound levels in these channels are converted into electrical levels that are sent to the internal implant. However, there is more than one way to accomplish this. The details of how this process is carried out form the speech processing strategy. Different implant manufacturers offer different types of speech processing strategy in their implant systems.
Speech processing strategies are also referred to as "speech coding strategies", or simply "strategies".
Background
The speech processor component of a cochlear implant system carries out the task of converting sound signals into electrical signals that are transmitted to the internal implant itself. Sound signals such as speech can be described at an instant in time by two characteristics. The first is the range of pitches or frequencies the sound contains. The second is the loudness, or level, of each of these. For example, a vowel sound such as /a/ in 'bar' contains mainly lower pitches at a moderate level, and a consonant such as /s/ in 'less' contains mainly higher pitches at a slightly lower level. This distribution changes from instant to instant in everyday speech. How the pitches and levels vary over time transmits the information in the speech (or other sound) signal.
The graph below represents a person saying "grass". The x-axis is time, the y-axis is the pitch or frequency; the higher on the axis, the higher the pitch. Level or loudness is represented by colour; bright green represents a relatively high level, light blue a moderate level, and dark blue quiet.
The different frequencies, or pitches, in the sound signal are sent to different channels of the implant, and the loudness is transmitted by making the electrical signal larger for louder sounds, and smaller for quieter sounds. (Refer to the rehab section for more information.) Therefore the pattern of electrical stimulation is a representation of the speech signal. The user's brain interprets the electrical signal as a sound sensation.
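The idea of splitting a sound into frequency channels and measuring the level in each can be sketched in a few lines of Python. This is only an illustration, not the method used by any manufacturer: the function names, the band edges and the 8000 Hz sample rate are all assumptions, and a real processor would use an efficient filter bank rather than the slow transform shown here.

```python
import math

def power_spectrum(samples):
    """Naive discrete Fourier transform: power in each frequency bin up to Nyquist."""
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        powers.append((re * re + im * im) / (n * n))
    return powers

def channel_levels_db(samples, sample_rate, band_edges_hz):
    """Sum the power falling inside each band and express it in (uncalibrated) dB."""
    n = len(samples)
    powers = power_spectrum(samples)
    levels = []
    for low, high in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        total = sum(p for k, p in enumerate(powers) if low <= k * sample_rate / n < high)
        # The small constant avoids taking the logarithm of zero for silent bands.
        levels.append(10 * math.log10(total + 1e-12))
    return levels

# Hypothetical example: a 500 Hz tone analysed into three illustrative channels.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * i / sample_rate) for i in range(256)]
edges = [250, 1000, 2000, 4000]  # channel boundaries in Hz (assumed values)
levels = channel_levels_db(tone, sample_rate, edges)
for (low, high), level in zip(zip(edges[:-1], edges[1:]), levels):
    print(f"{low}-{high} Hz: {level:.1f} dB")
```

Only the lowest channel, which contains the 500 Hz tone, shows a substantial level; the other two stay at the silence floor. Repeating this analysis on successive short stretches of sound gives the changing pattern of levels that the strategy then has to convert into stimulation.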
Why a speech processing strategy is required
The ear with an implant has a number of technical disadvantages compared to the ear with functioning hearing. This is why cochlear implants cannot restore "normal" hearing. (However, despite these disadvantages, in the great majority of cases they vastly improve access to sound and speech when compared to a severe/profound hearing loss.) Probably the main reason for these disadvantages is that the ear evolved to respond to sound waves rather than electrical stimulation. One of the goals of a speech processing strategy is to overcome these limitations as much as possible.
Also, speech is a very information-rich signal. The speech processing strategy attempts to condense this information as required to convert it to a form suitable for transmission as an electrical signal by the internal implant, at the same time preserving the important information.
Finally, speech processors are basically mini-computers, and as such are limited in terms of how much information they can process in a given time. A further goal of the strategy is to use this limited power as efficiently as possible.
Dynamic Range
The functioning ear can respond to a range of sounds from extremely quiet (~0 to 20 dB HL) to the level of discomfort (~100-120 dB HL). This gives a dynamic range of roughly 100 dB. For someone using a cochlear implant, the equivalent range for the electrical stimulation (from a signal producing an extremely quiet sensation to one producing a very loud sensation) is typically 10-20 dB. To cope with this, speech processors use compression and/or limiting of the signal picked up at the microphone. Limiting cuts out quieter sound levels considered not useful, and "resets" louder sounds to a constant level. Compression "squashes" the signal into a smaller range. For example, compression at a ratio of 3:1 means that an increase of 3 dB at the input produces an increase of only 1 dB at the output. (See the figure below.)
Strictly speaking, the compression/limiting described above is not part of the actual speech processing strategy, but it does vary between manufacturers. Therefore different types are associated with different speech processing strategies.
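The combined effect of limiting and compression can be shown with a small Python sketch. The function name and every figure in it (the 25 dB floor, the 40 dB point where compression starts, the 60 dB ceiling) are illustrative assumptions, not values taken from any real speech processor; only the 3:1 ratio comes from the example above.

```python
def limit_and_compress(input_db, floor_db=25.0, knee_db=40.0, ratio=3.0, ceiling_db=60.0):
    """Return the output level in dB for a given input level in dB.

    All default values are hypothetical and chosen only for illustration.
    """
    if input_db < floor_db:
        # Limiting at the quiet end: very quiet sounds are discarded.
        return None
    if input_db <= knee_db:
        # Below the knee the level passes through unchanged.
        output_db = input_db
    else:
        # Above the knee every `ratio` dB of input adds only 1 dB of output.
        output_db = knee_db + (input_db - knee_db) / ratio
    # Limiting at the loud end: louder sounds are "reset" to a constant level.
    return min(output_db, ceiling_db)

for level in (20, 30, 40, 43, 70, 100, 120):
    print(level, "dB in ->", limit_and_compress(level), "dB out")
```

With these assumed settings, raising the input from 40 dB to 43 dB raises the output by only 1 dB, and any input of 100 dB or more comes out at the same 60 dB ceiling.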
The levels produced after this compression/limiting are then translated into electrical levels. This is achieved with a mathematical function that is part of the speech processing strategy. With some strategies this can be altered by the audiologist at programming to achieve the preferred sound quality / performance for the user.
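As a rough illustration of such a mathematical function, the Python sketch below maps a sound level onto an electrical level lying between a quietest and a loudest allowed value. Everything here is an assumption made for the example: the names `t_level` and `c_level` (standing for the electrical levels giving a just-audible and a comfortably loud sensation), the arbitrary units, and the `shape` parameter standing in for whatever adjustment a given strategy allows at programming. Real systems use manufacturer-specific functions.

```python
def to_electrical_level(level_db, in_min_db, in_max_db, t_level, c_level, shape=1.0):
    """Map a sound level in dB onto an electrical level between t_level and c_level.

    Hypothetical mapping for illustration only; units are arbitrary.
    """
    # Position of the input within its range, clamped to 0..1.
    position = (level_db - in_min_db) / (in_max_db - in_min_db)
    position = min(1.0, max(0.0, position))
    # shape = 1 gives a straight line; other values bend the curve, so that
    # mid-level sounds are mapped to higher or lower electrical levels.
    return t_level + (c_level - t_level) * position ** shape

# Assumed example: inputs from 25 to 60 dB mapped onto electrical levels 100 to 200.
for shape in (0.5, 1.0, 2.0):
    mid = to_electrical_level(42.5, 25.0, 60.0, 100.0, 200.0, shape)
    print(f"shape {shape}: mid-level sound -> electrical level {mid:.1f}")
```

The end points stay fixed whatever the value of `shape`; only the levels in between move, which is one way a change to the function could alter the sound quality without changing the quietest and loudest stimulation.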

Frequency Resolving Power
A cochlear implant splits sound signals into a number of different channels on the basis of frequency. The functioning auditory system performs a similar task, but acts as if there are a vast number of overlapping channels for frequency. It appears that in practice, the listener can automatically choose which channel(s) to listen with to give him/her the best signal.
For someone with normal hearing, the ear is generally able to distinguish sounds when they differ in frequency by roughly 10 to 17%. With an implant system the separation of the frequency channels can be close to this figure, or a little wider depending on the programming, and the number of channels used. However, it is important to remember that this is not comparing like for like: the functioning ear has a vast number of overlapping channels, whereas the cochlear implant has a limited number of distinct channels. Overall, it is accepted that the frequency resolving power of the functioning ear is greater than that of the cochlear implant user.
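How the channel separation depends on the number of channels can be estimated with a short calculation. The Python sketch below assumes, purely for illustration, that the channels cover 200 Hz to 8000 Hz and that neighbouring channels are spaced in equal ratios; neither assumption is taken from a particular implant system, and the function name is hypothetical.

```python
def adjacent_channel_spacing_percent(low_hz, high_hz, num_channels):
    """Percentage step between neighbouring centre frequencies when the
    channels are spaced in equal ratios from low_hz to high_hz."""
    ratio = (high_hz / low_hz) ** (1.0 / (num_channels - 1))
    return (ratio - 1.0) * 100.0

# Assumed frequency range of 200 Hz to 8000 Hz, for illustration only.
for channels in (8, 12, 22):
    spacing = adjacent_channel_spacing_percent(200.0, 8000.0, channels)
    print(f"{channels} channels: about {spacing:.0f}% between neighbours")
```

Under these assumptions 22 channels give a separation of roughly 19%, and using fewer channels over the same range makes the separation wider.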
A further complication with implants is that the electrical signals from channels close together in the implant may interfere with each other under certain conditions. In practice this may mean that a loud signal on one channel "swamps" a quieter signal on adjacent channels.
Cochlear implants attempt to overcome this undesirable interaction by several methods:
- Separate the stimulation in time
- Physically separate the channels as widely as possible
- Use an electrode array designed to limit the interactions
- Use a combination of the above
Which method(s) are used depends on the speech processing strategy, and the manufacturer.
Separating the channels in time is achieved by stimulating each channel in turn. The processor splits the signal over time into many "cycles" per second. During each one of these cycles, the channels are active in turn. For example, in a cycle the most deeply inserted channel may be activated, then the next deepest, and so on until all channels have been activated, at which point a new cycle begins. Various patterns of stimulation over each cycle are possible. As long as there are enough cycles per second, important information in the speech signal will not be lost. A strategy using this method is termed pulsatile. Where interactions between channels are considered not to be important or significant, channels can be active simultaneously. Such a strategy is termed simultaneous. It is also possible to use a combination of these techniques. For example, half the channels could be stimulated simultaneously, the others in a pulsatile manner.
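The timing of a pulsatile strategy can be pictured as a simple schedule in which only one channel is active at any moment. The Python sketch below builds such a schedule; the function name, the numbering of channels from 0 (taken here to be the most deeply inserted) and the example figures are assumptions for illustration, and real systems may use other orders within a cycle.

```python
def pulsatile_schedule(num_channels, cycles_per_second, num_cycles):
    """Return (time_in_seconds, channel) pairs with one channel active at a time.

    Channel 0 is assumed to be the most deeply inserted channel.
    """
    cycle_duration = 1.0 / cycles_per_second
    slot_duration = cycle_duration / num_channels  # time allotted to each channel
    schedule = []
    for cycle in range(num_cycles):
        for position in range(num_channels):
            start_time = cycle * cycle_duration + position * slot_duration
            schedule.append((start_time, position))
    return schedule

# Assumed example: 4 channels, 1000 cycles per second, first 2 cycles.
for start_time, channel in pulsatile_schedule(4, 1000, 2):
    print(f"{start_time * 1000:.2f} ms: channel {channel}")
```

Each channel gets its own time slot within the cycle, so no two channels are stimulated at the same instant. A simultaneous strategy would instead give every channel the same start time within a cycle.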
The effect of the limited frequency resolving power is illustrated by the visual analogy below. The picture on the left is the original, the one on the right has had some of the frequency information altered. Both are easily recognisable as a cat, but some of the fine detail is lost on the right.

Finally, it is interesting to note that there is evidence showing that people with even just mild or moderate levels of hearing impairment very often also have a limited frequency resolving power compared to someone with normal hearing.
Computing power
The information in speech (or other sounds) is transmitted in two ways. One is which pitches or frequencies are present, and the level they are at. This is termed spectral information (from spectrum). The other is how this pattern changes over time. This is termed temporal information. With limited computing power, some compromise between these types of information has to be made. This leads to temporal speech processing strategies - those that concentrate mainly on the timing information, and spectral speech processing strategies - those that concentrate mainly on the frequency information.
Temporal strategies traditionally use fewer channels, to which the whole sound signal is sent at a high rate. For example, an implant could have 6 channels that are continuously active.
Spectral strategies use a decision process to select, at any given time, several channels to stimulate from the total available. This is usually on the basis of which channels have the highest level. For example, the implant may have 12 channels available, but in each cycle only the 6 with the highest levels are active.
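The selection step in the 12-channel example can be sketched in Python as follows. This is a minimal illustration of picking the channels with the highest levels, not any manufacturer's actual decision process; the function name and the level values are made up for the example.

```python
def select_channels(levels, num_to_select):
    """Return the indices of the channels with the highest levels, in channel order."""
    ranked = sorted(range(len(levels)), key=lambda channel: levels[channel], reverse=True)
    return sorted(ranked[:num_to_select])

# Hypothetical levels (in dB) for 12 channels during one cycle.
levels_this_cycle = [12, 40, 35, 8, 50, 22, 47, 5, 30, 44, 18, 27]
active = select_channels(levels_this_cycle, 6)
print("Channels stimulated in this cycle:", active)
```

In the next cycle the levels will have changed, so a different set of 6 channels may be chosen.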
There is always a trade-off that has to be made between the spectral (frequency) and temporal (timing) information, no matter how powerful the speech processor computer is. Some strategies attempt to provide more or less equal spectral and temporal information. These could be called spectro-temporal strategies.
Types of processing strategy
Modern strategies fall into the categories shown below.
![Categories of modern speech processing strategies]()
Other strategies
In the earlier days of cochlear implantation some people thought that due to the limitations of the implanted ear, it would not be able to cope with a speech processing strategy that attempted to transmit as much of the speech signal as possible (like those described above). There are some features in speech sounds that change over time in a more or less predictable way. An example is the change in pitches in vowel sounds, which vary depending on which consonant sound follows the vowel. It was thought that by using the processor's computer to extract these features, and transmit just them to the implant channels, better results would be achieved. These were called feature extraction strategies. In general the evidence shows that these gave poorer results than strategies as described above. It seems to be preferable to give the brain as much sound information, as efficiently as possible, and let it do the job of interpreting this.
Summary
- The speech processing strategy is used to overcome the technical limitations of a cochlear implant system, and transmit the information in sound as efficiently as possible to the user.
- There are a number of different speech processing strategies, reflecting the different ways this task can be accomplished, and the fact that each implant manufacturer develops the strategies that it feels work best with its particular implant system.
- The speech processing strategy available to the user depends on which implant system (i.e. manufacturer) he/she uses.
- All the modern speech processing strategies have been shown to be capable of producing very good results.
Further information
More information can be found on the manufacturers' websites. Alternatively, ask the audiologist at your local cochlear implant centre.