
Musical Interpretation through Cochlear Implants and Audio Processing Techniques

Mason Bretan


A cochlear implant is an implanted prosthesis that substitutes for a malfunctioning or defective ear. It compensates for the loss of auditory sensory function by electrically stimulating the auditory nerves. The implants were initially designed for language perception. The first cochlear implants provided little more than a vague awareness of environmental sounds and some auditory cues to assist in visual speech reading. The technology and effectiveness of the prosthesis have advanced rapidly, and cochlear implant recipients can now understand speech effectively using the device by itself. Over the past few years, growing efforts have been made to improve musical perception. This paper reports both the similarities and differences between music and spoken language to demonstrate why recent cochlear implant users are able to perceive and understand basic language, yet find other listening conditions, such as the ambient sounds of music, quite difficult to distinguish. It then describes signal processing methods to improve sound quality for users: specifically, means of establishing a higher signal-to-noise ratio (SNR) and adjusting the pattern and rate of electrical stimulation.


The Perception of Music of Cochlear Implant Recipients

The use and acceptance of cochlear implants (CIs) have been steadily rising since the FDA approved them in 1984. Though accurate speech recognition and language perception have been achieved among implant users, their perception and general enjoyment of music have been unsatisfactory. Users report that music sounds unnatural and unrecognizable. Studying how CI users perceive music has revealed many fundamental differences between spoken language and music, and in how music is interpreted on a psychoacoustic level.

Speech and music share many of the same qualities and characteristics. Both relay a message; however, speech is more concrete and definite, whereas music is abstract and conceptual. The interpretation of a musical message is determined by many factors, including emotion, sensation, musical background and training, cultural background, and general passion and liking for the sound. Speech and music both consist of sounds of varying frequency, timbre, attack, rhythm, and duration, and these are the attributes most often used to convey the message (whether it is abstract or concrete). Though the character of music naturally makes the interpretation of its message much more subjective than that of spoken language, studies of CI users demonstrate that they hear different (sometimes unpleasant) sounds than normal-hearing listeners do on a physical basis, and not merely on a psychological, interpretive level. Yet this complaint does not arise for speech recognition and audibility among CI users. This means that when a listener processes music (as opposed to speech), a cochlear implant stimulating nerves through electrical impulses produces a different effect than nerves stimulated through natural acoustics. This difference is one reason why pitch and music sound different to CI users than to those with normal hearing. To better understand why, the basic functionality of a cochlear implant and the differences between music and speech in rhythm, frequency, timbre, and psychological interpretation need to be described.

Unlike conventional hearing aids, which amplify sound, a cochlear implant acts as a bionic ear by processing sound and transmitting it as electrical current into the cochlea, where it stimulates the auditory nerves and is in turn immediately interpreted as sound by the brain. One reason music is difficult for cochlear implant recipients to perceive is the functionality of the device itself. A speech processor filters sound to make audible speech more apparent and to lessen the loudness of ambient sounds. When ambient sounds such as music are played, the speech processor filters out much of the original and essential sound of the music. This makes it difficult for the user to distinguish different instruments and perceive the message of the music.


Fig. 1 Components of a cochlear implant device
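One way to picture the frequency coding performed by a processor is as a small bank of band-pass channels, each feeding one electrode. The sketch below is only an illustration of that idea: the channel count (8), the 200 Hz to 8 kHz range, the logarithmic spacing, and the function names are assumptions made for this example and are not taken from any specific device or from the studies cited here.

```python
import bisect


def band_edges(low_hz, high_hz, n_channels):
    """Return n_channels + 1 logarithmically spaced band edges in Hz."""
    ratio = high_hz / low_hz
    return [low_hz * ratio ** (i / n_channels) for i in range(n_channels + 1)]


def channel_index(freq_hz, edges):
    """Return the index of the band containing freq_hz, or None if out of range."""
    if freq_hz < edges[0] or freq_hz >= edges[-1]:
        return None
    return bisect.bisect_right(edges, freq_hz) - 1


# Hypothetical processor: 8 channels covering 200 Hz to 8 kHz (illustrative only).
EDGES = band_edges(200.0, 8000.0, 8)

if __name__ == "__main__":
    # Neighbouring notes can fall into the same channel under this coarse mapping.
    for name, freq in [("A4", 440.0), ("A#4", 466.16), ("C5", 523.25)]:
        print(name, freq, "Hz -> channel", channel_index(freq, EDGES))
```

Under these assumed band edges, A4 and A#4 land in the same channel while C5 lands in the next one, which gives a rough sense of how much spectral detail a coarse channel mapping can discard.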

Studies have shown that rhythmic information is the most successfully and readily perceived element of music among CI users. On average, they perceive it about as well as those with normal hearing. In fact, rhythm is the most crucial element of music that implant recipients use to determine musical message and recognize songs. According to one study of seventy-nine CI patients, listeners were 66% more likely to correctly identify a melody that had a familiar and memorable rhythmic line than a melody consisting of a sequence of notes of equal duration (Gfeller, Turner, and Gantz). The nature of rhythm and its place in time make it more easily distinguishable. Donnelly and Limb wrote, “Rhythm generally describes the temporal features of music that typically occur on the order of seconds, as opposed to the fine scale temporal features that occur on the order of milliseconds that are crucial in the perception of pitch and timbre.” Rhythmic patterns are important to cochlear implant recipients because certain patterns can characterize a musical passage well enough to allow accurate song recognition despite poor perception of other musical aspects such as pitch and timbre. Another study further indicated that CI listeners depend on rhythmic cues over pitch cues. Listeners were played two versions of familiar tunes. The first version had both the original rhythmic and melodic information present, while the second version removed rhythmic cues by equalizing the length of each note. It came as no surprise that normal-hearing listeners were able to recognize the song under both conditions; CI users, however, were only able to identify the melody when the rhythmic information was available.
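The rhythm-removal condition described above can be sketched in a few lines. This is a minimal illustration, assuming a melody is represented as a list of (MIDI pitch, duration in beats) pairs and that every note is set to the mean duration so the phrase keeps its overall length. The representation, the choice of the mean, and the example phrase are all hypothetical and are not details reported by the study.

```python
def equalize_durations(melody):
    """Return the melody with every note set to the mean duration.

    melody is a list of (midi_pitch, duration_in_beats) pairs. The order of
    pitches is kept, so only the rhythmic cue is removed.
    """
    if not melody:
        return []
    mean_duration = sum(duration for _, duration in melody) / len(melody)
    return [(pitch, mean_duration) for pitch, _ in melody]


# Hypothetical four-note phrase with a long-short-short-long rhythm.
PHRASE = [(60, 1.5), (62, 0.5), (64, 0.5), (60, 1.5)]
FLATTENED = equalize_durations(PHRASE)

if __name__ == "__main__":
    print("original: ", PHRASE)
    print("flattened:", FLATTENED)
```

After flattening, the pitch sequence is unchanged but every note lasts the same time, so a listener who relies mainly on rhythm has lost the cue that distinguished the tune.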

CI users can process speech accurately but are unsuccessful with music that relies on a critical melodic pattern. In other words, CI users have difficulty determining pitch. Melody perception depends on distinguishing changes in pitch. This includes the direction of a pitch change (higher or lower) and the interval by which the pitch moved. One study tested the ability of CI users to distinguish pitch changes played both as a single-frequency sine wave and as a note on the piano. The listeners were asked whether they noticed any change at all and then what sort of change occurred (whether the frequency went up or down). The results showed that the just noticeable difference (JND) for detecting a general change was much better for piano than for the sine wave. However, the JND for identifying the direction of the frequency change was significantly worse for piano (Fearn). Figure 2 shows the frequency content of both speech and music. The difference in the distribution of spectral energy between the two is quite noticeable. In speech, the spectral energy is spread over many frequencies and their respective harmonic partials, whereas the spectral energy of a musical note is concentrated in the fundamental and its harmonic partials. The results of the study, together with the difference in spectral energy between speech and music, help to explain why CI users complain that music sounds so poor. They can perceive a general change in musical pitch with reasonable accuracy, but have substantial difficulty determining both the direction of the change and the size of the interval. As a result, music sounds confusing, unnatural, and unpleasant.


Fig. 2 Frequency Spectrum of Different Sounds
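The two quantities a listener must judge for melody, the direction of a pitch change and the size of the interval, can be made concrete with a short calculation. The sketch below assumes equal-tempered tuning, where an interval in semitones is 12 times the base-2 logarithm of the frequency ratio. The function name and the example frequencies are illustrative and do not come from the study's test materials.

```python
import math


def describe_pitch_change(f1_hz, f2_hz):
    """Return (direction, interval_in_semitones) for a change from f1_hz to f2_hz.

    Assumes equal temperament: 12 semitones per doubling of frequency.
    """
    semitones = 12.0 * math.log2(f2_hz / f1_hz)
    if semitones > 0:
        direction = "up"
    elif semitones < 0:
        direction = "down"
    else:
        direction = "same"
    return direction, abs(semitones)


if __name__ == "__main__":
    # An octave up, an octave down, and roughly one semitone up from A4.
    for f1, f2 in [(440.0, 880.0), (440.0, 220.0), (440.0, 466.16)]:
        print(f1, "->", f2, describe_pitch_change(f1, f2))
```

Detecting that the two frequencies differ at all corresponds to the "general change" task, while reporting the first element of the result corresponds to the harder direction task described in the study.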

The perception of timbre further reinforces these findings on pitch and rhythm perception among CI users. When different instruments play the same pitch at the same volume, a psychoacoustic ability (derived from the ratios of the harmonics to the fundamental and from the instrument’s envelope) allows those with normal hearing to differentiate between them. A CI user does not possess this same ability, because a cochlear implant codes and filters for frequency and not for tone color. Differentiating between two instruments with long sustain, as opposed to percussive instruments, would be nearly impossible. If a trumpet and a clarinet held the same note at the same volume, a CI user would have extreme difficulty telling them apart: the implant receives the same input frequencies, so the electrical impulses stimulate the same nerves. However, a CI user would be able to tell the difference between a percussive instrument such as a piano and a non-percussive clarinet if both instruments held a note (assuming the pianist was not holding down the sustain pedal), because the distinctive attack of the piano is an essential temporal trait that can be used to tell the two instruments apart. Figure 3 compares a single note played by a piano and by a clarinet. The acoustic waveform on the left shows that the amplitude of the piano’s initial attack is much larger than that of the rest of the sustained note. The amplitude of the clarinet, on the other hand, is fairly uniform.


Fig. 3 Waveform of Piano and Clarinet
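The temporal cue that separates a percussive note from a sustained one can be summarized as a single number: the peak amplitude during the attack divided by the average amplitude of the remainder of the note. The sketch below uses two synthetic amplitude envelopes, an exponential decay standing in for a piano-like note and a short rise followed by a steady level standing in for a clarinet-like note. The envelope shapes, the 10% attack window, and all names are assumptions for illustration and are not measured instrument data.

```python
import math


def attack_to_sustain_ratio(envelope, attack_fraction=0.1):
    """Peak amplitude in the opening portion divided by mean amplitude of the rest."""
    split = max(1, int(len(envelope) * attack_fraction))
    attack, sustain = envelope[:split], envelope[split:]
    return max(attack) / (sum(sustain) / len(sustain))


N = 1000  # samples across a one-second note (hypothetical)
TIMES = [i / N for i in range(N)]

# Synthetic envelopes, not recordings of real instruments.
PERCUSSIVE = [math.exp(-5.0 * t) for t in TIMES]       # sharp attack, then decay
SUSTAINED = [0.8 * min(1.0, t / 0.05) for t in TIMES]  # short rise, then steady

PERCUSSIVE_RATIO = attack_to_sustain_ratio(PERCUSSIVE)
SUSTAINED_RATIO = attack_to_sustain_ratio(SUSTAINED)

if __name__ == "__main__":
    print("percussive ratio:", round(PERCUSSIVE_RATIO, 2))
    print("sustained ratio: ", round(SUSTAINED_RATIO, 2))
```

A ratio well above 1 indicates a prominent attack, while a ratio near 1 indicates a fairly uniform amplitude. A difference of this kind lives in the envelope rather than in the spectrum, which is why it can survive frequency-based coding.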


Musical perception remains a challenge for cochlear implant recipients. Though a growing body of research aims to help these patients develop a better sensitivity to music and abstract sound, many limitations are still imposed on them. The implants were designed for language perception, and that goal has been achieved; a new goal in design and processing strategy is to make it possible to perceive and enjoy music through a cochlear implant. Perceiving music encompasses much more than interpreting particular frequencies as pleasant and abstract sound. It includes absorbing each of its characteristics, such as melody, rhythm, harmony, repetition, intonation, and many more, and interpreting these features not individually but as an entire unit that conveys a message. For current cochlear implant recipients this is impossible, yet perhaps their experience suggests that music can be perceived in a new manner, one in which musicality still exists but a fundamental and familiar key to the perception of music is different.

List of Resources

Townshend B, Cotter N, Compernolle D. V., White R. L. “Pitch perception by cochlear implant subjects.” Journal of the Acoustical Society of America

Gfeller K, Turner C, Gantz B. Accuracy of Cochlear Implant Recipients of Pitch Perception, Melody Recognition, and Speech Reception in Noise. Ear and Hearing 2007; 28: 412-423

Kong YY, Cruz R, Jones JA. Music perception with temporal cues in acoustic and electric hearing

Donnelly PJ, Limb CJ. Music Perception in Cochlear Implant Users

Fearn RA. Music and Pitch Perception of Cochlear Implant Recipients

Viemeister, N.F., Stellmack, M.A., and Byrne, A.J. (2005). The role of temporal structure in envelope processing. In Pressnitzer, D., de Cheveigne, A., McAdams, S., and Collet, L. (Eds.) Auditory Signal Processing: Physiology, Psychoacoustics, and Models. Springer-Verlag, New York, 221-229.

Bacon, S.P. and Viemeister, N.F. (1985). Temporal modulation transfer functions in normal-hearing and hearing-impaired listeners. Audiology, 24, 117-134.

Viemeister, N.F., Rickert, M., Law, M., and Stellmack, M.A. (2002). Psychophysical and physiological aspects of auditory temporal processing. In Tranebjaerg, L., Christen-Dalsgaard, J., Andersen, T., and Poulsen, T. (Eds.) Genetics and the Function of the Auditory System. Holmens Trykkeri, Denmark, 273-291.
