Mason Bretan

Technology, Art, and Science

Active Noise Canceling Algorithms,
Binaural Processing, Fourier Analysis, Microsecond Delays,
and a 3D Virtual Acoustic Space with 2 Speakers

Active Noise Cancellation
Noise cancellation is a method for canceling sound with sound.  When a pressure wave emitted from a source such as a speaker combines with another wave that is identical in amplitude but inverted in phase, a new waveform is created.  This wave is the result of destructive interference between the original two; it can be considerably quieter than either and, if the match is perfect, inaudible.
The research I did concerns developing these secondary canceling sources so that they are as effective as possible.  In an ideal acoustic environment, a destructive signal that is simply an inversion of the original can produce effective cancellation.  In the real world, however, the refraction, reflection, and filtering properties of the surrounding environment need to be taken into account for this method of noise reduction to be adequate.  I used signal processing to cancel the sound of a small house fan and to develop a stereo crosstalk cancellation system.
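As a minimal sketch of the basic inversion idea (idealized, with no room acoustics), the snippet below adds a phase-inverted copy to a sine wave and measures what is left. The 100 Hz tone, the 8 kHz sample rate, and the function names are illustrative assumptions, not values from the project.

```python
import math

def sine(freq_hz, fs, n, amplitude=1.0):
    """Return n samples of a sine wave sampled at fs Hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / fs) for t in range(n)]

def mix_with_inverse(noise, anti_gain=1.0):
    """Add a phase-inverted copy of `noise`, scaled by anti_gain, to the noise."""
    return [s + (-anti_gain * s) for s in noise]

def peak(signal):
    """Largest absolute sample value."""
    return max(abs(s) for s in signal)

hum = sine(100, 8000, 800)               # hypothetical 100 Hz hum, 0.1 s long
perfect = mix_with_inverse(hum)          # identical amplitude: nothing remains
mismatched = mix_with_inverse(hum, 0.9)  # anti-noise 10% too quiet

print(peak(perfect), peak(mismatched))
```

With a perfect match the residual is zero; with the canceling wave 10% too quiet, a tenth of the original amplitude remains (a reduction of 20 dB).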
 Canceling Fan Noise

Here, a microphone records the sound of the fan. It is important to understand the mechanics of the system you are canceling in order to get a high signal-to-noise ratio (SNR). I also used a unidirectional mic as another measure to reduce unwanted noise.  The sound produced by a fan does not propagate in the same direction as the air flow, but rather propagates outward, following the vector paths of the fan blades. Because of this, microphone placement is essential to getting a good signal. The audio files to the left demonstrate the mic placed in front of the fan (low SNR) and placed more effectively at the fan's side (high SNR). The center and most prominent frequencies are those produced by the blades' movement through the air, so that "hum" is the most important signal to process when canceling.

Here is a diagram of the signal processing flow for the fan audio signal.  The signal is split into three identical copies, each of which goes through a band-pass filter with a different Q value.  The first band-pass filter has the narrowest Q and the third has the widest.  The signals are then combined, attenuated to match the input amplitude, and delayed to adjust for latency before being output through a speaker as a destructive sound wave.  In this particular ANC system latency is easier to compensate for because the sound of a fan is cyclical.  Below, an example of the filter programming is shown.
Signal processing done with MaxMSP
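The original processing was built in Max/MSP. Purely as an illustration of the same flow in text form, here is a Python sketch: three parallel band-pass filters centered on the hum, summed, scaled, inverted, and delayed. The biquad design (the widely used "Audio EQ Cookbook" band-pass with unity peak gain), the Q values, and the names `anti_noise`, `bandpass_coeffs`, and `biquad` are all assumptions made for this example.

```python
import math

def bandpass_coeffs(f0, q, fs):
    """Biquad band-pass (unity gain at f0). Returns ((b0, b1, b2), (a1, a2))."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a

def biquad(x, b, a):
    """Run the samples in x through one direct-form I biquad section."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def anti_noise(x, f0, fs, qs=(30.0, 10.0, 3.0), gain=1.0, delay_samples=0):
    """Filter x in three parallel bands, average, invert, and delay."""
    branches = [biquad(x, *bandpass_coeffs(f0, q, fs)) for q in qs]
    mixed = [sum(s) / len(qs) for s in zip(*branches)]  # match input level
    inverted = [-gain * s for s in mixed]               # phase inversion
    return [0.0] * delay_samples + inverted[:len(inverted) - delay_samples]
```

For a steady tone at `f0`, the output settles to an inverted copy of the input once the filters have rung up, so the sum of input and output approaches zero. In a real system `delay_samples` would be tuned to the measured latency; because the hum is periodic, a delay that is off by whole periods still lines up.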
3 Speaker System
One speaker positioned as shown here creates a "lobe" of destructive space where the fan noise is reduced.  Using three speakers (one for each fan blade) positioned to match the geometry of the fan blades, it is possible to create more of these destructive "lobes" and acoustically lower the level of fan noise in a room.  The sound files demonstrate the difference in noise level in the room with and without the ANC system.

Crosstalk Cancellation
Crosstalk in a stereo system is the mixing of the signal from one speaker with the signal from the other at the listener's ears.  With crosstalk cancellation it is possible to reduce the level of that mixing so that the listener's right ear hears only what is emitted from the speaker on his or her right side and the left ear hears only what is emitted from the speaker on the left side.  This is done by having each speaker emit two signals: the original and a canceling signal.  The canceling signal should cancel what is coming out of the opposite speaker at the adjacent ear.  In this image the crosstalk signal is being canceled at the left ear.  The left speaker's canceling signal is delayed by the length of time (microseconds) it takes for the crosstalk from the right speaker to reach the left ear.  Though this method has a definite "sweet spot" where it is most effective, head tracking can be used to modify the delay times according to the listener's position relative to each speaker.
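To give a feel for the size of these delays, the sketch below estimates them from geometry alone. It assumes a simplified 2D layout with the ears as two points in free space (no head shadowing), a speed of sound of 343 m/s, and takes the canceling delay as the difference between the crosstalk path and the canceling speaker's own path to the same ear. The positions and function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (dry air at roughly 20 degrees C)

def travel_time_us(source_xy, ear_xy):
    """Time in microseconds for sound to travel from source to ear."""
    return math.dist(source_xy, ear_xy) / SPEED_OF_SOUND * 1e6

def canceling_delay_us(cancel_speaker, crosstalk_speaker, ear):
    """How much later the crosstalk arrives at `ear` than the near speaker's sound."""
    return travel_time_us(crosstalk_speaker, ear) - travel_time_us(cancel_speaker, ear)

# Hypothetical setup: speakers 2 m in front, 1 m to each side; ears 18 cm apart.
left_speaker, right_speaker = (-1.0, 2.0), (1.0, 2.0)
left_ear, right_ear = (-0.09, 0.0), (0.09, 0.0)

delay = canceling_delay_us(left_speaker, right_speaker, left_ear)
print(round(delay, 1), "microseconds")
```

For this layout the result is roughly 235 microseconds. Moving the listener changes both path lengths, which is why the delay has to be recomputed when head tracking reports a new position.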
 Using a set of impulse responses measured around the KEMAR head, convolution with those responses applies the head-related transfer functions (HRTFs).  These functions allow you to develop a more effective crosstalk canceling signal by modeling how a person's torso, head, and pinnae act as filters.  To the right is an image of my interface, which allows the user to set the speaker positions in relation to the listener.  This information determines how each canceling signal needs to be filtered in order to cancel the crosstalk sound at each ear.  Using the same method, it is possible to simulate a sound source's location around a person's head by controlling what goes to each ear.  Because crosstalk cancellation gives this control, it is possible to create 3D virtual sonic environments (surround sound) with only two speakers.
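As a small illustration of the convolution step, the sketch below filters a mono signal with a pair of head-related impulse responses, one per ear. The three-tap responses in the usage lines are made-up toy values, not KEMAR measurements (real responses are typically hundreds of samples long), and the function names are assumptions.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left-ear and a right-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy impulse responses (hypothetical): the right ear hears the source
# one sample later and quieter, as if the source were on the left.
left, right = render_binaural([1.0, 0.5], [0.8, 0.1, 0.0], [0.0, 0.4, 0.1])
```

Direct convolution like this is fine for short responses; for long measured responses an FFT-based method is the usual choice for speed.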
Frequency Analysis

Unlike other crosstalk canceling algorithms, mine uses an FFT to separate the filtered signal into three frequency ranges and applies an individual delay time to each range.  The image to the left shows some of the code that analyzes the signal, and the image to the right shows the output of three new signals, each with its own frequency cutoff range and delay time.  The delay times compensate for the difference in the time it takes sound to travel to each of the listener's ears.
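One way to picture the band split is the sketch below, which transforms a block of samples, applies a different delay to the bins of each frequency range as a phase shift, and transforms back. This is an assumption-laden simplification rather than the project's code: it processes a single block (so delays wrap around circularly), uses a slow textbook DFT to stay dependency-free, and the function names and band edges are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), for small blocks only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def split_and_delay(x, fs, bands):
    """bands: list of (low_hz, high_hz, delay_samples); low inclusive, high exclusive."""
    n = len(x)
    spectrum = dft(x)
    shifted = [0j] * n
    for k in range(n):
        signed_k = k if k <= n // 2 else k - n  # upper bins are negative frequencies
        freq = abs(signed_k) * fs / n
        for low, high, delay in bands:
            if low <= freq < high:
                # A delay of d samples is a phase shift of -2*pi*k*d/n at bin k.
                shifted[k] = spectrum[k] * cmath.exp(-2j * math.pi * signed_k * delay / n)
                break
    return idft(shifted)
```

For example, `split_and_delay(block, 8000, [(0, 700, 0), (700, 1500, 1), (1500, 4001, 2)])` leaves content below 700 Hz untouched and delays the two higher ranges by one and two samples. The last band's upper edge is set just above fs/2 so the Nyquist bin is included.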

 The image above and the image to the right show the code where phase inversion takes place and the three-part canceling signals are combined into a single waveform.  The two most essential parts of crosstalk cancellation are the HRTFs and the delay times.  The HRTFs manipulate five variables: azimuth, distance, elevation, frequency, and time delay.  Because the delay times are so minute, it must be possible to alter them by very small fractions of a second.  In the digital domain, delay lines are delayed sample by sample.  It is therefore possible to achieve more accurate and effective crosstalk cancellation by using a higher sample rate, such as 96 kHz as opposed to 44.1 kHz (though this can be very CPU intensive).
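The effect of sample rate on delay precision can be checked with a few lines of arithmetic. The sketch below assumes a plain integer-sample delay line (no fractional-delay interpolation); the 240 microsecond target and the function names are illustrative.

```python
def sample_period_us(fs):
    """Duration of one sample in microseconds."""
    return 1e6 / fs

def quantized_delay(delay_us, fs):
    """Nearest whole-sample delay and the timing error it introduces (microseconds)."""
    samples = round(delay_us * fs / 1e6)
    error_us = samples * 1e6 / fs - delay_us
    return samples, error_us

for fs in (44100, 96000):
    samples, error = quantized_delay(240.0, fs)
    print(fs, round(sample_period_us(fs), 2), samples, round(error, 2))
```

One sample lasts about 22.7 microseconds at 44.1 kHz and about 10.4 microseconds at 96 kHz, so the worst-case rounding error (half a sample) drops from roughly 11 to roughly 5 microseconds. For the 240 microsecond example the error is about 9.4 microseconds at 44.1 kHz and about 0.4 microseconds at 96 kHz.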
Here are two images from my interface.  The FFT frequency cutoffs, delay, and gain values are all variable and can be changed by the user.  When selecting the frequency cutoff values, be aware that higher frequencies are very difficult to cancel (especially in the digital domain) because of their small wavelengths.  In noise cancellation the two signals must be precisely time aligned for the effect to work; if they are not aligned, the noise may be amplified instead.  Here, it is necessary to cancel frequencies below about 2000 Hz because, given their long wavelengths, the human body does not filter them.  Higher frequencies, however, are naturally filtered by the human body and head and are therefore attenuated by the time they reach the ear.  HRTFs are supposed to simulate how people naturally hear and locate sound, but each person is different, with different pinnae and head shapes and therefore different filtering properties.  I am currently using a set of impulse responses taken with multiple different ear shapes and sizes.  Other sources online, such as the IRCAM Listen project, also have sets of impulse responses developed from multiple human subjects.  What may be an effective crosstalk cancellation system for one person may just sound like spectral shaping to another.  Therefore adjustments must be made for each person's binaural processing in order to attain maximum effectiveness.
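Why misalignment hurts high frequencies most can be shown with a short calculation. For a sine of frequency f and an inverted copy that arrives late by a time tau, the sum has amplitude 2·|sin(pi·f·tau)| relative to the original. The sketch below evaluates that expression in decibels; the one-sample error at 44.1 kHz used in the example is an assumed figure.

```python
import math

def residual_db(freq_hz, misalignment_us):
    """Level of (tone + late inverted copy) relative to the tone alone, in dB."""
    ratio = 2 * abs(math.sin(math.pi * freq_hz * misalignment_us * 1e-6))
    return -math.inf if ratio == 0 else 20 * math.log10(ratio)

one_sample_us = 1e6 / 44100  # timing error of one sample at 44.1 kHz
for freq in (200, 2000, 8000):
    print(freq, round(residual_db(freq, one_sample_us), 1))
```

With a one-sample error, a 200 Hz tone is still reduced by about 31 dB and a 2000 Hz tone by about 11 dB, but an 8000 Hz tone comes out slightly louder than with no cancellation at all. The break-even point is where f·tau equals 1/6, which is 7350 Hz for this error.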

The Virtual Room
3D Sound 

Reverb and echo effects can simulate the sound of a room.  Furthermore, a simulation of the Doppler effect can create a sense of motion and direction.  With crosstalk cancellation an entire virtual room can be developed that lets the listener feel immersed inside it.  Using HRTFs, 3D sound location can be simulated without a surround speaker setup.  Though it is very difficult to achieve a convincing virtual space with only two speakers, some sounds are more effective than others.  In the patch to the left, different sounds are created that work well with spatialization.  For example, the buzzing of a fly and the beating of a helicopter are both effective sounds and can be created by filtering noise, by using low-frequency sawtooth and square waves or high-speed transients, and by many other methods.
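For the Doppler part, the standard formula for a moving source and a stationary listener is enough to drive a simple pitch shift. The sketch below assumes a speed of sound of 343 m/s and a source moving directly toward or away from the listener; the function name and example values are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def doppler_frequency(source_hz, radial_speed_mps):
    """Frequency heard by a stationary listener.

    radial_speed_mps > 0 means the source approaches; < 0 means it recedes.
    Only valid for speeds below the speed of sound.
    """
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - radial_speed_mps)

approaching = doppler_frequency(440.0, 34.3)   # source closing at 10% of c
receding = doppler_frequency(440.0, -34.3)     # same source moving away
```

A 440 Hz source approaching at 34.3 m/s is heard at about 489 Hz and, once it has passed and is receding at the same speed, at about 400 Hz. Sweeping between the two as a virtual source flies by gives the familiar pitch drop.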


poster summary of my project