Welcome to AuSIM; leaders in 3D sound technology
Primers
Auditory Perception - Localization
The study of spatial auditory perception involves three distinct domains: physical, perceptual, and neurological. The acoustic signal radiated by a sound source carries spatial information that encodes the location of the source as well as the space it occupies. Our perceptual mechanism captures this acoustical information and extracts important cues about the source and its environment. When the signal reaches the neural processing stage, the source information and its directional information are combined. The auditory neural system extracts the directional information and creates a representation of the location and space where the sound originated.
Localization Cues
Spatial hearing refers to the ability of human listeners to judge the direction and distance of environmental sound sources.  To determine the direction of a sound, the auditory system relies on various physical cues.  Sound waves emanating from a source travel in all directions away from the source.  Some waves travel to the listener using the most direct path (direct sound) while others reflect off walls and objects before reaching the listener’s ears (indirect sound).  The direct sound carries information about the location of the source relative to the listener.  Indirect sound informs the listener about the space, and the relation of the source location to that space. 

Interaural Differences
John Strutt, better known as Lord Rayleigh, developed the Duplex Theory of Sound Localization, which identifies the two primary cues used in sound localization. Because the ears are spatially separated and the mass of the head lies between them, each ear receives a different version of the arriving sound. The ear closer to the source (ipsilateral ear) receives the sound earlier and at a greater intensity, or level, than the ear farther from the source (contralateral ear). The differences in time of arrival and in level are referred to as the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD), respectively.

The ITD can be understood by considering how a sound wave propagates through air. Since sound travels at a speed c of approximately 343 meters per second, the difference in time of arrival between the left and right ears, separated by a distance d = 2r, for a sound arriving from a direction specified by azimuth θ is given by:


where r is the head radius, c is the speed of sound, and θ is the direction of the sound source.
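As a rough numerical illustration of how r, c, and θ combine, the sketch below evaluates two textbook approximations: a straight-line path difference between two point ears a distance d = 2r apart, and Woodworth's rigid-sphere formula, (r/c)(θ + sin θ). Which of these, if either, is the form intended above is an assumption. The head radius of 0.0875 m and the convention that θ is measured from straight ahead are also assumptions made only for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, the value quoted in the text
HEAD_RADIUS = 0.0875    # m, assumed typical head radius (not from the text)


def itd_path_difference(theta, r=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Straight-line model: two point ears separated by d = 2r.

    theta is the azimuth in radians, measured from straight ahead.
    """
    return (2.0 * r / c) * math.sin(theta)


def itd_woodworth(theta, r=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Woodworth's rigid-sphere approximation, for 0 <= theta <= pi/2.

    The extra r * theta term accounts for the path that wraps
    around the sphere to reach the far ear.
    """
    return (r / c) * (theta + math.sin(theta))


for degrees in (0, 30, 60, 90):
    theta = math.radians(degrees)
    simple_us = itd_path_difference(theta) * 1e6
    sphere_us = itd_woodworth(theta) * 1e6
    print(f"{degrees:3d} deg: path difference {simple_us:6.1f} microseconds, "
          f"Woodworth {sphere_us:6.1f} microseconds")
```

Both models give zero ITD for a source straight ahead and their largest value for a source directly to one side; they differ in how quickly the ITD grows in between.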

Over a wide range of frequencies, ITD is approximately frequency independent, and ITD values are typically obtained by extracting the group delay between the left and right channel signals. If we consider signals that are identical as a function of time, with one signal reaching its corresponding ear earlier than the other, it is apparent that all frequencies in the signal are shifted in time by the same constant. In practice, however, diffraction around the head causes low frequencies to produce a larger ITD than high frequencies. Calculations done with a spherical head model or a binaural model produce more detailed data on frequency-dependent ITD values.
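The idea of a single time shift common to all frequencies can be sketched with a broadband delay estimate. The example below uses the peak of a brute-force cross-correlation as a simple stand-in for a group-delay measurement; it is not taken from the text. The sample rate, the test burst, and the function names are hypothetical choices made for illustration.

```python
import math

SAMPLE_RATE = 48_000  # Hz, assumed for illustration


def make_burst(length=256, start=96, width=64, freq=1000.0):
    """Build a Hann-windowed tone burst surrounded by silence."""
    signal = [0.0] * length
    for k in range(width):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / (width - 1))
        signal[start + k] = window * math.sin(2.0 * math.pi * freq * k / SAMPLE_RATE)
    return signal


def delay_signal(signal, samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    return [0.0] * samples + signal[:len(signal) - samples]


def estimate_delay(left, right, max_lag):
    """Return the lag in samples that maximizes the cross-correlation.

    A positive result means the right channel lags the left channel.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


left = make_burst()
right = delay_signal(left, 20)  # right ear hears the burst 20 samples later
lag_samples = estimate_delay(left, right, max_lag=40)
itd_seconds = lag_samples / SAMPLE_RATE
print(f"estimated lag: {lag_samples} samples, ITD = {itd_seconds * 1e6:.1f} microseconds")
```

Because the two channels here are exact delayed copies, one number describes the shift at every frequency. Measured ear signals would show the frequency dependence described above, which this sketch does not model.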

Research has shown that mechanisms used to decode ITDs are sensitive to the signal’s phase at frequencies below 1500 Hz.  Above 1500 Hz, where the wavelength is smaller than the diameter of the head, the detection of phase difference becomes indeterminate due to phase ambiguities and it is difficult for the auditory system to determine which is the leading wavefront.  However, research shows that when high frequency signals are modulated in amplitude, the auditory system uses the ITD envelope cue to extract timing information of the onsets of the amplitude envelopes. 
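The phase ambiguity can be illustrated with a little arithmetic. For a pure tone, an ITD appears to the listener as an interaural phase difference of 2πf·ITD, which is only observable modulo one full cycle. The sketch below wraps that phase and shows two different ITDs that become indistinguishable at a high frequency. The specific ITD values and the function name are assumptions chosen for the example.

```python
import math


def interaural_phase(freq_hz, itd_s):
    """Interaural phase difference in radians, wrapped to [-pi, pi)."""
    phase = 2.0 * math.pi * freq_hz * itd_s
    return math.fmod(phase + math.pi, 2.0 * math.pi) - math.pi


long_itd = 600e-6   # seconds, hypothetical source far to one side
short_itd = 100e-6  # seconds, hypothetical source near the midline

for freq in (500.0, 2000.0):
    a = math.degrees(interaural_phase(freq, long_itd))
    b = math.degrees(interaural_phase(freq, short_itd))
    print(f"{freq:6.0f} Hz: phase for 600 us = {a:6.1f} deg, "
          f"phase for 100 us = {b:6.1f} deg")
```

At 500 Hz the two ITDs give clearly different phases. At 2000 Hz the period is 500 microseconds, so the 600 microsecond delay wraps around and yields the same phase as the 100 microsecond delay; phase alone can no longer tell the two sources apart.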

The ILD depends strongly on frequency, decreasing in magnitude as the frequency is lowered. At frequencies above 1500 Hz, the head acts as a baffle between the ears because the head is larger than the wavelength. This creates an acoustic “shadow” at the contralateral ear, where the level can be as much as 35 dB lower. Below 1500 Hz, the longer wavelengths diffract around the head, which minimizes the ILD cue.
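To make the decibel figure concrete, the sketch below computes an ILD as the ratio of RMS levels at the two ears, expressed as 20·log10 of the ratio. The test tone, the sample rate, and the function names are hypothetical; the 35 dB attenuation is simply applied as a gain to mimic the largest head-shadow value mentioned above.

```python
import math

SAMPLE_RATE = 48_000  # Hz, assumed for illustration


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ild_db(ipsilateral, contralateral):
    """Interaural level difference in dB (positive when the near ear is louder)."""
    return 20.0 * math.log10(rms(ipsilateral) / rms(contralateral))


# A 4 kHz tone at the near ear, and the same tone attenuated by 35 dB
# at the far ear to imitate head shadowing.
tone = [math.sin(2.0 * math.pi * 4000.0 * n / SAMPLE_RATE) for n in range(480)]
shadow_gain = 10.0 ** (-35.0 / 20.0)
shadowed = [shadow_gain * s for s in tone]

print(f"amplitude ratio: {1.0 / shadow_gain:.1f}")
print(f"ILD: {ild_db(tone, shadowed):.1f} dB")
```

A 35 dB difference corresponds to an amplitude ratio of roughly 56 to 1, which gives a sense of how strong the shadowing can be at high frequencies.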


Figure 1 Cone of confusion

The Duplex Theory states that the ITD and ILD cues are complementary. Taken together, the two cues provide localization information across the audible frequency range. Nevertheless, localization by means of binaural interaction alone has an important intrinsic limitation. Although the ITD and ILD cues are good indicators of the location of sources along the interaural axis, they provide an insufficient basis for judging whether a sound is located above, below, in front, or behind. For sources located at an equal distance on a conical surface extending from the listener’s ear, the ITD and ILD cues are virtually identical, producing what is referred to as a “cone of confusion”. A cone of confusion is presented in Figure 1. In this case, a listener would have difficulty telling sources A and C apart because the ITD and ILD cues would be equivalent for the two sources, resulting in an up/down confusion. Similarly, sources B and D would produce equivalent interaural cues, leading to a front/back confusion.
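The geometry behind the cone of confusion can be sketched numerically. The example below places four sources on one cone around the interaural axis and evaluates a simple two-point ITD model for each. The coordinate convention (x forward, y along the interaural axis, z up), the head radius, the cone half-angle, and the position labels are all assumptions made for illustration and do not reproduce the exact layout of Figure 1.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, the value quoted in the text
HEAD_RADIUS = 0.0875    # m, assumed typical head radius


def direction_on_cone(half_angle_deg, rotation_deg):
    """Unit direction on a cone around the interaural (y) axis.

    rotation_deg = 0 is in front, 90 is above, 180 is behind, 270 is below.
    """
    a = math.radians(half_angle_deg)
    p = math.radians(rotation_deg)
    return (math.sin(a) * math.cos(p), math.cos(a), math.sin(a) * math.sin(p))


def itd_two_point(direction):
    """Two-point ear model: the ITD depends only on the component of the
    source direction along the interaural axis."""
    _, y, _ = direction
    return (2.0 * HEAD_RADIUS / SPEED_OF_SOUND) * y


positions = {"front": 0, "up": 90, "back": 180, "down": 270}
itds = {name: itd_two_point(direction_on_cone(60, rotation))
        for name, rotation in positions.items()}

for name, value in itds.items():
    print(f"{name:>5}: ITD = {value * 1e6:.1f} microseconds")
```

All four positions yield the same ITD because the model sees only the angle between the source and the interaural axis. This is the sense in which interaural cues alone cannot separate front from back or up from down.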

© AuSIM Inc. 1998-2011.