Ben Collins Media


Spatial audio

So this falls into the category of things I've actually done some work on.

We hear a lot about head tracking with respect to virtual reality headsets. Watch Johnny Lee hack a Wii IR remote for head-tracking here.

Why save all the cool shit for your eyes? Your ears deserve the same treatment. In the movie biz, they call audio the "heart" of a film. For real immersion, proper audio is invaluable, but occasionally overlooked. So where are we at with 3D audio? It certainly exists, but things aren't perfect by any means.

By the way, ever had a virtual haircut?

Most 3D audio is recorded using what's called a binaural head. This is a dummy head with a microphone embedded in each ear, so the pair works as a special stereo microphone.

"Fritz" does not appreciate being called a dummy.

In humans, audio localization is accomplished by several physical traits. The pinna (the flappy, ear-y part of the ear) changes the spectral content of audio coming from behind you, so you can tell when someone is sneaking up. That's why mutants are actually surprisingly easy to attack: 

"Gaahhh I suffer from auditory localization problems"

In addition, the distance between your ears allows side-to-side localization through ITD (interaural time differences). In other words, if someone is yelling at you from the right, that sound takes a bit longer to go around your head and reach your left ear, which is the "time difference" part of "interaural time difference."
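
To put a rough number on that, here's a minimal sketch using Woodworth's spherical-head approximation, ITD = (r/c)(θ + sin θ). The head radius and speed of sound are assumed typical values, not measurements of anyone's actual head, and the function name is just mine for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed "average" head radius, in meters
SPEED_OF_SOUND = 343.0    # m/s in air at roughly room temperature

def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source.

    azimuth_deg: 0 is straight ahead, +90 is directly to the right.
    Only meant for sources in the front half (-90 to +90 degrees).
    A positive result means the sound reaches the right ear first.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        print(az, "degrees:", round(itd_seconds(az) * 1e6), "microseconds")
```

With these assumed numbers, someone yelling from directly to your right reaches your left ear about 0.66 ms late. Tiny, but your brain picks up on it.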

The voice of this irritating person on your right will also have different spectral content by the time it reaches your left ear. The long trip around your big head shaves off some of the high frequencies, so the sound arrives a little quieter and duller on the far side. Together with the time difference, this is how humans know where sounds are coming from.

Also, the more I talk about "humans" the more I feel like an alien. Side note: reading the Wikipedia entry on "humans" is super alarming.

So humans (sorry, I can't stop now) came up with these things called HRTFs, or "head-related transfer functions." These are essentially mathematical maps of how your head and ears filter a sound depending on the direction it arrives from, and they are used to fake audio spatialization. If you ever want to play with this, I recommend downloading SoundHack!
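
If you're curious what "using an HRTF" looks like in practice, here's a minimal sketch: you convolve a mono signal with one impulse response per ear. The two impulse responses below are toy values I made up (a delay and a volume drop for the far ear), not measured HRTF data, and the function names are hypothetical. Real tools use measured responses that are hundreds of samples long, and usually FFT-based convolution for speed.

```python
def convolve(signal, kernel):
    """Plain direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def spatialize(mono, hrir_left, hrir_right):
    """Filter one mono signal through a left-ear and a right-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy impulse responses for a source on the right (made-up values):
# the right ear hears it immediately at full volume,
# the left ear hears it 3 samples later at half volume.
TOY_HRIR_RIGHT = [1.0, 0.0, 0.0, 0.0]
TOY_HRIR_LEFT = [0.0, 0.0, 0.0, 0.5]

if __name__ == "__main__":
    left, right = spatialize([1.0, 2.0, 3.0], TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
    print("left ear: ", left)
    print("right ear:", right)
```

Play the two outputs through headphones, one per ear, and the source should seem to sit off to the right. With measured responses instead of my toy ones, you also get the pinna's spectral coloring.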

It's not hard to see where this is going. Turn a Wii-mote backwards, put some IR LEDs on your headphones, and get tracking. If we can apply HRTFs in real time, we can simulate audio spatialization. If the computer knows which way your head is pointing, it can compensate: turn your head to the left, and it'll "pan" an audio source to the right. In other words, we now have a virtual audio world.
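
The compensation step is just a subtraction. Here's a minimal sketch, assuming the tracker hands you a head yaw in degrees (negative for a left turn, positive for a right turn) and each source has a fixed direction in the room. The function name and the sign convention are my assumptions for illustration, not how any particular tracker reports things.

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Direction of a source relative to where the head is currently pointing.

    Angles are in degrees, positive to the right.
    The result is wrapped into the range [-180, 180).
    """
    diff = source_azimuth_deg - head_yaw_deg
    return (diff + 180.0) % 360.0 - 180.0

if __name__ == "__main__":
    # A source straight ahead in the room, head turned 30 degrees to the left:
    # the source should now be rendered 30 degrees to the right.
    print(relative_azimuth(0.0, -30.0))
```

You'd then use that relative angle to pick which HRTF to apply, and redo it every time the tracker reports a new head angle.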

I was able to mock this up using Max/MSP and a Wii-mote, but mine needs work and I'm getting a bit busy. Hopefully I've presented the problem well. Write me if you want to see my work, but I promise it's a bit sloppy for now.

Ben Collins