3D Audio: The Immersion of Binaural Sound


Far from being a limited medium (essentially a placeholder for the much more immersive experience of watching two talking heads on a video!?), we are only scratching the surface of what’s possible with creative audio in podcasting.

Whilst cinematic visuals can show you something, audio provides you with the tools to visualise it. That’s why you often feel disappointed when the monster is finally revealed in a horror film. Your imagination had conjured up something much more terrible.

Of course, that doesn’t mean you can’t have brilliant video or outstanding visuals. But it does mean we often undersell audio as “sound only” when there is so much more to be done and explored with the medium.

We’ve already written about what podcasts do for your brain, binaural beats, and ASMR. Now, I figured it was time to take a look at the world of 3D audio. So whether you’re a podcast creator looking to create your magnum opus or a listener hungry for something that really stimulates the senses, here’s why 3D audio could be your new favourite playground.

What is 3D Audio?

You’re likely familiar with stereo audio, where each ear can be fed a different signal at the same time. Stereo panning can be used effectively to create a feeling of immersion, but 3D audio takes that to a whole new dimension – quite literally.

3D audio simulates how sound exists in a real-world, three-dimensional space. It’s achieved through a technique known as “spatialisation”, where each sound is placed at a specific direction (left, right, front, back, above, below) and distance. This creates the perception that sounds are coming from different locations in space, not just from fixed speaker positions.

3D audio is also known as binaural audio. However, to my layman’s mind, the subtle difference between the two is that 3D audio is created with software and binaural (spatial) audio is recorded using mics that replicate our ears and heads.

Sound good so far? Well, that depends on whether you’re wearing headphones or not. But enough dad jokes – who came up with this brilliantly immersive, cutting-edge, and obviously brand-new concept?

A Brief History of 3D & Binaural Audio

In 1881 (what a year, eh?), the first known experiment with binaural audio was conducted by Clément Ader in Paris. Ader developed the Théâtrophone, which allowed people to listen to opera performances remotely using two telephone lines, each carrying the sound from a separate microphone placed on the stage. This created a rudimentary form of stereo sound. His podcast made the New & Noteworthy charts in iTunes a few weeks later.

In the 1930s and 40s, the modern concept of binaural recording began to take shape. Harvey Fletcher and his team at Bell Laboratories were pivotal in advancing binaural sound research. They developed the head and torso simulator (a dummy head with microphones in the ear canals) to capture sound more realistically.

The 1960s saw developments in stereophonic sound systems and multi-channel recording techniques, and in the 70s and 80s, the rise of quadraphonic sound (a four-channel system) and surround sound systems in cinemas and home audio systems further pushed the boundaries of 3D audio.

In the mid-90s, the famous “virtual barber shop” recording was created. Here, you could pop on a pair of headphones and experience the sound of clippers and scissors moving around your head, presumably sculpting a glorious 3D middle-parting bowl cut.

It’s no shock to learn that today, in two thousand and twenty-whatever-year-it-is-now, software has taken the driving seat when it comes to creating spectacular 3D sound. But before we dig more into that, here’s a quick note on something that sounds like it’s come straight out of a sci-fi story about a mad scientist – Head-Related Transfer Function.

Head-Related Transfer Function (HRTF)

Whilst nobody is having their head transplanted onto another body here, Head-Related Transfer Function is still a cool and interesting concept.

HRTF refers to the way sounds are filtered and modified by the listener’s body (particularly the head and ears) before reaching the eardrum. HRTFs account for how sound waves are affected by the shape of the outer ear (pinna), the head, and the torso, allowing the brain to determine the direction and distance of a sound source. These functions vary from person to person, making personalised HRTFs a future goal for truly immersive 3D audio.
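
If it helps to see the idea in code, here’s a minimal sketch in Python of what applying an HRTF amounts to: the same mono sound is filtered once for each ear. The two “impulse responses” below are made-up toy numbers (real ones are measured on a person or a dummy head), and the function names are my own, not from any particular library.

```python
def convolve(signal, kernel):
    """Filter `signal` with `kernel` (plain discrete convolution)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i + j] += sample * weight
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one mono source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's right.
# These values are invented for illustration only.
HRIR_RIGHT_EAR = [1.0, 0.0, 0.0, 0.0]  # arrives straight away, full level
HRIR_LEFT_EAR = [0.0, 0.0, 0.0, 0.6]   # arrives 3 samples later, quieter

click = [1.0, 0.0]  # a single short click
left, right = render_binaural(click, HRIR_LEFT_EAR, HRIR_RIGHT_EAR)
```

In this toy case the left ear hears the click three samples later and at a lower level than the right ear, which is the kind of cue the brain reads as “that came from my right”.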

I could go on and talk about ambisonics, interaural time differences, and holographic audio, but, to be honest, I’m already pretty out of my depth here. So, let’s quickly grab a 3D audio expert and find out a little more about actually creating this stuff.

Owl Field: 3D Audio Production

Here’s Michel Lafrance of Owl Field. I’ve been enjoying Michel’s work for the best part of a decade, and he’s my go-to guy for anything 3D audio-related. I reached out to Michel and said something like, “If I send you some questions, could you answer them for me?” and he replied something like, “Yeah, sure.”

Think of this part as the preamble in a podcast where we banter about the weather.

Q. Tell us about Owl Field and the type of audio you create.

I started Owl Field in 2015 with the goal of trying to create the most immersive audio experience possible. Something that made the listener feel like they were not just immersed by the audio, but actually felt like they were in the story world, present alongside the story’s characters, surrounded by the sound effects and music.

Q. What was the catalyst for you becoming a binaural audio creator? 

I’d say that was in 2014. I stumbled across the virtual barbershop on YouTube, which is now a very famous binaural experience. That was my first introduction to binaural audio. And as I was listening, I was already envisioning all the exciting things I could do with this type of perspective for the listener. I knew right away that’s what I wanted to do.

Q. How do you record and mix your productions, and has this evolved over the years? 

Yeah, it’s absolutely evolved.

I started out with the binaural head approach, where the microphone is literally a head with microphones in the ears. And while there are still uses for that approach, I quite quickly realised that I much prefer the post-production approach, which is a more polished sound. Working with a collection of sounds to build a custom soundscape, I find I have more control over the direction of the sound, the accuracy of the mix, the noise floor, and the noise ratio. I like having that precision.

But even since making that decision, it’s been a very fast-moving field. Every year, new companies release their own 3D audio production software. So, yeah, I’m always trying to keep on top of that, on top of the new tech, adjusting my workflow, and I’ve actually found different techniques work better for different types of sounds. And that’s just something that comes with testing and experience.

Q. Tell us about something particularly cool or unique that you created, and are extra proud of.

I’m probably most proud of The Escape Room. It’s a completely interactive 60-minute escape room podcast. And, although Owl Field had done an interactive audio drama before with The Fairy Tree, this was just on a different level, as the player needed to be guided through the game without them accidentally skipping ahead.

There needed to be a variety of audio puzzles, puzzles that were an appropriate difficulty level for most people. There needed to be hints given to the player at just the right time; then, on top of everything, it needed to work out to approximately 60 minutes for the average player. And the fact that this is a full-cast audio drama entirely in 3D, audio puzzles are practically an afterthought. But I always say I can’t thank the Alpha and Beta testers enough because there’s no way I would have been able to work out the timing without them.

Q. Have you ever dabbled in the creation of binaural beats or ASMR?

ASMR elements do tend to sneak their way into the productions. Yeah, just to give the listener an extra thrill, especially when they’re not expecting it—like a mischievous little fairy maliciously whispering just behind your ear.

Q. Why do you think people enjoy 3D audio so much? What’s so unique about it? 

Immersive content is a big thing, and has been for a while, but I find 3D audio even goes beyond just immersive. It gives the listener the sense of presence that you receive from virtual reality experiences. It feels like you’re in the story alongside the characters and surrounded by the sound effects.

Sounds can come at you from any direction. There can be sounds that are very, very distant and quiet. They can wrap around your head. The soundscape has so much more space to use. Characters can be in a myriad of configurations, and if done well, it rewards the listener for listening attentively. It’s just more thrilling, I think, and more engaging than traditional stereo. And I’ve always maintained that I think 3D Audio will eventually do to stereo what stereo did to mono.

Q. There’s a lot of hype around video content right now. What do you say to folks who think of audio as video’s poorer and more limited cousin? 

I’m not sure I would call audio limited, especially with spatial audio. I know the sentiment there is that it’s easier to follow and understand something visually than to understand something that’s audio only.

One weakness of audio fiction, or a trope of audio fiction, is that sometimes you need the narrator to describe the action, or a character to exclaim that the monsters are behind you, or something, so that the listener can follow along. But with spatial audio, you just place the monsters behind the listener and let them experience it themselves.

I think audio is less limited than video in some respects, because audio can use the imagination of the listener, whereas in video you’re telling the viewer exactly what it is. And often the visual is not as spectacular as what the listener might have imagined. So it’s like when a viewer is disappointed with the character design of the film adaptation of their favourite book, for instance.

So I think the real challenge is still about building the willingness in people to take on listening as a prime-time activity, to stop scrolling or passively watching Netflix, and put on headphones in a good listening environment and give it their full attention like they would a film. And I think that’s on us as producers to produce quality, immersive content that rewards the listener for taking the audio seriously.

Q. What advice do you have for anyone who would like to get into 3D audio production? Can you point them to any useful resources?

The first thing to do is determine whether you prefer the raw live sound of spatial recording versus the more polished sound of working with spatial audio in post-production.

You can find examples of spatial recording by searching YouTube for “binaural head” or “ambisonic” recordings. And then, of course, for post-production sound, you can check out stuff on owlfield.com.

Each has pros and cons, not just for the resulting sound but also for the start-up cost, the production, the post workflow, and the ability to edit, so deciding what works best for you is the essential first step.

Apart from that, I think my top tip is about the depth of the soundscape. To create realism, we want to imitate how we hear in real life as best we can, so things that are close are loud, and things that are far away are quiet. If this isn’t the case, the soundscape won’t sound realistic, and the immersion suffers.

That sounds easy enough, but the problem is that we can’t control the listener’s listening environment—they might be using a cheap set of earbuds on a bus—so if you make something quiet, the listener might not be able to hear it at all. Therefore, it’s necessary to consider the balancing act between creating realism and ensuring critical sounds aren’t lost.
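
To put rough numbers on that balancing act, here’s a minimal sketch in Python. It assumes the textbook inverse-distance rule (roughly 6 dB quieter for every doubling of distance) and adds a hypothetical “floor” so that critical sounds never drop below a chosen level. The function name, the 1 m reference and the -30 dB floor are illustrative assumptions, not Michel’s actual workflow.

```python
import math


def distance_gain_db(distance_m, reference_m=1.0, floor_db=-30.0):
    """Level change in dB for a source `distance_m` metres away.

    Follows the inverse-distance rule relative to `reference_m`,
    but never goes below `floor_db` (an assumed safety limit so
    that important sounds stay audible on poor earbuds).
    """
    # Anything closer than the reference distance is left at full level.
    distance_m = max(distance_m, reference_m)
    natural_db = -20.0 * math.log10(distance_m / reference_m)
    return max(natural_db, floor_db)


near = distance_gain_db(2.0)    # follows the natural drop-off
far = distance_gain_db(100.0)   # would be -40 dB, held at the floor
```

With these assumed settings, a sound at 2 m comes out about 6 dB down, while one at 100 m would naturally sit 40 dB down but is held at the -30 dB floor. Where to set that floor is a judgement call between realism and audibility.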

Q. Finally, you mentioned cheap earbuds on a bus. Do you have any tips for optimising the 3D listening experience?

Well, the first thing to say is that a binaural experience will work on any pair of stereo headphones – even with a cheap pair of earbuds, you’ll get a sense of the surrounding soundscape.

It’s similar to other types of listening, though, in that if you’re listening to music on the bus, you won’t catch much detail unless you’ve got some decent headphones on. So, I’d say it depends on the listener. Sure, you can watch a film on your phone, but it is much more captivating on the big screen. For me, 3D audio content is definitely worth putting in whatever effort you can to shut out any surrounding distraction, so either a quiet space in your home or some noise-cancelling headphones during a commute.

Shutting your eyes also improves the 3D effect dramatically, in that the realism of the audio isn’t damaged by conflicting with what you’re seeing.

Thank you so much for your time, Michel!

With 3D Audio, You’re Never in the Same Sound Space Twice

The point of this whirlwind guide to 3D audio wasn’t to convince you to rush out and make your own binaural production (though it’d be fantastic if you did). Obviously, for most non-fiction shows where the focus is on the conversation, 3D audio wouldn’t just be overkill; it’d be outright jarring for your listeners.

Really, this is just part of our ongoing exploration into what is possible with audio. We want to continue to highlight exciting and innovative ways you can work with and present sound. Video definitely has its earned place at the podcasting table, but there is so much more to come from the realm of creative audio. Sometimes, the answer to “What next?” is the development and evolution of what you’re already doing rather than starting from scratch in a whole new medium.

So check out What Podcasts Do for Your Brain, and keep on exploring!

