Ambisonics, especially Higher Order Ambisonics, is great for 3D sound applications. But what if you have spent a long time mixing for a 3D audio format but want to share it with listeners who are only listening on stereo?
The first thing depends if they’re going to be using headphones or loudspeakers. If they’re using headphones then you can create a binaural mix in the usual way. If they are using loudspeakers then binaural is no longer an option (unless you want to go down the fragile transaural route). In this post we will focus on how you can decode from first order Ambisonics to stereo using one of two common options.
Mid-Side Decoding
The first option is probably the simplest – treat the Ambisonics signal as a mid-side recorded scene by taking the W and Y channels, with W being the mid and Y being the side. Then you can make your left and right (L and R) stereo playback channels using begin{eqnarray} L = 0.5(W+Y),\ R = 0.5(W-Y) end{eqnarray}
This is effectively the same as recording a sound field with two cardioid microphones pointing directly left and right. Sounds panned to 90 degrees will play only through the left loudspeaker and those at -90 degrees through the right.
The advantage of this sort of decoding is that it is very conceptually simple and, as long as your DAW can handle the routing, it is even possible to do without any dedicated plugins. It also results in pure amplitude panning, meaning that it has all of the advantages and disadvantages of standard intensity-stereo. However, we’ve got another option to choose from when we want to play back over a stereo system that has some advantages.
UHJ Stereo
A more complex and interesting technique is UHJ. We’re only going to go over how UHJ for stereo listening, but it is worth noting that UHJ is mono compatible and that a 4-channel version exists from which full first-order Ambisonics information that can be retrieved via correct decoding. 3-channel UHJ can get you a 2D (horizontal) decoder by retrieving the W, X and Y channels. A nice property of the 3- and 4-channel versions is that they contain the stereo L and R channels as a subset. This means, importantly, 2-channel UHJ does not require a decoder when played back over two loudspeakers. All you need to do is take the first two channels of the audio stream.
The stereo L and R channels can be calculated using the following equations:begin{eqnarray} Sigma &=& 0.9397W + 0.1856X \ Delta &=& j(-0.3430W + 0.5099X) + 0.6555Y\ L &=& 0.5(Sigma + Delta)\R &=& 0.5(Sigma – Delta)end{eqnarray} where (j) is a 90 degree phase shift.
You can see from these equations, converting to UHJ from first-order Ambisonics results in signals with phase differences between the L and R channels. This creates quite a different impression to the kind of mid-side decoding mentioned above. There will obviously be some room for personal taste as to whether or not UHJ is actually preferred to mid-side decoding. Sound sources placed to the rear of the listener are more diffuse when reproduced of a stereo arrangement than those at the front, while for mid-side decoding there is no sonic distinction between a sound panned to 30 degrees or to 150 degrees.
Beyond front-back distinction, UHJ can actually result in some sounds appearing to originate from outside the loudspeaker pair by a small amount. This is why it is sometimes referred to as Super Stereo. In my experience, this effect is very dependent on the sound being played, both its frequency content and how transient it is.
Because UHJ stereo relies on phase differences between the two channels, any post-processing or mastering applied should preserve the phase relationship between L and R, otherwise there is a very real risk that the final presentation will be phase-y and spatially blurred.
Figure 1 shows the localisation curves for a sound played back over a stereo system where the signal in the Ambisonics domain is panned fully round the listener. Obviously the sound stays to the front, but the actual trajectories between UHJ and mid-side decoding are quite different. (These localisation curves were calculated using the energy vector model of localisation, so they are most appropriate for mid/high frequencies and broadband sounds).
Which of the two stereo loudspeaker decoding strategies you’ll want to use will depend on the needs of your project. Mid-side decoding is simpler and results in pure amplitude panning. UHJ can result in images outside of the loudspeaker base, but relies on the phase information being preserved. If you want to retrieve any spatial information then UHJ is absolutely the way to go.
Tools for Stereo Decoding
I have an old Ambisonics to UHJ transcoder VST that you can download here, but they are old and I am not sure how compatible they are with newer version of Windows and Mac OSX. To remedy that, I’ve been working on an updated version that will provide simple first-order to stereo decoding. Just select which method you want to use and pump some Ambisonics through it. Keep an eye out in the near future for when it is made available!
I’m curious to hear from anyone who has used both techniques what you prefer. Leave a comment below!