One of the key new initiatives in sound reproduction is the move to create immersive sound. We have moved from monaural to stereo to 5.1 to 7.1 and even more channels. Now, we are adding an overhead layer and the ability to create sounds at any arbitrary point in this immersive audio dome. This is being done in theaters and now even in the home.
Dolby Atmos is an object-based solution. That means that each sound is treated as its own “object” and can be reproduced at any point in the aural dome. The audio is delivered as the sound plus a “track” that describes where the sound is in the volume over time. At the theater or home, a processor then renders this object to deliver signal to the right speakers. Since each theater or home can have different speaker types and locations, the processor needs to know this information to properly render the object sound for that room. It is a powerful solution but requires quite a bit of local processing for playback (the source of Dolby’s revenue).
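To make the object idea concrete, here is a minimal Python sketch of a renderer that takes an object's position track and a room's speaker layout and works out how much signal each speaker should get. It is not Dolby's renderer: the function names, the coordinate system, the linear interpolation of the track and the distance-based panning rule are all assumptions chosen for illustration.

```python
import math

def interpolate_position(track, t):
    """Linearly interpolate an object's (x, y, z) position at time t.

    `track` is a list of (time, (x, y, z)) keyframes with strictly
    increasing times (a hypothetical format, not the Atmos metadata format).
    """
    if t <= track[0][0]:
        return track[0][1]
    if t >= track[-1][0]:
        return track[-1][1]
    for (t0, p0), (t1, p1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)
            return tuple(u + a * (v - u) for u, v in zip(p0, p1))

def speaker_gains(position, speakers, rolloff=2.0):
    """Return one gain per speaker for an object at `position`.

    Closer speakers get more signal. Gains are normalized so the sum of
    squared gains is 1, which keeps total power constant in any layout.
    This distance weighting is an illustrative assumption.
    """
    weights = {}
    for name, spk_pos in speakers.items():
        d = math.dist(position, spk_pos)
        weights[name] = 1.0 / (d ** rolloff + 1e-6)  # small term avoids divide-by-zero
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}

# The same object rendered for two different rooms (made-up layouts).
track = [(0.0, (-1.0, 1.0, 0.0)), (2.0, (1.0, 1.0, 1.0))]
stereo_room = {"L": (-1.0, 1.0, 0.0), "R": (1.0, 1.0, 0.0)}
height_room = dict(stereo_room, top=(0.0, 0.0, 1.0))

pos = interpolate_position(track, 1.0)
print(speaker_gains(pos, stereo_room))
print(speaker_gains(pos, height_room))
```

The content (sound plus track) is identical in both calls; only the speaker layout changes, which is why the playback processor needs to know the room.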
During our visit, we toured an Atmos lab where we heard examples of Dolby’s new audio compression codec, AC-4. This has been submitted to the ATSC 3.0 committees as a solution for standardization. It offers higher compression than its AC-3 predecessor and more features, like personalization. Personalization means the ability to change the audio track to the language of your choice at the playback device (assuming this is part of the delivered content). This includes closed captioning as well. It also allows for the selection of different audio tracks. A sports example was shown where you could select the home team announcer, away team announcer or no announcer – just the “on-court” audio.
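The announcer selection in the sports example can be pictured as mixing at the playback device: the ambient “on-court” audio is always present, and the viewer's choice decides which commentary track, if any, is added on top. The Python sketch below illustrates that idea only. It says nothing about how AC-4 actually carries or decodes these tracks, and the function name, track labels and sample values are hypothetical.

```python
def personalized_mix(bed, commentary_tracks, choice=None, commentary_gain=1.0):
    """Mix the ambient bed with the selected commentary track, if any.

    `bed` and each commentary track are equal-length lists of samples.
    `choice` is a key into `commentary_tracks`, or None for no announcer.
    """
    if choice is None:
        return list(bed)  # just the "on-court" audio
    voice = commentary_tracks[choice]
    return [b + commentary_gain * v for b, v in zip(bed, voice)]

# Made-up sample values, purely for illustration.
on_court = [0.1, 0.2, 0.1]
announcers = {
    "home": [0.5, 0.5, 0.5],
    "away": [-0.5, -0.5, -0.5],
}

print(personalized_mix(on_court, announcers, "home"))
print(personalized_mix(on_court, announcers, "away"))
print(personalized_mix(on_court, announcers))  # no announcer
```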
While these capabilities are part of the ATSC 3.0 proposal (and Fraunhofer’s version is also under discussion as part of the standard proposal at the time of writing – Man. Ed.), broadcasters don’t have to wait for ATSC 3.0 to implement them – they can do it now.
We also saw demos of new processing capabilities for TVs with stereo speakers to give them a more “Atmos-like” sound. The demo featured a $400 TV with downward-firing speakers that was modified to offer the Atmos-like sound. There was a slight improvement.
We also saw some movies that used the technology on a sound bar and sub-woofer solution that was quite good for a sound bar. One piece showed music from Saturday Night Live that was processed in real time to demonstrate its use for live events as well. The processed content can all be delivered in less bandwidth than 5.1 sound today.
Dolby is also looking at mobile audio. Improving cell phone audio is not on its roadmap, however, because the company cannot control the delivery ecosystem. It is instead looking at improving the virtual reality experience with immersive sound.
The demo we got featured a Samsung GearVR headset playing two pieces of content. I had the horror movie clip demo where sounds are supposed to come from various directions to help guide my attention in the 360-degree environment. I was not very impressed, however, as the sound was low and of rather poor quality. I heard comments from others that the music video clip was better as the sound of the piano was localized to stay in the same place even as you turned your head. Dolby has more work to do on this one.
One of the more impressive demos was Dolby Voice. This is a voice over IP solution for teleconferencing that really adds value. As was noted in the discussion and demonstration, nearly everyone agrees that the audio quality of speaker phones is poor and that the industry is ripe for innovation – and Dolby delivers. There are over 150 billion minutes of teleconferencing per year, so this is a big opportunity.
Dolby Voice works by processing the microphone audio to extract the voice component and remove background and other noise sources. The processing also enhances the voice so it does not sound so “tinny”. For remote participants, this can be done on cell phones, tablets and laptops. Dolby also has a desktop phone that provides additional processing to derive spatial cues that help to isolate each speaker in the room and process their voice in ways that enhance quality.
For example, one speaker started by talking directly at the speaker phone. He then turned and walked away, lowering his voice and talking toward the wall 20 feet away from the phone. I was on a remote station listening to this and heard a pretty consistent audio level and quality. When this was repeated while I was in the conference room, it was harder to hear him in the room than over the remote connection!
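The steady level heard at the remote station is the kind of result automatic gain control produces. The Python sketch below shows the general idea under simple assumptions: audio arrives in short frames, very quiet frames are treated as background and muted, and every other frame is scaled toward a target loudness. It is not Dolby Voice's algorithm. The function names and thresholds are hypothetical, and a level-based gate like this cannot separate a voice from a loud noise source the way the demo did.

```python
import math

def rms(frame):
    """Root-mean-square level of one non-empty frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def level_frames(frames, target_rms=0.1, noise_floor=0.005, max_gain=20.0):
    """Mute very quiet frames and scale the rest toward a target level.

    All thresholds are illustrative values, not taken from any product.
    """
    out = []
    for frame in frames:
        level = rms(frame)
        if level < noise_floor:
            out.append([0.0] * len(frame))  # treat as background noise
            continue
        gain = min(target_rms / level, max_gain)  # cap the boost
        out.append([s * gain for s in frame])
    return out

# A talker close to the phone, the same talker far away, then room hiss.
near = [0.5, -0.5, 0.5, -0.5]
far = [0.02, -0.02, 0.02, -0.02]
hiss = [0.001, -0.001, 0.001, -0.001]

leveled = level_frames([near, far, hiss])
print([round(rms(f), 3) for f in leveled])
```

After processing, the near and far frames come out at the same level even though the far talker was 25 times quieter at the microphone.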
Another demo featured a remote user with a loud noise source. This was completely filtered out.
The key to the service is control. All audio is routed through a server that does the audio processing.
Dolby says that improving the audio quality of these teleconferences improves productivity and engagement. British Telecom is the lead launch partner, and BT has landed over 90 corporate customers. While two-thirds of teleconferences are audio only, BT has merged Dolby Voice with Webex to allow a video component as well. We think integration with other video services like Skype will really help adoption – and could change the nature of teleconferencing because of the step change in quality.
Dolby says BT is selling the service as a cost saver, as a corporation can eliminate its circuit-switched telecom elements and go entirely to Voice over IP. There is security built in, as this is being offered to banks. -CC