Demonstrations at NAB 2015 will showcase new TV audio system for ATSC 3.0 and Internet streaming in a complete broadcast chain showing live sports and music production.

Fraunhofer IIS, Qualcomm and Technicolor to Demonstrate the World’s First Live Broadcast of MPEG-H Interactive and Immersive TV Audio

Press Release / 10.4.2015

Las Vegas, Nevada, USA, April 10, 2015 – NAB, South Upper Hall: Fraunhofer IIS, Qualcomm Technologies, Inc. (“QTI”), and Technicolor, the three major technology companies behind the MPEG-H Audio standard, are demonstrating this new technology at the NAB 2015 conference in Las Vegas this week. In two separate venues, the companies will offer the world’s first live broadcast demonstration of their new immersive and interactive TV audio system currently proposed for ATSC 3.0 and being developed for over-the-top streaming video services.

MPEG-H Audio is designed to offer broadcasters a cost-effective means to elevate the sound quality of their offerings beyond 5.1 surround while incorporating groundbreaking new interactive and immersive features across the full range of modern viewing devices, from high-end home theaters to tablets, smartphones, and soundbars.

MPEG-H Audio Live Broadcast Demonstration

The end-to-end live production demonstration at the Fraunhofer booth SU3714 will incorporate a live audio feed from a remote truck combined with recorded programming from video servers at the network. The demo will also cover distribution of the new, live content to affiliate stations, insertion of local commercials, and emission to viewers’ living rooms.

Aside from the Fraunhofer prototype audio/video encoders and decoders and a Jünger Audio monitoring unit, all of the equipment in the demonstration is unmodified broadcast equipment used in TV plants and remote trucks today.

The system is based on the new MPEG-H Audio international standard. It offers viewers the ability to choose different audio presentations, such as “home team” or “away team” commentary for a sports event, or volume control over specific audio elements in a program – such as dialogue or sound effects. Viewers are also able to experience immersive sound over loudspeakers, new 3D soundbars, tablet computer speakers, and headphones. Additionally, it is a true multi-screen audio system that tailors playback so programs sound best on a range of devices and environments – from quiet home theaters with speakers to the subway or airport with earbuds.

All of these features will be under the control of the broadcaster or content distributor, providing new creative opportunities, such as the ability to efficiently add languages, players, or official microphones, or, as the three companies have demonstrated, car-to-pit-crew radios at races.

“This system has grown from our pioneering work on Dialogue Enhancement years ago and our early work in immersive sound, as well as our 15 years of providing half the world’s TV sound. In November 2013, we presented the idea of a football game where you could Hear Your Home Team™ and adjust audio elements of a program to your preference. We have progressed to a full, live implementation of the audio path from the field of play at a sports event to the listener’s ears at home or mobile,” said Robert Bleidt, Division General Manager at Fraunhofer USA Digital Media Technologies.

“At Technicolor we are proud to contribute our Scene Based Audio Higher-Order Ambisonics technology to this great opportunity for broadcasters,” said Claude Gagnon, SVP, Content Solutions & Industry Relations. “We continue to invest in the creative community to develop a rich ecosystem for the Film and the Broadcast industry.”

The demonstration will also include prototypes of new consumer devices supporting MPEG-H, including a Technicolor set-top box, Samsung pre-production prototype TV, and Texas Instruments-based audio-video receiver.

To experience this demo at NAB, please visit the Fraunhofer booth SU3714 or contact matthias.rose@iis.fraunhofer.de.

MPEG-H Scene-Based, 4th-Order HOA Audio Demonstration over 7 PCM Channels

In addition, in meeting room SU 201LMR, Qualcomm Technologies will be demonstrating an end-to-end simulated live broadcast of immersive, scene-based MPEG-H audio. Every stage of a live Higher-Order Ambisonics (HOA) production will be demonstrated: from capture of a live 3D musical performance, through efficient transport across a TV plant (NOC to affiliate) and an MPEG-H emission encoder, to playback on consumer devices with various speaker configurations.
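For readers unfamiliar with the term "4th order": a full-sphere (3D) Ambisonics scene of order N is generally described by (N + 1)² coefficient signals, so a 4th-order scene has 25. The short sketch below only illustrates that arithmetic; the function name is hypothetical and is not part of MPEG-H or of any product shown at NAB, and it says nothing about how the demonstration carries the scene over 7 PCM channels.

```python
def hoa_channel_count(order: int) -> int:
    """Return the number of coefficient signals in a full-sphere (3D)
    Ambisonics scene of the given order, i.e. (order + 1) squared.

    Illustrative helper only; the name is an assumption of this sketch.
    """
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2


if __name__ == "__main__":
    # Print the coefficient count for orders 0 through 4.
    for n in range(5):
        print(f"order {n}: {hoa_channel_count(n)} coefficient signals")
```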

“Qualcomm Technologies collaborated on the development and supports the new MPEG-H standard, and is taking important steps towards widespread distribution of HOA and MPEG-H audio across a range of consumer devices,” said Samir Gupta, Vice President, Engineering, Qualcomm Technologies. “Given Qualcomm Technologies’ technology leadership in 4K UHD video and now with MPEG-H audio, Qualcomm Technologies continues to drive innovations that can deliver unprecedented mobile multimedia experiences to consumers.”

As people across the globe consume more and more content on their mobile devices, Qualcomm is committed to advancing multimedia and broadcast experiences. To experience this demo at NAB, please visit the Qualcomm Technologies meeting room at Las Vegas Convention Center South Upper Hall, room SU 201LMR.

Detailed description of the live broadcast demonstration at the Fraunhofer booth SU3714 at NAB:

The audio program will be created in a mockup of a TV remote truck, using the standard equipment present in trucks today, adapted for MPEG-H Audio with a Jünger Audio MPEG-H Audio Monitoring and Authoring Unit. MPEG-H Audio dynamic audio objects will be used live on the air to carry sound effects and MPEG-H Audio static objects will be used for carrying English, foreign language, and venue PA commentary. The audio bed will be mixed in a 5.1 surround plus 4 height speakers (5.1 + 4H) immersive format.

The audio and video from the event will be sent to a Network Operations Center where the live audio and video will be combined with other programming stored on standard video servers. The programming will offer audio formats ranging from stereo to 5.1 surround, to 7.1 + 4H and Higher-Order Ambisonics immersive sound.

Programming from the Network Operations Center will be sent to a Local Affiliate TV station where local advertisements will be added in formats ranging from stereo to immersive sound. The affiliate’s TV signal will then be transmitted to a Technicolor MPEG-H Audio enabled set-top box in a consumer’s living room for playback in the 7.1 + 4H speaker configuration.

The system is designed to offer broadcasters and content creators an easy-to-use integrated system for implementing efficient interactive and immersive audio. The system includes MPEG-H Audio Encoders (envisioned to normally be included in video encoding equipment), MPEG-H Audio Decoders (envisioned to be included in professional IRDs or consumer receivers) and MPEG-H Audio Monitoring and Authoring Units for professional monitoring and content authoring. Additional tools for content editing and preparation in a post-production environment will be announced soon.

Content from the demonstration can also be heard on the Fraunhofer prototype 3D Soundbar, which brings immersive audio to mainstream consumers with a soundbar instead of loudspeakers. The MPEG-H Audio system will also work with other immersive audio speakers and audio receivers entering the market.

An additional demonstration is a movie channel providing IP delivery of feature films in immersive sound using Higher-Order Ambisonics, showing the use of MPEG-H Audio to bring the immersive sound of the cinema to consumers’ homes and mobile devices.

As a further demonstration of MPEG-H Audio as a broadcast-ready system, MPEG-H Audio content will be decoded on a pre-production prototype Samsung TV set, showing the ability to adjust the audio elements of a sports broadcast to a viewer’s preference.

Further background information on the live broadcast demonstration at the Fraunhofer booth SU3714 at NAB:

In a simulated remote truck audio section, pre-recorded microphone signals from an extreme sports event will be mixed live on an unmodified Calrec Artemis console adapted for interactive and immersive sound using the Jünger Audio MPEG-H Audio Monitoring and Authoring Unit. The Audio Monitoring and Authoring Unit extends existing 5.1 broadcast consoles with functions to create static and dynamic audio objects, author metadata, and measure loudness of each of the possible MPEG-H Audio programs. The monitoring unit also allows listening to each dynamic range control profile and audio downmix. The remote truck output feeds a Fraunhofer prototype MPEG-H Audio and H.264 video encoder.

The remote truck signal is received in a mockup of a Network Operations Center (“The MPEG Network”) using a Fraunhofer prototype MPEG-H Audio and H.264 contribution decoder. The output of the decoder flows through a standard unmodified frame synchronizer and HD-SDI router to the NOC's Master Control position, where the truck signal is switched into recorded programming from a video server under automation control. The system allows all supported audio formats, such as stereo, 5.1 surround, 5.1 + 4H, 7.1 + 4H, and Higher-Order Ambisonics, including static or dynamic objects, to exist simultaneously on broadcast servers with no special preparation. The switched program is then input to a Fraunhofer prototype distribution encoder for transmission to affiliate stations.

A simulated affiliate, WMPG-TV, receives the transport stream from the network with a Fraunhofer prototype distribution decoder. The HD-SDI signal is then fed through a frame synchronizer and input to an HD-SDI router. Stored local commercials are inserted under automation control in the signal to the Fraunhofer emission encoder. The emission encoder output is then fed to a Technicolor set-top box in a simulated consumer living room. The signal from the local affiliate may also be received over an Internet connection by a tablet computer. On-screen displays on both devices allow the viewer to select audio presentations prepared by the broadcaster, or directly control the audio elements within limits set by the broadcaster. Additionally, content may be played on the Fraunhofer prototype 3D Soundbar or a prototype AVR based on the Texas Instruments DA830 DSP.
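The idea of viewer control "within limits set by the broadcaster" can be pictured as a simple clamp: the viewer requests a gain change for an audio element, and the receiver applies it only as far as the broadcaster's authored range allows. The following is a minimal sketch of that idea only. The class, function, and field names, as well as the decibel values, are assumptions made for illustration and do not reflect the actual MPEG-H metadata syntax or any decoder's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GainLimits:
    """Hypothetical broadcaster-authored gain range for one audio element, in dB."""
    min_db: float
    max_db: float


def apply_viewer_gain(requested_db: float, limits: GainLimits) -> float:
    """Clamp the viewer's requested gain to the broadcaster's allowed range."""
    return max(limits.min_db, min(limits.max_db, requested_db))


if __name__ == "__main__":
    # Assumed example values: dialogue may be lowered by up to 6 dB
    # or raised by up to 9 dB.
    dialogue_limits = GainLimits(min_db=-6.0, max_db=9.0)
    for request in (-20.0, 3.0, 12.0):
        applied = apply_viewer_gain(request, dialogue_limits)
        print(f"requested {request:+.1f} dB, applied {applied:+.1f} dB")
```

In this sketch a request outside the authored range is silently limited rather than rejected, which is one plausible design choice; a real receiver might instead restrict the on-screen control to the allowed range.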

Broadcaster benefits

Since the MPEG-H Audio system is designed to work over unmodified HD-SDI embedded audio channels, stations can begin implementing MPEG-H Audio features as they choose without changing their internal plant or operating procedures. A four-stage process has been proposed for broadcasters to consider when adopting MPEG-H:

  • Transmission of stereo and surround programming using MPEG-H Audio: This would allow broadcasters to gain the bitrate efficiency and new mobile audio features of MPEG-H Audio without any operational changes.
  • Addition of audio objects for additional languages or alternate commentary, enabling viewers to Hear Your Home Team™ audio or listen to their favorite race driver’s radio.
  • Addition of immersive sound to improve the realism of the sound by adding height channels, Higher-Order Ambisonics, or statically panned objects above the listener.
  • Addition of dynamic audio objects: In contrast to static objects fixed in position, dynamic objects move over time to track video action or provide creative effects. If sound effects are to be panned, for example, a dynamic object can reduce the required bitrate compared to sending a five- or nine-channel static object.

At NAB, dynamic objects will be used to pan sound effects in the remote truck. The position information for the object in each frame of video will then be sent in the MPEG-H Audio control track channel to the network, then the affiliate, and finally through bit stream metadata to the consumer's set-top box or TV. If the MPEG-H Audio encoder and decoder are connected by an HD-SDI path, network signals can be passed through affiliates with no additional equipment. The control track is only needed to take advantage of MPEG-H Audio’s dynamic audio objects or for agile loudness control or channel assignment. Broadcasters can continue to use a fixed loudness level and channel assignment as they do today without the need for control track information.

This system is based on the latest audio coding standard from MPEG, the organization that develops open international standards for audio and video, including MP3, AAC, HE-AAC, MPEG-2, MPEG-4, MPEG-H, MPEG-DASH, AVC/H.264, and HEVC/H.265. While H.264 was used as the video codec for prototypes since mature implementations are readily available, MPEG-H Audio will also work well with HEVC/H.265 and other video codecs.