4K HDR Summit

MPEG-H Audio Live Streaming in Spain

Fraunhofer IIS recently started a live streaming trial using the MPEG-H Audio technology in a demonstration organized by UHD SPAIN, an initiative led by Medina Media. The production company – organizer of the 4K HDR Summit since 2015 – promotes 4K, HDR, and Next Generation Audio in Spain.

The current trial uses content from UHD SPAIN members that was originally produced in stereo. Fraunhofer IIS pre-processed the content using the MPEG-H Audio production technology Dialog+ to provide dialog enhancement for better intelligibility and to enable basic personalization options. This makes it possible for the audience to interact with the content and adapt it to their own preferences within the limits set by the broadcaster in the ADM production metadata.

In addition to the UHD service, Fraunhofer IIS has enabled a second service in HD, using content produced in MPEG-H Audio and provided by the European Broadcasting Union and France Television. This service shows the full capabilities of the technology, from immersive sound to advanced interactivity features based on audio object positioning.

 

MPEG-H Audio – The Next Generation Audio Standard for Broadcast and Streaming

 

With its unique personalization features, the MPEG-H Audio system, substantially developed by Fraunhofer IIS, offers fully adjustable dialogue levels, customizable audio description, multiple languages, and even interactive object positioning. As a result, users can tailor the media consumption experience to their individual preferences and needs. MPEG-H delivers unprecedented customization options, as well as enveloping immersive sound, on every kind of playback device – from home theaters to 3D soundbars to mobile devices.

The MPEG-H Audio system is specified in all major international TV broadcast standards, such as ATSC 3.0, DVB, and SBTVD (ISDB-T broadcast in Brazil), as well as in the mobile broadband standard 3GPP. It has been on air since the launch of the South Korean ATSC 3.0-based UHDTV service in May 2017 and is currently the only Next Generation Audio codec used in commercial TV broadcasting.

Sony’s immersive music streaming ecosystem 360 Reality Audio is based on the MPEG-H Audio system. The technology enables a new generation of music entertainment and is already available through streaming services like Amazon Music HD, Deezer, and Tidal.

Dialog+

Improved speech mix

Many people find it hard to follow speech in broadcasting and streaming due to loud background sounds. A recent survey carried out by Fraunhofer IIS and WDR showed that 68% of the audience across all demographics frequently or very frequently had issues with understanding speech on TV. Dialog+, an MPEG-H technology, addresses this issue and ensures clear speech by allowing the loudness levels of both speech and background sounds to be adapted. To achieve this, it uses a solution based on deep learning and can be applied when only a final mix is available. This makes it possible to customize the speech level to individual requirements.
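The personalization step can be illustrated with a minimal sketch. It assumes the speech and background stems have already been separated (the deep-learning separation itself is not shown) and that the broadcaster allows a certain gain range for the speech level. The function names and the limit values of -6 dB and +9 dB are hypothetical and chosen for illustration only; they are not taken from Dialog+ or the MPEG-H specification.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def personalize_mix(speech, background, requested_speech_db,
                    min_db=-6.0, max_db=9.0):
    """Remix two already separated stems with a user-chosen speech gain.

    min_db and max_db are hypothetical stand-ins for the range a
    broadcaster would allow. The background stays at unity gain.
    Returns the gain that was actually applied and the mixed samples.
    """
    if len(speech) != len(background):
        raise ValueError("stems must have the same number of samples")
    applied_db = clamp(requested_speech_db, min_db, max_db)
    factor = db_to_linear(applied_db)
    mixed = [factor * s + b for s, b in zip(speech, background)]
    return applied_db, mixed


# A request of +20 dB exceeds the assumed limit and is reduced to +9 dB.
applied, mixed = personalize_mix([0.1, -0.2], [0.05, 0.05], 20.0)
print(applied)
```

Clamping before mixing keeps the listener's choice inside the range the content provider considers acceptable, which is the idea behind personalization "within limits".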

Dialog+ is next generation accessibility

Dialog+ is a technology that works particularly well to upgrade older content for which only the final audio mix is available. It also works on today’s legacy systems. Combined with the MPEG-H Audio system, it provides a whole new level of personalization to its users. Thanks to MPEG-H Dialog+, viewers can now select the mix they like and personalize the sound to meet their preferences.

 

Immersive and Personalized Sound for Sports

Service providers are looking for new and innovative ways to enhance their sports programs, especially now, during the COVID-19 pandemic, when audience access to sports events is extremely limited. Services often feature two different streams: one with the empty stadium sound (“Natural Sound”) and a second with additional crowd noise (“Stadium Sound”). With MPEG-H Audio, this additional functionality is brought to consumers’ homes in a much more efficient way, using a single stream while at the same time offering an enhanced quality of experience. Sports fans can now simply choose their favorite version of the content and balance the ratio between the Natural Sound, the Crowd Noise and the commentator.
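The single-stream idea can be sketched as follows: one set of audio objects is transmitted, and each version of the program is simply a preset that switches objects on or off, with optional listener offsets on top. The object labels, preset values and the render function are hypothetical and only illustrate the principle; they do not reflect the actual MPEG-H bitstream syntax or decoder behavior.

```python
# Hypothetical presets: object label -> gain in dB, None = switched off.
PRESETS = {
    "Natural Sound": {"ambience": 0.0, "crowd": None, "commentary": 0.0},
    "Stadium Sound": {"ambience": 0.0, "crowd": 0.0, "commentary": 0.0},
}


def render(objects, preset_name, user_offsets_db=None):
    """Mix the objects of one stream according to a preset.

    objects maps a label to a list of samples of equal length.
    user_offsets_db optionally maps a label to an extra gain in dB.
    Objects that are missing from the preset are treated as off.
    """
    gains = PRESETS[preset_name]
    offsets = user_offsets_db or {}
    length = len(next(iter(objects.values())))
    out = [0.0] * length
    for label, samples in objects.items():
        base_db = gains.get(label)
        if base_db is None:
            continue  # object is switched off in this preset
        factor = 10.0 ** ((base_db + offsets.get(label, 0.0)) / 20.0)
        for i, sample in enumerate(samples):
            out[i] += factor * sample
    return out


stream = {
    "ambience": [0.1, 0.1],
    "crowd": [0.2, 0.2],
    "commentary": [0.3, 0.3],
}
natural = render(stream, "Natural Sound")
stadium = render(stream, "Stadium Sound")
quiet_commentary = render(stream, "Natural Sound", {"commentary": -20.0})
```

Both versions are derived from the same three objects, so nothing has to be transmitted twice.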

Webinar: “Next Generation Audio and Video Technologies”

Part 1: Round table – Meet the Experts

Our experts discuss the current status of standardization and adoption of Next Generation TV technologies in different regions of the world as well as the direction broadcast and streaming are taking in various markets.

Part 2: “Next Generation Audio and Video Technologies”

With the advanced MPEG-H immersive and personalized sound features as well as Advanced HDR by Technicolor, content creators can now enable completely new audio and video experiences for their viewers.

In this session, our experts share their experience gained while working with broadcasters all over the world during major live events that were broadcast or streamed in MPEG-H Audio and Advanced HDR by Technicolor.

Part 3: Hands-on Experience with MPEG-H Audio and Advanced HDR by Technicolor

“How can I produce in HDR and immersive and interactive audio today using my existing production workflows and infrastructure?” This is one of the most important questions we receive from broadcasters. This webinar will give you the answer. Our experts walk you through live production and broadcast, showing you how to enable the most advanced features in your existing facilities.

 

MPEG-H Audio for Live Production and Broadcast

The MPEG-H Audio system is designed to work with today's streaming and broadcast equipment. To learn how to create MPEG-H in live productions for sports events, music shows or any other content, watch our Live Production video series.

The MPEG-H Authoring Suite

The MPEG-H Authoring Suite (MAS) is a set of tools that make the production of MPEG-H Audio content easier, faster, more intuitive and more powerful. They support the recently published MPEG-H ADM Profile, as well as binaural monitoring for immersive audio reproduction over headphones.

 

  • The MPEG-H Authoring Plug-in (MHAPi) takes you through all the steps of creating object- or channel-based MPEG-H Audio productions inside a VST3- or AAX-enabled digital audio workstation (DAW). You will be able to export your immersive and interactive MPEG-H Audio scenes to either MPEG-H Production Format (MPF) or MPEG-H BWF/ADM, containing audio and metadata and ready for distribution via MPEG-H-enabled channels.

 

  • The MPEG-H Authoring Tool (MHAT) is a new software tool for Mac and Windows that helps you create MPEG-H metadata with existing audio material. The MHAT allows for easy MPEG-H authoring without the need for a digital audio workstation (DAW). You can define specific MPEG-H parameters, instantly listen to your configurations and export your authored mixes as MPEG-H Production Format (MPF), as MPEG-H BWF/ADM, or as a template in an XML file.

 

Check out our MPEG-H Authoring Suite tutorial.

Studio Recommendations

3D-Audio or immersive audio mixing for home delivery is done using loudspeakers in an audio control room or near-field mixing environment.

Fraunhofer IIS, with its extensive experience in 3D-Audio mixing and in setting up studios and listening rooms, can provide the structural requirements and technical specifications for a 3D-Audio production environment. Such an environment allows accurate and flexible mixing and reproduction on loudspeaker systems ranging from 1.0 up to 7.1+4H channel layouts.

Fraunhofer IIS offers consultation on room geometry and room acoustics, loudspeaker positioning and electroacoustic performance, and 3D-Audio monitoring and mixing capabilities, and provides recommendations for related literature.

Our comprehensive paper on studio recommendations is a great starting point.

 

Download the Studio Recommendations

Audio Definition Model (ADM)

The Audio Definition Model (ADM) according to ITU-R BS.2076 defines an open metadata format for production, exchange and archiving of Next Generation Audio (NGA) content in file-based workflows. Its comprehensive metadata syntax allows describing many types of audio content including channel-, object-, and scene-based representations for immersive and interactive audio experiences. A serial representation of the Audio Definition Model (S-ADM) is specified in ITU-R BS.2125 and defines a segmentation of the original ADM for use in linear workflows such as real-time production for broadcasting and streaming applications.
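As a rough illustration of what such metadata can look like, the sketch below parses a simplified XML fragment with Python's standard library. The fragment borrows element names from ITU-R BS.2076 (audioProgramme, audioContent, audioObject, audioObjectInteraction, gainInteractionRange), but it is a hypothetical, heavily reduced example and not a complete or conformant ADM document. The IDs, names and gain values are invented, and the helper list_objects is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; not a conformant ADM file.
ADM_FRAGMENT = """
<audioFormatExtended>
  <audioProgramme audioProgrammeID="APR_1001" audioProgrammeName="Main">
    <audioContentIDRef>ACO_1001</audioContentIDRef>
  </audioProgramme>
  <audioContent audioContentID="ACO_1001" audioContentName="Match">
    <audioObjectIDRef>AO_1001</audioObjectIDRef>
    <audioObjectIDRef>AO_1002</audioObjectIDRef>
  </audioContent>
  <audioObject audioObjectID="AO_1001" audioObjectName="Commentary">
    <audioObjectInteraction onOffInteract="1" gainInteract="1">
      <gainInteractionRange gainUnit="dB" bound="min">-6.0</gainInteractionRange>
      <gainInteractionRange gainUnit="dB" bound="max">6.0</gainInteractionRange>
    </audioObjectInteraction>
  </audioObject>
  <audioObject audioObjectID="AO_1002" audioObjectName="Ambience"/>
</audioFormatExtended>
"""


def list_objects(xml_text):
    """Return (id, name, gain bounds) for every audioObject element.

    The gain bounds are a dict such as {"min": -6.0, "max": 6.0};
    it is empty when the object declares no gain interaction range.
    """
    root = ET.fromstring(xml_text.strip())
    result = []
    for obj in root.iter("audioObject"):
        bounds = {}
        for rng in obj.iter("gainInteractionRange"):
            bounds[rng.get("bound")] = float(rng.text)
        result.append(
            (obj.get("audioObjectID"), obj.get("audioObjectName"), bounds)
        )
    return result


for object_id, name, bounds in list_objects(ADM_FRAGMENT):
    print(object_id, name, bounds)
```

Real ADM files are typically carried inside a BWF container and contain many more elements (pack, channel, stream and track formats), so a production tool would need considerably more than this.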

Fraunhofer IIS offers the ADM Info Tool, an application that provides automated conformance verification for ADM files.

Learn more about the MPEG-H ADM Profile

Request the Fraunhofer ADM Info Tool