Dialogue Enhancement

Personalized sound for TV programs


Finding the right balance between dialogue and ambient sound is a major challenge for sound engineers and an increasing cause of audience complaints.1 Fraunhofer IIS developed a backwards-compatible technology that lets audiences adjust the audio balance to suit their personal preferences, listening environment, and hearing abilities.

Dialogue Enhancement has been standardized within DVB as "Advanced Clean Audio Services" for the audio-video-coding toolbox.


Dialogue Enhancement...


  • ...enables users to adjust the volume of individual audio elements within a broadcast program.
    • Example: Adjust the volume of the commentary or the background ambience in live sports coverage
    • Result: Increased speech intelligibility or deeper involvement in the atmosphere of a live event
  • ...enables a cost-efficient audio service for hearing-impaired viewers.
  • ...is backwards compatible with existing receiver infrastructures that play back the default mix.

1 See the presentation by BBC business and technology analyst Phil Greene at the Loudness Summit in London, Dec. 2011.

Working Principle

Dialogue Enhancement working principle
© Fraunhofer IIS


The source signals (for example, speech or music) are analyzed before they are mixed into a single signal. A parametric description of the relation between the signals is generated and transmitted alongside the downmix signal. This parametric side information enables the receiver to adjust the volume of each source individually and thereby improve the intelligibility of dialogue or sports commentary.
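The idea of a downmix plus parametric side information can be illustrated with a heavily simplified sketch. It is not Fraunhofer's algorithm: the single per-frame energy ratio, the frame length, and the function names `encode` and `render` are all assumptions made for illustration, and a practical system would presumably describe the sources per frequency band rather than per broadband frame.

```python
import math

FRAME = 4  # samples per frame; unrealistically small, chosen for illustration


def encode(dialogue, background, frame=FRAME):
    """Sender side: return the downmix and one parameter per frame.

    The parameter is the share of the frame's energy that belongs to the
    dialogue. This is an assumed, simplified stand-in for the
    "parametric description" mentioned in the text.
    """
    downmix = [d + b for d, b in zip(dialogue, background)]
    ratios = []
    for start in range(0, len(downmix), frame):
        e_d = sum(x * x for x in dialogue[start:start + frame])
        e_b = sum(x * x for x in background[start:start + frame])
        total = e_d + e_b
        ratios.append(e_d / total if total > 0 else 0.0)
    return downmix, ratios


def render(downmix, ratios, dialogue_gain_db=0.0, background_gain_db=0.0,
           frame=FRAME):
    """Receiver side: rescale each downmix frame according to user gains.

    The frame gain is chosen so that the output energy matches what the
    mix would have if the two sources had been scaled separately
    (assuming the sources are uncorrelated).
    """
    g_d = 10 ** (dialogue_gain_db / 20)
    g_b = 10 ** (background_gain_db / 20)
    out = []
    for i, r in enumerate(ratios):
        gain = math.sqrt(r * g_d ** 2 + (1 - r) * g_b ** 2)
        out.extend(gain * x for x in downmix[i * frame:(i + 1) * frame])
    return out


if __name__ == "__main__":
    # Toy signals: the first frame is pure dialogue, the second pure background.
    dialogue = [1.0] * 4 + [0.0] * 4
    background = [0.0] * 4 + [1.0] * 4
    downmix, ratios = encode(dialogue, background)
    print(render(downmix, ratios, dialogue_gain_db=6.0))
```

With both gains left at 0 dB, the frame gain is 1 and the receiver reproduces the downmix unchanged, which mirrors the backwards-compatibility property: a receiver that ignores the side information simply plays the default mix.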

Learn more about the working principle in the Dialogue Enhancement technical paper.

To prove the feasibility of the technology and to test user reactions, the BBC and Fraunhofer IIS conducted an experiment during the 2011 Wimbledon tennis tournament. UK listeners of the BBC Radio 5 Live Internet stream could download a special player that allowed them to control the volume of the commentary relative to the court sound. The player was linked to a user survey. According to this survey, the majority of participating listeners regarded the ability to modify the sound balance of their TV or radio as very useful. Further, half of the users said they preferred enhanced court sounds, while the other half favored louder commentary.



Dialogue Enhancement has been designed to enable a personalized listening experience of broadcast TV programs for better speech intelligibility. The technology responds to the observation that even a well-balanced mix delivered by the broadcaster does not always guarantee a satisfactory experience at the receiving end.

Several aspects determine how well each listener comprehends the dialogue of a broadcast program:


Listening environment

The listening environment and the reproduction equipment have a considerable influence on how listeners perceive an audio mix. For instance, when listening with headphones, noisy background audio can mask important dialogue, and thus a different balance of dialogue versus background would be advantageous.


Foreign languages

Listening to programs in non-native languages typically requires greater concentration. A higher dialogue volume relative to the background level makes listening less arduous and helps to improve overall intelligibility. Experiments have shown that an increase of approximately 3 dB in signal-to-noise ratio (SNR) raises dialogue intelligibility to the level listeners achieve in their native language1. In some cases, when the speech material is too complex, 3 dB is not enough. According to Warzybok et al., these situations call for an increase of 5 to 10 dB, depending on the audience’s language skills2.
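To put these decibel figures in perspective, the short sketch below converts a level change in dB into the equivalent amplitude and power ratios using the standard definitions. The function names are illustrative and do not come from the original text.

```python
def db_to_amplitude_ratio(db):
    """Amplitude (voltage, sample value) ratio for a level change in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power (energy) ratio for a level change in dB."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    for db in (3, 5, 10):
        print(f"{db:>2} dB: amplitude x{db_to_amplitude_ratio(db):.2f}, "
              f"power x{db_to_power_ratio(db):.2f}")
```

A 3 dB increase in SNR therefore corresponds to scaling the dialogue amplitude by about 1.41 relative to the background, which roughly doubles its relative power; a 10 dB increase corresponds to a tenfold power ratio.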


Hearing abilities

Individuals with hearing impairments benefit from an increased signal-to-noise ratio (SNR), which helps them enjoy and understand broadcast programs as well as people with normal hearing do. Raising the SNR by as little as 1 dB can considerably increase speech intelligibility. For more information about the effect of an increased SNR, please see Brand and Kollmeier3. Heger and Holube4 as well as Kochkin5 provide more information about the prevalence and development of hearing loss.


With Dialogue Enhancement, TV viewers can individually adjust the incoming audio mix to their abilities and preferences and increase the intelligibility of TV programs.

1 Florentine, M., 1985. Speech perception in noise by fluent, non-native listeners. The Journal of the Acoustical Society of America, 77(S1), p. S106.

2 Warzybok, A. et al., 2010. Influence of the linguistic complexity in relation to speech material on non-native speech perception in noise. DAGA 2010, Berlin.

3 Brand, T. and Kollmeier, B., 2002. Efficient adaptive procedures for threshold and concurrent slope estimates for psychophysics and speech intelligibility tests. The Journal of the Acoustical Society of America, 111(6), 2801. doi:10.1121/1.1479152

4 Heger, D. and Holube, I., 2010. Wie viele Menschen sind schwerhörig? Zeitschrift für Audiologie, 49(2), pp. 61–70.

5 Kochkin, S., 2005. MarkeTrak VII: Hearing loss population tops 31 million people. Hearing Review, 12(7). https://www.hearing.org/hearingorg/document-server/?cfp=hearingorg/assets/File/public/marketrak/MarkeTrak_VII_Hearing_Loss_Population_Tops_31_Million_People.pdf