Fraunhofer upHear® Voice Quality Enhancement

Enhanced Speech Recognition in Smart Devices

The Fraunhofer upHear Voice Quality Enhancement software is designed to facilitate voice-controlled human-machine interaction using the microphones built into mobile phones and smart home devices such as smart speakers. It removes interfering sounds captured by the device’s microphones, extracts the user’s voice and cancels acoustic echoes that would otherwise make it impossible for the human-machine interface (HMI) to understand the user’s request.



With the rapid advances in machine learning over the last few years, voice-controlled human-machine interfaces (HMIs) are becoming more widespread, with applications in several areas, including mobile phones, smart home devices and cars. Voice-controlled HMI systems typically consist of the following processing units:

  • a keyword spotter to wake up the system
  • an Automatic Speech Recognizer (ASR) module to convert speech into text
  • a Natural Language Understanding Interface (NLUI) to enable natural conversations with the machine
  • a Natural Language Generation (NLG) module to generate meaningful feedback for the user
  • a Text-To-Speech (TTS) module to create synthesized speech from text

The input to any voice-controlled HMI is the audio stream captured by the microphones built into the device. In particular, the performance of the keyword spotter and the ASR module depends directly on the quality of the captured voice.


Our Solution

Fraunhofer upHear Voice Quality Enhancement is a fully integrated and flexible solution combining advanced multichannel source localization and beamforming techniques with echo and noise reduction algorithms. It provides outstanding audio quality even under unfavorable acoustic conditions. Advanced acoustic echo cancellation allows for barge-in functionality in an always-listening operation of the voice-controlled HMI.
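To illustrate the general idea behind acoustic echo cancellation, the following minimal sketch uses a textbook normalized least-mean-squares (NLMS) adaptive filter. It is not the upHear implementation, whose internals are not described here. The function name, filter length, step size and the simulated echo path are all assumptions chosen for illustration.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Textbook NLMS adaptive filter (illustrative, not the upHear algorithm).

    far_end: samples sent to the loudspeaker.
    mic:     samples captured by the microphone (contains the echo).
    Returns the residual signal after subtracting the estimated echo.
    """
    w = [0.0] * taps        # adaptive estimate of the loudspeaker-to-mic echo path
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_estimate
        norm = sum(hi * hi for hi in history) + eps
        # Normalized update: step size scaled by the input energy.
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        residual.append(e)
    return residual

# Hypothetical test signal: white noise played back through a made-up echo path.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]  # assumed room response, for illustration only
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual = nlms_echo_cancel(far_end, mic)

# Compare average power over the last 500 samples, after the filter has adapted.
echo_power = sum(s * s for s in mic[-500:]) / 500
residual_power = sum(s * s for s in residual[-500:]) / 500
print(residual_power < echo_power)
```

In a barge-in scenario the microphone signal would also contain the user's voice, which remains in the residual because it is uncorrelated with the loudspeaker signal. Practical systems typically add double-talk handling, which this sketch omits.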

Even though the technology supports single-microphone use cases, we recommend the use of microphone arrays to further improve the user experience in challenging conditions, especially for far-field applications.

Contact us for information on device-specific tuning by our sound engineers and consultancy regarding microphone placements.

Product Features

Fraunhofer upHear Voice Quality Enhancement improves voice quality by an optimized integration of the following functionalities:

  • Acoustic Echo Cancellation (AEC) attenuates echoes originating from the device’s loudspeakers.
  • Direction of Arrival (DOA) estimates the direction of the active talker.
  • Beamforming exploits the spatial diversity offered by an array of microphones to achieve improved directional sound acquisition and extracts the user’s voice even in far-field conditions.
  • Noise Reduction (NR), Dereverberation and Automatic Gain Control (AGC) further enhance the quality of the captured voice.
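
How direction-of-arrival estimation and beamforming work together can be sketched with a simple delay-and-sum beamformer. This is a generic textbook technique, not the upHear algorithm. The sketch assumes a four-microphone linear array, a far-field talker, integer sample delays, and made-up values for the sample rate and microphone spacing. All function names are hypothetical.

```python
import math
import random

C = 343.0             # speed of sound in m/s
FS = 16000            # assumed sample rate in Hz
SPACING = 4 * C / FS  # assumed mic spacing (about 8.6 cm), so lags are whole samples

def shift(signal, delay):
    """Delay a signal by an integer number of samples (negative = advance), zero-padded."""
    n = len(signal)
    return [signal[i - delay] if 0 <= i - delay < n else 0.0 for i in range(n)]

def delay_and_sum(channels, lag):
    """Undo a delay of m * lag samples on microphone m, then average the channels."""
    aligned = [shift(ch, -m * lag) for m, ch in enumerate(channels)]
    return [sum(samples) / len(channels) for samples in zip(*aligned)]

def estimate_lag(channels, max_lag):
    """Pick the inter-microphone lag whose beamformer output has the most power."""
    def power(lag):
        return sum(s * s for s in delay_and_sum(channels, lag))
    return max(range(-max_lag, max_lag + 1), key=power)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Simulated capture: the same voice reaches each microphone 2 samples later
# than the previous one, and each microphone adds its own independent noise.
rng = random.Random(1)
voice = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_lag = 2
channels = [
    [v + rng.gauss(0.0, 1.0) for v in shift(voice, m * true_lag)]
    for m in range(4)
]

lag = estimate_lag(channels, max_lag=4)
# Far-field relation between lag and angle from broadside: sin(theta) = lag * c / (fs * d)
angle = math.degrees(math.asin(lag * C / (FS * SPACING)))
enhanced = delay_and_sum(channels, lag)

print(lag, round(angle, 1))
print(mse(enhanced, voice) < mse(channels[0], voice))
```

Averaging the aligned channels keeps the voice coherent while the independent noise on each microphone partially cancels, which is why the beamformer output is closer to the clean voice than any single microphone signal. Real arrays need fractional delays and frequency-dependent processing, which this sketch leaves out.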

Product Requirements

Fraunhofer upHear Voice Quality Enhancement can be adapted to the unique body and microphone configuration of the device. This enables flexibility in the product design, and ensures optimal performance. Commonly used array geometries such as linear or circular microphone placements are natively supported.
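
As a rough illustration of the two common geometries, the following sketch computes microphone coordinates for a linear and a circular array. The function names, spacing and radius are hypothetical values chosen for the example and do not reflect any upHear configuration format.

```python
import math

def linear_array(num_mics, spacing_m):
    """(x, y) microphone positions in meters on a line centered at the origin."""
    offset = (num_mics - 1) / 2
    return [((i - offset) * spacing_m, 0.0) for i in range(num_mics)]

def circular_array(num_mics, radius_m):
    """(x, y) microphone positions in meters, evenly spaced on a circle."""
    step = 2 * math.pi / num_mics
    return [
        (radius_m * math.cos(i * step), radius_m * math.sin(i * step))
        for i in range(num_mics)
    ]

# Hypothetical examples: a 4-microphone bar and an 8-microphone ring.
bar = linear_array(4, 0.03)
ring = circular_array(8, 0.04)
print(bar)
print(len(ring))
```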

The number and arrangement of microphones needed for multichannel speech enhancement depend on the application scenario and the product design. Typical configurations use 2 or 4 microphones, or up to 8 for the highest-quality operation. Configurations shown in the following graphic are only examples.


Fraunhofer upHear Voice Quality Enhancement is available for licensing. The software library can be provided for:

  • Desktop platforms (Windows, Mac, Linux)
  • Mobile Apps (iOS, Android)
  • Embedded Systems (e.g., ARM Cortex)


If you are interested in licensing software from us, please fill out the request form below.

Request licensing information: upHear Voice Quality Enhancement

To request a price quote or an evaluation license, please fill in and submit the form.
