AI-driven speech technologies

Reliable speech technologies have become far more important with the increased use of videoconferencing and voice-controlled devices

© Fraunhofer IIS and F.M.Eckstein Fotografie
AI methods can remove unwanted sounds from recordings.

The Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS and Fraunhofer IIS have globally unrivaled expertise in the field of speech technologies. Building on this, the institutes are partnering with German industry to develop a completely new voice assistance system. Because the resulting solutions do not depend on US or Asian technologies, they can guarantee data security that complies with European standards. To pool their expertise, the two Fraunhofer Institutes plan to establish a German Center for Speech Technologies, which will form the hub of an extensive ecosystem of start-ups, SMEs, industry and research.

Unlocking the power of AI to solve complex tasks is a particular challenge here. For example, neural networks and machine learning methods can detect the speech signal in a video recording and separate it from other sounds. The noise is then suppressed efficiently, which greatly increases speech intelligibility. When it comes to speech output, AI can generate natural-sounding voices that closely approximate the intonation and emotions of human speakers. Moreover, because AI makes speech characteristics adaptable, a synthetic voice can be tailored to reinforce brand identity – for example, by using the voice of a well-known speaker.