
SEMULIN

Natural human-machine interface for automated driving

Machine learning in human-machine interfaces

Goal

Development of a self-supporting natural human-machine interface for automated driving using multimodal input and output modes including facial expressions, gestures, gaze, and speech.

Combined with considerations of the immediate environment (vehicle interior, etc.), this yields a holistic development approach for a human-machine interface (HMI) tailored to the human senses and based on machine learning.

The system facilitates interaction while enhancing user experience and acceptance of autonomous driving across all areas of use. The methods for measuring user satisfaction developed in the course of the project will form the basis for other projects with a similar design.

Motivation and challenge

The user interface has a key role to play in automated driving: in light of the growing complexity of systems and the demands placed on them, user interfaces must be able to support a range of functions, process information, and offer a high degree of operator friendliness.

At present, various constraints limit natural interaction between driver, passengers, and vehicle, particularly where a shift between the various modes (gestures, speech, lighting, loudspeaker, etc.) or combinations thereof is concerned.

To enable human-centered interaction of this sort, it is necessary to take these different modes – along with contextual information – into account, and combine them in a meaningful way. A particular challenge is correctly identifying the user’s precise intentions and generating actions on the part of the system accordingly.

A human-machine interface with intelligent sensor interpretation and data fusion

To develop a human-centered human-machine interface with a tailored system architecture that takes the overall context into account, we examine all available modes so that the resulting aggregated sensor data can be interpreted and combined intelligently.

To this end, we employ established technologies such as our SHORE® analysis software for video-based emotion recognition and additional integrated sensors to pre-process and intelligently interpret the data. Rule-based and AI-based methods, implemented multimodally, allow us to form connections within the data.
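As a rough illustration of how rule-based fusion of such multimodal data can work, the Python sketch below weights per-modality confidence scores and picks the best-supported intent. All names, labels, weights, and the fusion rule are illustrative assumptions for this sketch, not the SHORE® API or the project's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class ModalityReading:
    """One hypothetical observation from a single modality."""
    label: str        # e.g. "select_poi", "navigate_home"
    confidence: float # 0.0 .. 1.0

def fuse(readings: dict[str, ModalityReading],
         weights: dict[str, float]) -> tuple[str, float]:
    """Late fusion: sum each modality's weighted confidence per label
    and return the label with the highest combined score."""
    scores: dict[str, float] = {}
    for modality, reading in readings.items():
        w = weights.get(modality, 1.0)
        scores[reading.label] = scores.get(reading.label, 0.0) + w * reading.confidence
    best = max(scores, key=scores.get)
    return best, scores[best]

# Example: gaze and gesture agree on one intent, speech is uncertain.
readings = {
    "gaze":    ModalityReading("select_poi", 0.8),
    "gesture": ModalityReading("select_poi", 0.6),
    "speech":  ModalityReading("navigate_home", 0.4),
}
weights = {"gaze": 0.5, "gesture": 0.7, "speech": 1.0}
intent, score = fuse(readings, weights)
print(intent)  # select_poi
```

Here two weaker modalities that agree (gaze and gesture, combined score 0.82) outvote a single stronger but conflicting one (speech, 0.4), which is the basic appeal of fusing modes rather than trusting any one sensor in isolation.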

Intelligent sensor interpretation and fusion delivers information on the state of the user, their intentions, and their potential reactions. Novel approaches such as interactive learning are employed in order to constantly adapt the system to the needs of each user.
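The interactive-learning idea above, constantly adapting the system to each user, can be sketched as a simple online update of modality weights. The multiplicative rule, learning rate, and modality names here are illustrative assumptions, not the project's actual method:

```python
def adapt(weights: dict[str, float],
          contributing: list[str],
          accepted: bool,
          lr: float = 0.1) -> dict[str, float]:
    """Multiplicative update: reinforce modalities that supported an
    action the user accepted, dampen them after a user correction."""
    factor = 1.0 + lr if accepted else 1.0 - lr
    return {m: (w * factor if m in contributing else w)
            for m, w in weights.items()}

# The user rejects an action that gaze and gesture had supported,
# so those two weights shrink by 10% while speech stays unchanged.
weights = {"gaze": 0.5, "gesture": 0.7, "speech": 1.0}
weights = adapt(weights, ["gaze", "gesture"], accepted=False)
```

Over many interactions, such an update lets the system learn which modalities are reliable for a given user, e.g. down-weighting gesture input for a driver whose gestures are frequently misread.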

Partners

Elektrobit Automotive GmbH (project coordinator)

  • Concept, design, HMI, system architecture, demonstrator setup

Fraunhofer IIS | Smart Sensing and Electronics Division, Audio and Media Technologies Division

  • Video-based facial expression and emotion recognition
  • Speech platform, dialog systems, speaker recognition

audEERING GmbH

  • Affective voice computing, machine learning, speech-based emotion recognition

Eesy Innovation GmbH

  • Output modes, user-centered controllable lighting solution

Blickshift

  • Eye tracking, multimodal framework

Infineon Technologies AG

  • HPC architectures, intelligent sensors, gesture recognition, safety concepts

Ulm University | Institute of Media Informatics, Institute of Psychology and Education

  • Media informatics, HMI, multimodality, ELSI
  • Human factors, psychological modeling, empirical methods

Mercedes-Benz

  • Technical consulting and reviews in the fields of HMI, UX, multimodality

ELSI

In order to maximize acceptance for the system, throughout the project the associated ethical, legal, and social implications (ELSI) will be examined, assessed, and incorporated into the research approach in accordance with the applicable guidelines.

For further information concerning the SEMULIN joint project please contact Stephan Gick.

stephan.gick@iis.fraunhofer.de