The OnHW Dataset: Online Handwriting Recognition from IMU-Enhanced Ballpoint Pens with Machine Learning
Felix Ott, Mohamad Wehbi, Tim Hamann, Jens Barth, Björn Eskofier, Christopher Mutschler
In: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2020, Article No.: 92, https://doi.org/10.1145/3411842
This paper presents a handwriting recognition (HWR) system that deals with online character recognition in real-time. Our sensor-enhanced ballpoint pen delivers sensor data streams from triaxial acceleration, gyroscope, magnetometer and force signals at 100 Hz. As most existing datasets do not meet the requirements of online handwriting recognition and as they have been collected using specific equipment under constrained conditions, we propose a novel online handwriting dataset acquired from 119 writers consisting of 31,275 uppercase and lowercase English alphabet character recordings (52 classes) as part of the UbiComp 2020 Time Series Classification Challenge. Our novel OnHW-chars dataset allows for the evaluation of uppercase, lowercase, and combined classification tasks in both writer-dependent (WD) and writer-independent (WI) settings, and we show that properly tuned machine learning pipelines as well as deep learning classifiers (such as CNNs, LSTMs, and BiLSTMs) yield accuracies of up to 90 % for the WD task and 83 % for the WI task for uppercase characters. Our baseline implementations, together with the rich and publicly available OnHW dataset, serve as a starting point for future research in that area.
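The difference between the WD and WI tasks comes down to how recordings are split into training and test sets. The following is a minimal sketch of the two split styles, not the protocol used in the paper: the sample layout (a dict with `writer` and `label` keys), the function names, and the 20 % test fraction are all assumptions made for illustration.

```python
import random


def writer_independent_split(samples, test_fraction=0.2, seed=0):
    """Hold out whole writers: no writer appears in both train and test."""
    writers = sorted({s["writer"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(writers)
    n_test = max(1, round(len(writers) * test_fraction))
    test_writers = set(writers[:n_test])
    train = [s for s in samples if s["writer"] not in test_writers]
    test = [s for s in samples if s["writer"] in test_writers]
    return train, test


def writer_dependent_split(samples, test_fraction=0.2, seed=0):
    """Shuffle all recordings: the same writer may appear in both sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data (hypothetical): 10 writers, 3 character recordings each.
samples = [{"writer": w, "label": c} for w in range(10) for c in "ABC"]
wi_train, wi_test = writer_independent_split(samples)
wd_train, wd_test = writer_dependent_split(samples)
```

In the WI split a classifier is always tested on handwriting from people it has never seen, which is consistent with the lower WI accuracy reported above.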
Tobias Feigl, Sebastian Kram, Philipp Woller, Ramiz H. Siddiqui, Michael Philippsen, Christopher Mutschler
In: Sensors 2020, 20(13), 3656; https://doi.org/10.3390/s20133656
Pedestrian Dead Reckoning (PDR) uses inertial measurement units (IMUs) and combines velocity and orientation estimates to determine a position. The estimation of the velocity is still challenging, as the integration of noisy acceleration and angular speed signals over a long period of time causes large drifts. Classic approaches to estimate the velocity optimize for specific applications, sensor positions, and types of movement and require extensive parameter tuning. Our novel hybrid filter combines a convolutional neural network (CNN) and a bidirectional recurrent neural network (BLSTM) (that extract spatial features from the sensor signals and track their temporal relationships) with a linear Kalman filter (LKF) that improves the velocity estimates. Our experiments show the robustness against different movement states and changes in orientation, even in highly dynamic situations. We compare the new architecture with conventional, machine, and deep learning methods and show that from a single non-calibrated IMU, our novel architecture outperforms the state-of-the-art in terms of velocity (≤0.16 m/s) and traveled distance (≤3 m/km). It also generalizes well to different and varying movement speeds and provides accurate and precise velocity estimates.
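To illustrate the role of a linear Kalman filter placed after a learned velocity estimator, here is a minimal scalar sketch. It is not the filter from the paper: the random-walk state model, the variance values, and the function name `kalman_filter_velocity` are assumptions, and the network output is replaced by a hand-written list of noisy values.

```python
def kalman_filter_velocity(measurements, process_var=1e-3, meas_var=0.05):
    """Scalar linear Kalman filter with a random-walk velocity model.

    measurements: noisy per-step velocity estimates (a stand-in for the
    output of a learned estimator). Returns the filtered estimates.
    """
    x = measurements[0]  # initial state: first measurement
    p = 1.0              # initial state variance (assumed, deliberately large)
    filtered = []
    for z in measurements:
        # Predict: velocity assumed constant, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        filtered.append(x)
    return filtered


# Hypothetical noisy velocity estimates around a walking speed of 1.4 m/s.
noisy = [1.52, 1.31, 1.47, 1.22, 1.60, 1.38, 1.29, 1.45, 1.51, 1.33]
smoothed = kalman_filter_velocity(noisy)
print([round(v, 3) for v in smoothed])
```

A smaller `process_var` relative to `meas_var` makes the filter trust its own prediction more and smooths harder, at the cost of reacting more slowly to real changes in speed.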
ViPR: Visual-Odometry-aided Pose Regression for 6DoF Camera Localization
Felix Ott, Tobias Feigl, Christoffer Löffler, Christopher Mutschler
In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Visual Odometry (VO) accumulates a positional drift in long-term robot navigation tasks. Although Convolutional Neural Networks (CNNs) improve VO in various aspects, VO still suffers from moving obstacles, discontinuous observation of features, and poor textures or visual information. While recent approaches estimate a 6DoF pose either directly from (a series of) images or by merging depth maps with optical flow (OF), research that combines absolute pose regression with OF is limited. We propose ViPR, a novel modular architecture for long-term 6DoF VO that leverages temporal information and synergies between absolute pose estimates (from PoseNet-like modules) and relative pose estimates (from FlowNet-based modules) by combining both through recurrent layers. Experiments on known datasets and on our own Industry dataset show that our modular design outperforms the state of the art in long-term navigation tasks.
Localization Limitations of ARCore, ARKit, and Hololens in Dynamic Large-scale Industry Environments
Tobias Feigl, Andreas Porada, Steve Steiner, Christoffer Löffler, Christopher Mutschler, Michael Philippsen
In: Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 1: GRAPP, ISBN 978-989-758-402-2, pages 307-318. DOI: 10.5220/0008989903070318
Augmented Reality (AR) systems are envisioned to soon be used as smart tools across many Industry 4.0 scenarios. The main promise is that such systems will make workers more productive when they can obtain additional situationally coordinated information both seamlessly and hands-free. This paper studies the applicability of today’s popular AR systems (Apple ARKit, Google ARCore, and Microsoft Hololens) in such an industrial context (large area of 1,600 m², long walking distances of 60 m between cubicles, and dynamic environments with volatile natural features). With an elaborate measurement campaign that employs a sub-millimeter accurate optical localization system, we show that for such a context, i.e., when a reliable and accurate tracking of a user matters, the Simultaneous Localization and Mapping (SLAM) techniques of these AR systems are a showstopper. Out of the box, these AR systems are far from useful even for normal motion behavior. They accumulate an average error of about 17 m per 120 m, with a scaling error of up to 14.4 cm/m that is quasi-directly proportional to the path length. By adding natural features, the tracking reliability can be improved, but not enough.
NLOS Detection using UWB Channel Impulse Responses and Convolutional Neural Networks
Maximilian Stahlke, Sebastian Kram, Christopher Mutschler, Thomas Mahr
In: 2020 International Conference on Localization and GNSS (ICL-GNSS)
Indoor environments often pose challenges to RF-based positioning systems. Typically, objects within the environment influence the signal propagation due to absorption, reflection, and scattering effects. This results in errors in the estimation of the time of arrival (TOA) and hence leads to errors in the position estimation. Recently, different approaches based on classical, feature-based machine learning (ML) have successfully detected such obstructions based on channel impulse responses (CIRs) of ultra wideband (UWB) positioning systems. This paper applies different convolutional neural network architectures (ResNet, Encoder, FCN) to detect non line-of-sight (NLOS) channel conditions directly from the CIR raw data. A realistic measurement campaign is used to train and evaluate the algorithms. The proposed methods highly outperform the feature-based ML baselines while still using low network complexities. We also show that the models generalize well to unknown receivers and environments and that positioning filters benefit significantly from the identification of NLOS measurements.
Real-Time Gait Reconstruction For Virtual Reality Using a Single Sensor
Tobias Feigl, Lisa Gruner, Christopher Mutschler, Daniel Roth
In: 2020 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)
Embodying users through avatars based on motion tracking and reconstruction is an ongoing challenge for VR application developers. High quality VR systems use full-body tracking or inverse kinematics to reconstruct the motion of the lower extremities and control the avatar animation. Mobile systems are limited to the motion sensing of head-mounted displays (HMDs) and typically cannot offer this. We propose an approach to reconstruct gait motions from a single head-mounted accelerometer. We train our models to map head motions to corresponding ground truth gait phases. To reconstruct leg motion, the models predict gait phases to trigger equivalent synthetic animations. We designed four models: a threshold-based, a correlation-based, a Support Vector Machine (SVM)-based, and a bidirectional long short-term memory (BLSTM)-based model. Our experiments show that, while the BLSTM approach is the most accurate, only the correlation approach runs on a mobile VR system in real time with sufficient accuracy. Our user study with 21 test subjects examined the effects of our approach on simulator sickness and showed significantly fewer negative effects on disorientation.
A Sense of Quality for Augmented Reality Assisted Process Guidance
Anes Redzepagic, Christoffer Löffler, Tobias Feigl, Christopher Mutschler
In: 2020 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct)
The ongoing automation of modern production processes requires novel human-computer interaction concepts that support employees in dealing with the unstoppable increase in time pressure, cognitive load, and the required fine-grained and process-specific knowledge. Augmented Reality (AR) systems support employees by guiding and teaching work processes. Such systems still lack a precise process quality analysis (monitoring), which is, however, crucial to close gaps in the quality assurance of industrial processes. We combine inertial sensors, mounted on work tools, with AR headsets to enrich modern assistance systems with a sense of process quality. For this purpose, we develop a Machine Learning (ML) classifier that predicts quality metrics from a 9-degrees-of-freedom inertial measurement unit, while we simultaneously guide and track the work processes with a HoloLens AR system. In our user study, 6 test subjects perform typical assembly tasks with our system. We evaluate the tracking accuracy of the system based on a precise optical reference system and evaluate the classification of each work step quality based on the collected ground truth data. Our evaluation shows a tracking accuracy of 4.92 mm for fast dynamic movements, and our classifier predicts the actions carried out with a mean F1 score of 93.8%.
High-Speed Collision Avoidance using Deep Reinforcement Learning and Domain Randomization for Autonomous Vehicles
Georgios D. Kontes, Daniel D. Scherer, Tim Nisslbeck, Janina Fischer, Christopher Mutschler
In: 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC)
Recently, deep neural networks trained with Imitation-Learning techniques have managed to successfully control autonomous cars in a variety of urban and highway environments. One of the main limitations of policies trained with imitation learning that has become apparent, however, is that they show poor performance when having to deal with extreme situations at test time, such as high-speed collision avoidance, since there is not enough data available from such rare cases during training. In our work, we take the stance that training complex active safety systems for vehicles should be performed in simulation and the transfer of the learned driving policy to the real vehicle should be performed utilizing simulation-to-reality transfer techniques. To communicate this idea, we set up a high-speed collision avoidance scenario in simulation and train the safety system with Reinforcement Learning. We utilize Domain Randomization to enable simulation-to-reality transfer. Here, the policy is not trained on a single version of the setup but on several variations of the problem, each with different parameters. Our experiments show that the resulting policy is able to generalize much better to different values for the vehicle speed and distance from the obstacle compared to policies trained in the non-randomized version of the setup.
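The idea of training on several variations of the setup instead of a single one can be sketched as follows. This is only an illustration of domain randomization over the two parameters named in the abstract (vehicle speed and distance from the obstacle); the `Scenario` class, the value ranges, and the nominal values are hypothetical and not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    speed_mps: float            # vehicle speed at the start of the episode
    obstacle_distance_m: float  # initial distance to the obstacle


def sample_scenario(rng, speed_range=(20.0, 40.0), distance_range=(30.0, 80.0)):
    """Draw one randomized variation of the collision avoidance setup."""
    return Scenario(rng.uniform(*speed_range), rng.uniform(*distance_range))


def training_scenarios(n_episodes, randomize, seed=0):
    """Return the scenario used for each training episode.

    With randomize=False every episode uses the same nominal setup;
    with randomize=True each episode gets freshly sampled parameters.
    """
    rng = random.Random(seed)
    nominal = Scenario(30.0, 50.0)  # assumed nominal values
    return [sample_scenario(rng) if randomize else nominal
            for _ in range(n_episodes)]


randomized = training_scenarios(100, randomize=True)
fixed = training_scenarios(100, randomize=False)
```

A reinforcement learning loop would reset the simulator with one of these scenarios at the start of each episode, so the policy never sees the same fixed configuration throughout training.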
IALE: Imitating Active Learner Ensembles
Christoffer Löffler, Karthik Ayyalasomayajula, Sascha Riechel, Christopher Mutschler
Active learning (AL) prioritizes the labeling of the most informative data samples. However, the performance of AL heuristics depends on the structure of the underlying classifier model and the data. We propose an imitation learning scheme that imitates the selection of the best expert heuristic at each stage of the AL cycle in a batch-mode pool-based setting. We use DAGGER to train the policy on a dataset and later apply it to datasets from similar domains. With multiple AL heuristics as experts, the policy is able to reflect the choices of the best AL heuristics given the current state of the AL process. Our experiments on well-known datasets show that we outperform both state-of-the-art imitation learners and heuristics.
Ashutosh Mishra, Christoffer Löffler, Axel Plinge
In: Workshop on Energy Efficient Machine Learning and Cognitive Computing; Saturday, December 05, 2020 Virtual (from San Jose, California, USA)
Given the presence of deep neural networks (DNNs) in all kinds of applications, the question of optimized deployment is becoming increasingly important. One important step is the automated size reduction of the model footprint. Of all the methods emerging, post-training quantization is one of the simplest to apply. Without needing long processing or access to the training set, a straightforward reduction of the memory footprint by an order of magnitude can be achieved. A difficult question is which quantization methodology to use and how to optimize different parts of the model with respect to different bit widths. We present an in-depth analysis of different types of networks for audio, computer vision, medical, and hand-held manufacturing tool use cases. Each is compressed with fixed and adaptive quantization and with fixed and variable bit widths for the individual tensors.
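As a rough illustration of why post-training quantization needs neither retraining nor the training set, the sketch below applies uniform affine quantization to a single tensor at two bit widths. It is a generic textbook scheme, not the method analyzed in the paper; the function names and the example weights are assumptions.

```python
def quantize(weights, bits=8):
    """Uniform affine quantization of one tensor (given as a flat list).

    Returns integer codes plus the scale and offset needed to decode them.
    """
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate floating-point weights."""
    return [lo + c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error at the given bit width."""
    codes, scale, lo = quantize(weights, bits)
    restored = dequantize(codes, scale, lo)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical weights of one small tensor.
weights = [-0.8, -0.1, 0.0, 0.25, 0.9]
err8 = max_error(weights, bits=8)
err4 = max_error(weights, bits=4)
```

Choosing `bits` per tensor, rather than once for the whole model, corresponds to the variable bit width setting mentioned above: tensors that tolerate a larger reconstruction error can be stored with fewer bits.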