Online Handwriting Recognition from Sensor-Enhanced Pens

Handwriting is an important skill. In children, it benefits motor control, visual-motor integration, proprioception and sustained attention. Studies have also shown that handwriting improves the brain’s processing capabilities compared to typing notes on a laptop. Furthermore, it enhances the users’ ideation abilities. Despite these scientific advantages, the main reason for most people to use pen and paper is just the comfortable, analog, natural look-and-feel.

The OnHW-chars Dataset

The dataset can be downloaded here (895 MB). (Update: 2021-06-30)

For more information, see the readme.pdf file.

 

Data Acquisition

To obtain the sensor data, we implemented a recording app that connects to a DigiPen and tells the volunteers which letter to write. These are some of the constraints that were met during the recordings:

  • The recordings were conducted sitting on a chair in front of a table.
  • The writing surface was horizontal.
  • Normal, white paper sheets (about 80g/m^2) were used to write upon.
  • The sheet was padded by five additional sheets.
  • There was no guideline concerning the size of the handwriting. The subjects were asked to use a size that is natural for them.
  • There was no guideline concerning the way of holding the pen. The subjects were asked to use a position that is natural for them.
  • The volunteers were asked to keep the STABILO logo facing up so that the pen orientation stayed consistent across recordings.
  • Participants could choose freely between print and cursive writing styles.
  • Only right-handed recordings are released.
 

Sensors

Hardware

The STABILO DigiPen is a sensor-enhanced ballpoint pen with internal data processing capabilities. Its Bluetooth module enables live streaming of sensor data to a connected device. The pen’s internal power source lasts at least 17 hours and is rechargeable via micro USB. Its diameter is 15 mm, its overall length is 167 mm, and it weighs 25 g, which, along with its ergonomic soft-touch grip zone, makes it comfortable and easy to use.

Each DigiPen is equipped with five sensors.

  • Front accelerometer (STM LSM6DSL)
  • Gyroscope (STM LSM6DSL)
  • Rear accelerometer (Freescale MMA8451Q)
  • Magnetometer (ALPS HSCDTD008A)
  • Force sensor (ALPS HSFPAR003A)

Sensor Data

The sensors’ raw data stream is provided in the files called sensor_data.csv. Each file consists of 15 columns:

  • Millis: The timestamp when the data were processed on the tablet computer that the pen was connected to during recording
  • Acc1 X, Acc1 Y, Acc1 Z: The values of the front accelerometer in three dimensions
  • Acc2 X, Acc2 Y, Acc2 Z: The values of the rear accelerometer in three dimensions
  • Gyro X, Gyro Y, Gyro Z: The gyroscope values in three dimensions
  • Mag X, Mag Y, Mag Z: The magnetometer values in three dimensions
  • Force: The force with which the pen tip touches the surface
  • Time: A sample counter
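
The column layout above can be read with a few lines of Python. The following is a minimal sketch, not the official loader: the exact header spelling and the delimiter used in sensor_data.csv are assumptions (check readme.pdf and adjust), and the helper names read_sensor_rows and channel are hypothetical.

```python
import csv
import io

# Column names as listed above. The exact header spelling and the delimiter
# in sensor_data.csv are assumptions; adjust them to the actual files.
COLUMNS = [
    "Millis",
    "Acc1 X", "Acc1 Y", "Acc1 Z",
    "Acc2 X", "Acc2 Y", "Acc2 Z",
    "Gyro X", "Gyro Y", "Gyro Z",
    "Mag X", "Mag Y", "Mag Z",
    "Force",
    "Time",
]


def read_sensor_rows(stream, delimiter=","):
    """Parse a sensor_data.csv-like text stream into a list of float dicts."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    rows = []
    for record in reader:
        # Strip stray whitespace around header names before the lookup.
        cleaned = {key.strip(): value for key, value in record.items()}
        rows.append({name: float(cleaned[name]) for name in COLUMNS})
    return rows


def channel(rows, name):
    """Return one column (for example "Force") as a list in recording order."""
    return [row[name] for row in rows]


# Tiny made-up sample with two rows, only to show the call pattern.
sample = io.StringIO(
    ",".join(COLUMNS) + "\n"
    + ",".join(str(i) for i in range(15)) + "\n"
    + ",".join(str(i * 2) for i in range(15)) + "\n"
)
rows = read_sensor_rows(sample)
force = channel(rows, "Force")
```

For a real recording, pass an opened file instead of the in-memory sample, for example open(path, newline="") with the delimiter the files actually use.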
 

Citation

If you use the OnHW dataset, please cite:

Felix Ott*, Mohamad Wehbi*, Tim Hamann, Jens Barth, Björn Eskofier, and Christopher Mutschler. The OnHW Dataset: Online Handwriting Recognition from IMU-Enhanced Ballpoint Pens with Machine Learning. In Proc. of the ACM Interact. Mob. Wearable Ubiquitous Technol. (IMWUT), vol. 4, no. 3, article 92, pages 1-20, Cancún, Mexico, September 2020, doi: 10.1145/3411842.

 

BibTex:

@inproceedings{ott_onhw,

author = {Felix Ott and Mohamad Wehbi and Tim Hamann and Jens Barth and Björn Eskofier and Christopher Mutschler},

title = {{The OnHW Dataset: Online Handwriting Recognition from IMU-Enhanced Ballpoint Pens with Machine Learning}},

booktitle = {Proc. of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)},

volume = {4(3), article 92},

pages = {1--20},

address = {Canc\'{u}n, Mexico},

month = sep,

year = {2020},

doi = {10.1145/3411842}

}

Pen Tip Reconstruction and Classification from Online Handwriting

The supplementary material of the paper and the dataset can be downloaded here (28 MB).
For more information, see the readme.txt file.


Citation

If you use the dataset, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, and Christopher Mutschler. Joint Classification and Trajectory Regression of Online Handwriting using a Multi-Task Learning Approach. In Proc. of the IEEE/CVF Winter Conf. on Applications of Computer Vision (WACV), pages 266-276, Waikoloa, HI, January 2022, doi: 10.1109/WACV51458.2022.00131.

 

BibTex:
@inproceedings{ott_wacv,
author = {Felix Ott and David Rügamer and Lucas Heublein and Bernd Bischl and Christopher Mutschler},

title = {{Joint Classification and Trajectory Regression of Online Handwriting using a Multi-Task Learning Approach}},

booktitle = {Proc. of the IEEE/CVF Winter Conf. on Applications of Computer Vision (WACV)},

address = {Waikoloa, HI},

pages = {266--276},

month = jan,

year = {2022},

doi = {10.1109/WACV51458.2022.00131}
}

Sequence-based OnHW Datasets

If you use the dataset, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Tim Hamann, Jens Barth, Bernd Bischl, and Christopher Mutschler. Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens. In International Journal on Document Analysis and Recognition (IJDAR), September 2022, doi: 10.1007/s10032-022-00415-6.

 

BibTex:

@inproceedings{ott_ijdar,

author = {Felix Ott and David Rügamer and Lucas Heublein and Tim Hamann and Jens Barth and Bernd Bischl and Christopher Mutschler},

title = {{Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens}},

booktitle = {International Journal on Document Analysis and Recognition (IJDAR)},

month = sep,

year = {2022},

doi = {10.1007/s10032-022-00415-6}

}

Dataset | Right-handed, writer-dependent | Right-handed, writer-independent | Left-handed, writer-dependent | Left-handed, writer-independent
OnHW-equations | OnHW-equations_dep.zip | OnHW-equations_indep.zip | OnHW-equations_dep_L.zip | OnHW-equations_indep_L.zip
OnHW-words500 | OnHW-words500_dep.zip | OnHW-words500_indep.zip | OnHW-words500_dep_L.zip | OnHW-words500_indep_L.zip
OnHW-wordsTraj | OnHW-wordsTraj_person1.zip, OnHW-wordsTraj_person2.zip | - | - | -
ICROW | ICROW_dep.zip | ICROW_indep.zip | - | -

We used a writer-dependent and a writer-independent split of the VNOnDB and IAM-OnDB datasets. For more information, see the links for the VNOnDB dataset and for the IAM-OnDB dataset.
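
The difference between the two split types can be illustrated with a short sketch. It shows the general idea only and is not how the released splits were produced: the "writer" field name, the every-fifth-sample rule, and both function names are hypothetical.

```python
from collections import defaultdict


def writer_independent_split(samples, test_writers):
    """Test set holds every sample of the held-out writers and nothing else."""
    train = [s for s in samples if s["writer"] not in test_writers]
    test = [s for s in samples if s["writer"] in test_writers]
    return train, test


def writer_dependent_split(samples, test_every=5):
    """Each writer appears in both sets: every n-th sample goes to test."""
    per_writer = defaultdict(list)
    for s in samples:
        per_writer[s["writer"]].append(s)
    train, test = [], []
    for writer_samples in per_writer.values():
        for index, s in enumerate(writer_samples):
            if index % test_every == test_every - 1:
                test.append(s)
            else:
                train.append(s)
    return train, test


# Made-up toy data: three writers, five samples each.
samples = [
    {"writer": w, "label": lab}
    for w in ("w1", "w2", "w3")
    for lab in "abcde"
]
wi_train, wi_test = writer_independent_split(samples, {"w3"})
wd_train, wd_test = writer_dependent_split(samples)
```

In the writer-independent case the model is evaluated on writers it has never seen, which is usually the harder setting; in the writer-dependent case every writer contributes to both training and test data.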

Character-based OnHW Datasets

If you use the dataset, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Tim Hamann, Jens Barth, Bernd Bischl, and Christopher Mutschler. Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens. In International Journal on Document Analysis and Recognition (IJDAR), September 2022, doi: 10.1007/s10032-022-00415-6.

 

BibTex:

@inproceedings{ott_ijdar,

author = {Felix Ott and David Rügamer and Lucas Heublein and Tim Hamann and Jens Barth and Bernd Bischl and Christopher Mutschler},

title = {{Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens}},

booktitle = {International Journal on Document Analysis and Recognition (IJDAR)},

month = sep,

year = {2022},

doi = {10.1007/s10032-022-00415-6}

}

Dataset | Right-handed, writer-dependent | Right-handed, writer-independent | Left-handed
OnHW-symbols, split OnHW_equations | OnHW-symbols_equations_dep.zip | OnHW-symbols_equations_indep.zip | OnHW-symbols_equations_L.zip
OnHW-symbols, split OnHW_equations (CTC split) | OnHW-equations_dep_split_ctc.zip | OnHW-equations_indep_split_ctc.zip | -
OnHW-chars | See top | See top | OnHW-chars_L.zip

Readme file

Dataset images

Uncertainty-aware Evaluation of Online Handwriting Recognition

When combining right-handed and left-handed writers (an underrepresented group), out-of-distribution data with domain shifts appear. Uncertainty quantification (UQ) techniques can detect such a domain shift. We conduct a broad evaluation of aleatoric (data) and epistemic (model) UQ based on two prominent techniques for Bayesian inference, Stochastic Weight Averaging-Gaussian (SWAG) and Deep Ensembles.

The corresponding code of the paper can be downloaded here. This paper uses the right-handed OnHW-chars and left-handed OnHW-chars-L datasets (see top). Our paper is accepted for publication at the 1st Intl. Workshop on Spatio-Temporal Reasoning and Learning (STRL) collocated with the Intl. Joint Conf. on Artificial Intelligence IJCAI-ECAI 2022.
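
The split into aleatoric and epistemic uncertainty can be made concrete with one common entropy-based decomposition over the predictions of several ensemble members (or several sampled weight sets, as in SWAG). This is a generic sketch and an assumption on our part; the measures evaluated in the paper may differ, and the function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split the predictive uncertainty for a single input.

    member_probs: one class-probability list per ensemble member (or per
    sampled weight set), all of the same length.
    Returns (total, aleatoric, epistemic).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]
    total = entropy(mean_probs)  # entropy of the averaged prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n_members
    epistemic = total - aleatoric  # disagreement between the members
    return total, aleatoric, epistemic


# Members agree on an ambiguous input: all uncertainty is aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members are confident but contradict each other: all of it is epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

Under this decomposition, a domain shift such as left-handed samples fed to a model trained on right-handed writers would be expected to show up mainly as increased epistemic uncertainty.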

Citation

Andreas Klaß, Sven M. Lorenz, Martin W. Lauer-Schmaltz, David Rügamer, Bernd Bischl, Christopher Mutschler, and Felix Ott. Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift. In IJCAI-ECAI Intl. Workshop on Spatio-Temporal Reasoning and Learning (STRL), volume 3190, Vienna, Austria, July 2022.


BibTex:

@inproceedings{klass_lorenz_strl,

author = {Andreas Klaß and Sven M. Lorenz and Martin W. Lauer-Schmaltz and David Rügamer and Bernd Bischl and Christopher Mutschler and Felix Ott},

title = {{Uncertainty-aware Evaluation of Time-Series Classification for Online Handwriting Recognition with Domain Shift}},

booktitle = {IJCAI-ECAI Intl. Workshop on Spatio-Temporal Reasoning and Learning (STRL)},

volume = {3190},

month = jul,

year = 2022,

address = {Vienna, Austria},

issn = {1613-0073}

}

Domain Adaptation for Time-Series Classification

When combining right-handed and left-handed writers (an underrepresented group), out-of-distribution data with domain shifts appear. We propose a method based on optimal transport to transform source domain features into target domain features. The corresponding code of the paper can be downloaded here.

Our paper uses the right-handed OnHW-chars and left-handed OnHW-chars-L datasets (see top). Our paper is accepted for publication at the ACM Intl. Conf. on Multimedia (ACMMM), October 2022.
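
To illustrate what transporting source features to the target domain can look like, here is a minimal sketch using entropic optimal transport (Sinkhorn iterations) followed by a barycentric mapping. The text above states only that the method is based on optimal transport; the squared Euclidean cost, the entropic regularization, the uniform weights, and all function names are generic choices made for this example, not the paper's method.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_plan(source, target, epsilon=0.1, n_iters=200):
    """Entropic OT plan between two point sets with uniform weights.

    Note: for large costs exp(-cost / epsilon) underflows to zero; a
    log-domain variant is more robust and is left out for brevity.
    """
    n, m = len(source), len(target)
    kernel = [
        [math.exp(-sq_dist(s, t) / epsilon) for t in target] for s in source
    ]
    a = [1.0 / n] * n  # source marginal
    b = [1.0 / m] * m  # target marginal
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def barycentric_map(plan, target):
    """Move each source point to the plan-weighted mean of the target points."""
    dim = len(target[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append([
            sum(row[j] * target[j][d] for j in range(len(target))) / mass
            for d in range(dim)
        ])
    return mapped


# Made-up 2-D features: the target domain is the source shifted up by one.
source = [[0.0, 0.0], [1.0, 0.0]]
target = [[0.0, 1.0], [1.0, 1.0]]
plan = sinkhorn_plan(source, target)
mapped = barycentric_map(plan, target)
```

In this toy case each source point lands almost exactly on its shifted counterpart, so a classifier trained on target-domain features could be applied to the mapped source features.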

Citation

If you use our method, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, and Christopher Mutschler. Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift. In Proc. of the ACM Intl. Conf. on Multimedia (ACMMM), pages 5934-5943, October 2022, Lisboa, Portugal, doi: 10.1145/3503161.3548167.


BibTex:


@inproceedings{ott_acmmm,

author = {Felix Ott and David Rügamer and Lucas Heublein and Bernd Bischl and Christopher Mutschler},

title = {{Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift}},

booktitle = {Proc. of the ACM Intl. Conf. on Multimedia (ACMMM)},

pages = {5934--5943},

month = oct,

year = 2022,

address = {Lisboa, Portugal},

doi = {10.1145/3503161.3548167}

}

Representation Learning for Tablet and Paper Domain Adaptation

To bridge applications for writing on paper with sensor-enhanced pens and applications for writing on touch screens with a stylus, we integrated a Wacom EMR tip into the sensor-enhanced pen and collected data for writing on a tablet. Because a domain shift appears between the sensor data for writing on a tablet and on paper, we propose a representation learning technique to align the features of both domains. The paper is accepted for publication at the IAPR Intl. Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction (MPRSS).

Citation

If you use our method, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, and Christopher Mutschler. Representation Learning for Tablet and Paper Domain Adaptation in Favor of Online Handwriting Recognition. In IAPR Intl. Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction (MPRSS), August 2022, Montreal, Canada.

 

BibTex:

@inproceedings{ott_mprss,

author = {Felix Ott and David Rügamer and Lucas Heublein and Bernd Bischl and Christopher Mutschler},

title = {{Representation Learning for Tablet and Paper Domain Adaptation in Favor of Online Handwriting Recognition}},

booktitle = {IAPR Intl. Workshop on Multimodal Pattern Recognition of Social Signals in Human Computer Interaction (MPRSS)},

month = aug,

year = {2022},

address = {Montreal, Canada}

}

Cross-Modal Representation Learning with Triplet Loss Functions

To enhance the time-series classification task for online handwriting recognition (HWR), we augment the embeddings with image data for offline handwriting recognition. Our method first generates arbitrarily many image samples with ScrabbleGAN, trains an offline HWR architecture, and learns a common representation between the offline and online HWR architectures using pairwise and triplet learning techniques.
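
The triplet learning idea mentioned above can be sketched with the standard margin-based triplet loss. This is a generic formulation under stated assumptions: the Euclidean distance, the margin value, and the choice of which embedding serves as anchor, positive, and negative are illustrative and may differ from the loss variants used in the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss for one triplet.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(
        0.0,
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
    )


# Made-up embeddings: the anchor could be an online (sensor) embedding, the
# positive an offline (image) embedding of the same label, and the negative
# an offline embedding of a different label.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
far_negative = [3.0, 4.0]
near_negative = [0.0, 1.5]

loss_satisfied = triplet_loss(anchor, positive, far_negative)
loss_violated = triplet_loss(anchor, positive, near_negative)
```

Minimizing this loss pulls embeddings of the same label from both modalities together while pushing embeddings of different labels apart by at least the margin.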

Citation

If you use our method, please cite:

Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, and Christopher Mutschler. Cross-Modal Common Representation Learning with Triplet Loss Functions. In arXiv preprint arXiv:2202.07901, February 2022.

 

BibTex:

@inproceedings{ott_cmr,

author = {Felix Ott and David Rügamer and Lucas Heublein and Bernd Bischl and Christopher Mutschler},

title = {{Cross-Modal Common Representation Learning with Triplet Loss Functions}},

booktitle = {arXiv preprint arXiv:2202.07901},

month = feb,

year = {2022},

doi = {10.48550/arXiv.2202.07901}

}