Masking speech contents by random splicing: Is emotional expression preserved?

Felix Burkhardt (audEERING GmbH); Anna Derington (audEERING GmbH); Matthias Kahlau (audEERING GmbH); Klaus Scherer (University of Geneva); Florian Eyben (audEERING); Bjoern Schuller (audEERING)

06 Jun 2023

We discuss the influence of random splicing on the perception of emotional expression in speech signals. Random splicing is the randomized reconstruction of short audio snippets with the aim of obfuscating the speech content. A part of the German parliament recordings has been randomly spliced, and both versions (the original and the scrambled one) have been manually labeled with respect to the arousal, valence, and dominance dimensions. Additionally, we run a state-of-the-art transformer-based pre-trained emotion model on the data. For both the annotations and the predictions of the emotional dimensions, we find a sufficiently high correlation between the two sample versions to be confident that machine learners can be trained with randomly spliced data.
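The splicing idea described in the abstract can be illustrated with a minimal sketch. It assumes fixed-length segments and a plain shuffle; the abstract does not state the segment length, how cut points are chosen, or whether segment boundaries are smoothed, so the function name `random_splice` and its parameters are hypothetical and not the authors' implementation.

```python
import random
from typing import List, Optional, Sequence


def random_splice(
    samples: Sequence[float],
    segment_len: int,
    seed: Optional[int] = None,
) -> List[float]:
    """Cut `samples` into consecutive segments and reassemble them in random order.

    Illustrative only: fixed-length segments are an assumption, not a detail
    taken from the paper.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    # Consecutive, non-overlapping segments; the last one may be shorter.
    segments = [
        list(samples[start:start + segment_len])
        for start in range(0, len(samples), segment_len)
    ]
    # A seeded generator makes the scrambling reproducible.
    rng = random.Random(seed)
    rng.shuffle(segments)
    # Concatenate the shuffled segments back into one signal.
    return [value for segment in segments for value in segment]
```

As a usage illustration, for audio sampled at 16 kHz a call such as `random_splice(samples, segment_len=4000, seed=1)` would shuffle 250 ms segments. Both values are arbitrary examples. The output contains the same samples as the input, only in a different order, so the overall duration is unchanged.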

Tags:

Speech emotion detection and analysis

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
