S. Tyagi, S. Szénási: Revolutionizing Speech Emotion Recognition: A Novel Hilbert Curve Approach for Two-Dimensional Representation and Convolutional Neural Network Classification. In Advances in Service and Industrial Robotics (Mechanisms and Machine Science). Cham, CH : Springer, pp. 75–85, 2024. ISSN 2211-0984, ISBN 978-3-031-59256-0

Abstract: Emotions are integral to human existence, influencing psychological well-being and permeating various aspects of daily life. Speech emotion recognition (SER) stands as a pivotal branch of emotion detection, focusing on decoding the acoustic nuances embedded in speech signals. This study delves into the landscape of SER, addressing challenges related to feature extraction and classifier development. Inspired by the Hilbert curve, a novel approach is proposed, converting one-dimensional time series data into informative two-dimensional images. A convolutional neural network extracts features from these images, and a fully connected network processes these features for sentiment classification. The study comprehensively evaluates this method across four diverse datasets, namely RAVDESS, TESS, SAVEE, and EmoDB. The proposed algorithm demonstrates promising results, showcasing potential advantages in emotion recognition tasks. Comparative analyses with existing methodologies, including Gramian Angular Fields (GAF) and CyTex, affirm the feasibility and effectiveness of the proposed algorithm. The study contributes to advancing sentiment recognition by transforming time-series data into two-dimensional images, thereby opening new avenues in speech emotion recognition with improved accuracy and performance. The paper outlines the algorithms employed, details the methodology, presents experimental results, and concludes with reflections on findings and potential future directions.
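
The conversion of a one-dimensional signal into a two-dimensional image along a Hilbert curve can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the image size, padding, or normalization, so the function names (hilbert_d2xy, signal_to_hilbert_image), the zero padding, and the order parameter below are all assumptions chosen for illustration. The index-to-coordinate mapping is the standard iterative Hilbert curve construction, which places consecutive samples in neighboring pixels and so preserves temporal locality in the image.

```python
def hilbert_d2xy(side, d):
    """Map position d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two; d ranges over 0 .. side*side - 1.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def signal_to_hilbert_image(samples, order):
    """Place a 1-D sequence onto a (2**order) x (2**order) grid.

    Assumption for this sketch: signals longer than the grid are
    truncated, shorter ones are zero-padded.
    """
    side = 2 ** order
    capacity = side * side
    padded = list(samples[:capacity]) + [0.0] * max(0, capacity - len(samples))
    image = [[0.0] * side for _ in range(side)]
    for d, value in enumerate(padded):
        x, y = hilbert_d2xy(side, d)
        image[y][x] = value  # row index is y, column index is x
    return image


# Example: 16 samples fill a 4 x 4 image (order 2).
image = signal_to_hilbert_image(list(range(16)), 2)
```

In a pipeline like the one the abstract describes, an image produced this way would then be passed to a convolutional network; that stage is omitted here because the abstract gives no architectural details.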