EMOTION RECOGNITION FROM SPEECH USING DEEP LEARNING TECHNIQUES
Ms. Ritu Vijay Bhalerao, Ms. Kaveri Santosh Ahire
Data Science Department, Dr. D.Y Patil Arts, Commerce, Science College, Pimpri, Maharashtra
Abstract
Emotions are a natural component of human language and contribute significantly to communication. They are expressed through tone, pitch, and rhythm, which let people convey feelings beyond the words themselves. Speech Emotion Recognition (SER) is the task of automatically recognizing emotions such as happiness, sadness, anger, fear, or neutrality from voice signals.
Earlier approaches paired hand-crafted features with classical classifiers, but their accuracy was limited. With the development of deep learning, models such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks have performed better by learning features end-to-end from audio. These methods can capture both the sound patterns and the temporal sequence of speech.
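As an illustration of how a CNN and an LSTM can be combined for this task, the following is a minimal sketch in PyTorch. It is not the authors' architecture: the class name `CnnLstmSer`, the layer sizes, the use of 40 frame-level features as input, and the five output classes (taken from the emotions listed above) are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class CnnLstmSer(nn.Module):
    """Hypothetical CNN + LSTM classifier; all sizes are illustrative assumptions."""

    def __init__(self, n_features=40, n_classes=5, hidden=64):
        super().__init__()
        # 1-D convolution over time picks up local sound patterns.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # The LSTM models how those patterns evolve across frames.
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, n_features, n_frames)
        h = self.conv(x)                    # (batch, 64, n_frames // 2)
        h = h.transpose(1, 2)               # (batch, n_frames // 2, 64)
        _, (last_hidden, _) = self.lstm(h)  # last_hidden: (1, batch, hidden)
        return self.head(last_hidden[-1])   # (batch, n_classes) logits


model = CnnLstmSer()
dummy_batch = torch.randn(4, 40, 100)  # 4 random "utterances", 100 frames each
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 5])
```

The convolution sees only a short window of frames at a time, while the LSTM's final hidden state summarizes the whole utterance before classification.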
SER has numerous real-world applications in fields such as virtual assistants, medicine, education, and customer care. Nonetheless, challenges remain, including background noise, speaker variability, and overlapping emotions. This research applies deep learning models to features such as MFCCs, Chroma, and Spectral Contrast to improve the accuracy and reliability of emotion detection from speech.
Keywords: Speech Emotion Recognition (SER), Deep Learning, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Feature Extraction, Mel-Frequency Cepstral Coefficients (MFCCs), Chroma Features, Spectral Contrast, Human–Computer Interaction (HCI), Emotion Classification
Journal Name : EPRA International Journal of Multidisciplinary Research (IJMR)
Published on : 2025-10-08
Vol : 11
Issue : 10
Month : October
Year : 2025