TY - GEN
T1 - Deep Learning for the Detection of Emotion in Human Speech
T2 - 32nd Wireless and Optical Communications Conference, WOCC 2023
AU - Wurst, Alexander
AU - Hopwood, Michael
AU - Wu, Sifan
AU - Li, Fei
AU - Yao, Yu Dong
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Identification of emotion types is important in the diagnosis and treatment of certain mental illnesses. This study uses audio data and deep learning methods such as convolutional neural networks (CNN) and long short-term memory (LSTM) to classify the emotion of human speech. We use the IEMOCAP and DEMoS datasets, consisting of English and Italian audio speech data, respectively, in our experiments to classify speech into one of up to four emotions: angry, happy, neutral, and sad. The classification performance results demonstrate the effectiveness of the deep learning methods, and our experiments yield classification accuracies between 62 and 92 percent. We specifically investigate the impact of the audio sample duration on the classification accuracy. In addition, we examine and compare the classification accuracy for English versus Italian languages.
AB - Identification of emotion types is important in the diagnosis and treatment of certain mental illnesses. This study uses audio data and deep learning methods such as convolutional neural networks (CNN) and long short-term memory (LSTM) to classify the emotion of human speech. We use the IEMOCAP and DEMoS datasets, consisting of English and Italian audio speech data, respectively, in our experiments to classify speech into one of up to four emotions: angry, happy, neutral, and sad. The classification performance results demonstrate the effectiveness of the deep learning methods, and our experiments yield classification accuracies between 62 and 92 percent. We specifically investigate the impact of the audio sample duration on the classification accuracy. In addition, we examine and compare the classification accuracy for English versus Italian languages.
KW - convolutional neural network (CNN)
KW - deep learning
KW - emotion recognition
KW - long short-term memory (LSTM)
KW - spectrogram
UR - http://www.scopus.com/inward/record.url?scp=85162716669&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85162716669&partnerID=8YFLogxK
U2 - 10.1109/WOCC58016.2023.10139686
DO - 10.1109/WOCC58016.2023.10139686
M3 - Conference contribution
AN - SCOPUS:85162716669
T3 - 32nd Wireless and Optical Communications Conference, WOCC 2023
BT - 32nd Wireless and Optical Communications Conference, WOCC 2023
Y2 - 5 May 2023 through 6 May 2023
ER -