
Research And Implementation Of Music Emotion Recognition Based On Multimodal Features Fusion

Posted on: 2022-12-24    Degree: Master    Type: Thesis
Country: China    Candidate: C G Zhang    Full Text: PDF
GTID: 2505306779468724    Subject: Automation Technology
Abstract/Summary:
Digital audio technology has led to the storage of vast amounts of music in online music databases, and music emotion recognition (MER) is increasingly becoming a research hotspot in fields such as video scoring and music information retrieval (MIR). However, current techniques for building music emotion recognition models and for extracting musical features have hit bottlenecks: traditional classification models are ineffective at extracting the deep features of music, and the generalizability of the various types of musical features is limited, so they cannot adapt to diverse datasets. With the continuous development of deep learning, it has gradually been introduced into music emotion recognition, and breakthroughs in emotion recognition techniques based on distinct emotion spaces have been made at home and abroad. The purpose of this thesis is to conduct an in-depth analysis and comparison of previous methods for fusing multimodal music emotion features and for combining deep learning with music emotion recognition in two emotion spaces (continuous emotion and discrete emotion). The major innovations and work are as follows:

(1) For the continuous emotion space, this thesis used regression to predict emotion and proposed the CLDNN_BILSTM model to address the shortcomings of previous models. An enhanced CLDNN serves as the filtering channel for the features: MFCC and GTF features are fed into two CLDNN filters with the same architecture, and their outputs are weighted and fused. A bidirectional long short-term memory network (BILSTM) then extracts forward and backward temporal information from the music to obtain regression predictions of the music's Valence and Arousal values. Numerous feature-combination and model-comparison experiments demonstrate that the proposed method improves the
accuracy of Valence and Arousal value prediction.

(2) To use the chord content and timing of music for classification and recognition in a discrete emotion space, this thesis took inspiration from Word2Vec and proposed Chord2Vec together with the BILSTM_BLS model. The music is segmented into chord information based on fixed beats, and the chord vectors obtained by training on this chord information serve as the music's text features. In addition, residual phase (RP) and MFCC features are combined to generate the MF_RP feature. The MF_RP and GTF features are weighted with the chord vectors and, after passing through the filtering channels, are fed into the BILSTM; the emotion category is output after BLS enhancement of the BILSTM feature nodes. Numerous comparison experiments on various datasets demonstrate that combining the proposed feature-fusion method with the BILSTM_BLS model improves classification accuracy by 2.4 percent, precision by 1.7 percent, and recall by 4.6 percent.

(3) This study designed and implemented a two-layer C/S-structured Smart Music System (SMS) using PyQt5, whose UI adopts a simple design style with a diversified layout and which incorporates the BILSTM_BLS model. The system comprises three modules: retrieval, function, and management. Besides playing and switching music, users can enter keywords to retrieve music, create and manage their own song lists, and export music to meet their specific needs. The system can also recognize the emotion contained in user-imported music and provides tools for music analysis and visualization. In short, SMS satisfies users' functional requirements for music playback, song-list management, music retrieval, and emotional classification of unfamiliar music.
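The weighted fusion step in (1) — combining the outputs of the two CLDNN filter channels before the BILSTM — can be sketched as a simple element-wise weighted sum. This is a minimal illustration only: the 0.6/0.4 weights, the 4-dimensional vectors, and the function name are illustrative assumptions, not values or names taken from the thesis.

```python
# Weighted fusion of two filtered feature streams (a sketch; the
# 0.6/0.4 weights and the 4-dimensional example vectors are
# illustrative assumptions, not values from the thesis).

def weighted_fuse(mfcc_out, gtf_out, w_mfcc=0.6, w_gtf=0.4):
    """Element-wise weighted sum of two equal-length feature vectors."""
    if len(mfcc_out) != len(gtf_out):
        raise ValueError("feature vectors must have the same length")
    return [w_mfcc * m + w_gtf * g for m, g in zip(mfcc_out, gtf_out)]

# Hypothetical outputs of the two same-architecture CLDNN filter channels
mfcc_features = [0.2, 0.5, 0.1, 0.9]
gtf_features = [0.4, 0.3, 0.8, 0.1]
fused = weighted_fuse(mfcc_features, gtf_features)
print([round(x, 2) for x in fused])  # → [0.28, 0.42, 0.38, 0.58]
```

In the model described above, a vector like `fused` would then be passed to the BILSTM for Valence/Arousal regression; how the fusion weights are chosen (fixed or learned) is not specified in the abstract.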
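The Chord2Vec preprocessing in (2) segments the music's chord information on fixed beats before training chord vectors, Word2Vec-style, on the resulting tokens. The segmentation step can be illustrated with a minimal sketch; the chord symbols, the per-beat representation, and the window size of four beats are hypothetical choices for illustration, not details from the thesis.

```python
# Segment a per-beat chord sequence into fixed-beat windows, each of
# which would serve as one "word" for Word2Vec-style training
# (a sketch; chord labels and window size are illustrative assumptions).

def segment_chords(chords, beats_per_segment=4):
    """Group a per-beat chord list into fixed-beat segment tokens."""
    return [tuple(chords[i:i + beats_per_segment])
            for i in range(0, len(chords), beats_per_segment)]

per_beat = ["C", "C", "G", "G", "Am", "Am", "F", "F", "C", "G"]
tokens = segment_chords(per_beat)
print(tokens)
# → [('C', 'C', 'G', 'G'), ('Am', 'Am', 'F', 'F'), ('C', 'G')]
```

Training embeddings over such tokens (so that chord segments occurring in similar contexts get similar vectors) would then yield the chord vectors used as the music's text feature; the abstract does not specify the exact tokenization or training configuration.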
Keywords/Search Tags: Bidirectional long short-term memory network, Music emotion recognition, Chord vector, Smart music system