With the popularity and development of the Internet, the amount of digital music has grown rapidly. How to organize and manage large collections of song information effectively, and how to help people search for or retrieve music using musical content and other related information, has become an important issue in the field of music information retrieval, and music classification has become a research hotspot in recent years. For music emotion classification, the traditional approaches are classification based on music lyrics and classification based on song audio, but each has its own defects. The accuracy of lyrics-based classification falls short of expectations, and audio-based emotion classification faces an accuracy ceiling that is hard to break through, because manually extracting high-level, semantically meaningful music features from audio is very difficult, and the extracted features cannot be guaranteed to be accurate and effective. To address these problems, this paper proposes a new multi-modal emotion classification method based on audio content and lyrics. The main contents of this paper are as follows:

(1) This paper proposes an improved fusion method. On the audio side, according to the characteristics of the song audio signal, an LSTM-based song emotion classification algorithm is proposed, with manually extracted temporal features used as training data. On the lyrics side, the BERT model is applied to lyrics emotion classification, and a sentiment dictionary is introduced to balance and optimize the lyrics. By extracting different audio features and improving the multi-modal fusion method, the influence of different fusion strategies on song emotion classification is studied. The experimental results show that the new fusion method outperforms linear weighted fusion by 2.38%. (Minimal sketches of the LSTM, BERT, and baseline fusion components appear below.)

(2) Building on the advantages of convolutional neural networks in image processing, a CNN-based song emotion classification algorithm is proposed on the audio side. This algorithm learns abstract features of the audio spectrogram automatically, avoiding the tedious manual feature extraction process. At the same time, experiments are conducted with different spectrograms and different convolutional network structures, and the influence of different spectrograms on emotion classification is analyzed. The experimental results show that fusing multiple neural networks classifies emotion better than any single network, especially on the energy (arousal) dimension of Thayer's two-dimensional emotion model. (A spectrogram-CNN sketch appears below.)
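The audio branch in (1) can be pictured as a recurrent classifier over frame-level features. The following is a minimal PyTorch sketch, assuming MFCC frames as the manually extracted temporal input; the feature type, dimensions, and four-class label space are illustrative assumptions, not the paper's actual settings.

```python
# Minimal sketch of an LSTM-based song emotion classifier (PyTorch).
# Feature choice (MFCC frames) and all dimensions are illustrative.
import torch
import torch.nn as nn

class AudioEmotionLSTM(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, frames, n_features)
        _, (h, _) = self.lstm(x)     # h: (num_layers, batch, hidden)
        return self.fc(h[-1])        # logits from the final hidden state

model = AudioEmotionLSTM()
logits = model(torch.randn(8, 300, 40))  # 8 clips, 300 feature frames each
```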
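The lyrics branch applies BERT as a sequence classifier. A minimal sketch with the Hugging Face transformers library follows; the checkpoint name (bert-base-chinese), the four-class label space, and the example input are assumptions for illustration, and the sentiment-dictionary balancing step described in the abstract is not reproduced here.

```python
# Minimal sketch of BERT-based lyrics emotion classification
# (Hugging Face transformers). Checkpoint and label count are assumed.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=4)  # e.g., four emotion classes

inputs = tokenizer("example lyric line", return_tensors="pt",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits     # shape (1, num_labels)
probs = torch.softmax(logits, dim=-1)   # per-class probabilities
```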
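The improved fusion method is evaluated against a linear weighted fusion baseline. Only that baseline is sketched here, since the improved rule is not specified in this abstract; the weight value and class posteriors are illustrative assumptions.

```python
# Sketch of the linear weighted (late) fusion baseline: per-class
# probabilities from the audio and lyrics models are mixed by a scalar
# weight. alpha = 0.5 is an arbitrary illustrative choice.
import numpy as np

def linear_weighted_fusion(p_audio, p_lyrics, alpha=0.5):
    """Combine modality posteriors; alpha weights the audio model."""
    return alpha * np.asarray(p_audio) + (1 - alpha) * np.asarray(p_lyrics)

p = linear_weighted_fusion([0.1, 0.6, 0.2, 0.1], [0.3, 0.3, 0.3, 0.1])
label = int(np.argmax(p))  # fused emotion prediction
```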
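For (2), the spectrogram can be treated as a one-channel image and classified with a small CNN. The sketch below computes a mel spectrogram with librosa and passes it through a toy network; the file path, spectrogram settings, and network shape are all hypothetical stand-ins for the configurations compared in the experiments.

```python
# Minimal sketch of the spectrogram-CNN route: log-mel spectrogram in,
# emotion-class logits out. Settings are illustrative assumptions.
import numpy as np
import librosa
import torch
import torch.nn as nn

y, sr = librosa.load("song.wav", sr=22050)           # hypothetical file
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)        # log-scaled "image"

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))                  # pool to (32, 1, 1)
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):                             # x: (batch, 1, H, W)
        return self.fc(self.features(x).flatten(1))

x = torch.from_numpy(mel_db).float()[None, None]      # add batch/channel dims
logits = SpectrogramCNN()(x)
```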