With the continuous development of audio signal processing and machine learning, computers can now analyze musical emotion quantitatively, helping us better analyze, understand, store, and recommend music. Music Emotion Recognition (MER) aims to establish the connection between musical features and emotions, and to automatically identify and classify the emotional information conveyed in music. Existing MER research typically labels data and trains models against a preset, fixed emotion taxonomy, so it cannot recognize new emotion types that did not appear during training. Yet many music applications now allow users to freely assign new, non-universal, or personalized emotion labels to music, and existing MER models cannot recognize such new emotion categories from audio features alone. A key reason is that existing models map musical features onto preset, fixed emotion category labels rather than representing the overall emotion a piece of music conveys. Moreover, existing work treats MER as a classification task over a given target set and rarely investigates the quantitative correlation between emotions.

This paper proposes an Emotion Vector Recognition Model (EVRM). The model maps the overall emotion of a piece of music into an emotion vector that quantifies its emotional content. Through this emotion vector, latent emotions that never appeared during training can be identified, and the emotional distance between musical works can be computed quantitatively. The work of this paper covers the following three aspects.

(1) We construct a music emotion vector model that converts the overall emotion expressed by a piece of music into an emotion vector. The model introduces the concept of emotional anchors for the first time and extracts local key features of these anchors with a VGGNet-based convolutional neural network. A self-attention mechanism then models the temporal relations among pitch transitions, realizing the transition from anchor emotions to an emotion vector. Comparative experiments against other models show that our model recognizes anchor emotions more accurately and better measures the emotional correlation between pieces of music; an architectural sketch is given after this summary.

(2) Building on the anchor emotion vectors, the model automatically labels non-preset emotion labels beyond the anchor set. With the help of semi-supervised learning, the cost of manual labeling is reduced while the generalization ability and accuracy of the music emotion vector model are improved. By varying the labeled proportion of the dataset and comparing against similar models, we show that the emotion vector model can identify non-preset emotion types and label them automatically; a sketch of one semi-supervised scheme follows below.

(3) We implement a music emotion recommendation system on top of the music emotion vector space. Building on (1) and (2), the system realizes emotion-based music recommendation in a visual way; a distance-based retrieval sketch is also given below.
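To make contribution (1) concrete, the following is a minimal PyTorch sketch of such an anchor-based model, assuming mel-spectrogram input. The thesis does not specify layer counts or dimensions; the class name EmotionVectorModel, the number of anchors, and all hyperparameters here are illustrative assumptions, not the thesis's exact architecture.

```python
# A minimal sketch of an anchor-based emotion vector model: VGG-style
# convolutions for local features, self-attention over time, and a
# projection onto the anchor-emotion dimensions. All names and sizes
# are assumptions for illustration.
import torch
import torch.nn as nn

class EmotionVectorModel(nn.Module):
    """Maps a mel-spectrogram to an emotion vector."""

    def __init__(self, n_anchors=4, d_model=128):
        super().__init__()
        # VGG-style stack of small 3x3 convolutions extracts the local
        # key features associated with the emotional anchors.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, d_model, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse frequency, keep time
        )
        # Self-attention models temporal relations across frames.
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        # Project pooled features onto the anchor-emotion dimensions;
        # the activations form the emotion vector.
        self.head = nn.Linear(d_model, n_anchors)

    def forward(self, spec):             # spec: (batch, 1, mels, frames)
        h = self.conv(spec).squeeze(2)   # (batch, d_model, frames')
        h = h.transpose(1, 2)            # (batch, frames', d_model)
        h, _ = self.attn(h, h, h)        # temporal self-attention
        return self.head(h.mean(dim=1))  # (batch, n_anchors) emotion vector
```

Calling the model on a batch of spectrograms, e.g. `EmotionVectorModel()(torch.randn(8, 1, 128, 256))`, yields one emotion vector per track, whose components can be read as intensities along the anchor emotions.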
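For contribution (2), the thesis states only that semi-supervised learning is used; confidence-thresholded pseudo-labeling, shown below, is one common realisation and is offered as an assumption rather than the thesis's exact scheme. In the thesis, pseudo-labels would live in the emotion vector space so that categories outside the anchors can be covered; a plain class-index version is shown here for brevity. The function name, the batch arguments, and the threshold tau=0.9 are all hypothetical.

```python
# One training step combining a supervised loss on anchor-labelled data
# with a pseudo-label loss on confident unlabelled predictions.
import torch
import torch.nn.functional as F

def semi_supervised_step(model, optimizer, labeled_batch, unlabeled_batch, tau=0.9):
    x_l, y_l = labeled_batch
    x_u = unlabeled_batch

    # Supervised loss on the manually labelled portion of the dataset.
    loss = F.cross_entropy(model(x_l), y_l)

    # Pseudo-label unlabelled tracks whose prediction confidence
    # exceeds tau, and train on them as if they were labelled.
    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf > tau
    if mask.any():
        loss = loss + F.cross_entropy(model(x_u[mask]), pseudo[mask])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Varying the fraction of manually labelled data, as the experiments in (2) do, amounts to changing how much of each epoch flows through the supervised branch versus the pseudo-label branch.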
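Finally, for contribution (3), recommendation in the emotion vector space reduces to nearest-neighbour retrieval under some distance. Cosine distance is an assumption here (the thesis says only that emotional distance between works is computed quantitatively), and the names recommend, library_vecs, and track_ids are hypothetical.

```python
# A minimal sketch of emotion-based recommendation: rank tracks by
# cosine similarity of their emotion vectors to a query vector.
import numpy as np

def recommend(query_vec, library_vecs, track_ids, k=5):
    """Return the k tracks whose emotion vectors lie closest to query_vec."""
    q = query_vec / np.linalg.norm(query_vec)
    lib = library_vecs / np.linalg.norm(library_vecs, axis=1, keepdims=True)
    sims = lib @ q                   # cosine similarity to every track
    top = np.argsort(-sims)[:k]      # indices of the k most similar tracks
    return [(track_ids[i], float(sims[i])) for i in top]
```

Given the vectors produced by the model in (1), this returns the k emotionally closest tracks together with their similarity scores, which the visual recommendation interface in (3) can then display.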