| This paper addresses genre classification and automatic annotation of music based on the acoustic features of digital audio. Music genre analysis and tagging require a large number of music features, but traditional feature design depends on professional knowledge of music, and the resulting features do not generalize across different types of tasks, often incurring high labor costs. Applying deep learning reduces the dependence on manual annotation in music genre classification and auto-tagging. Working from the music audio itself, this paper uses deep learning to study music genre classification, automatic annotation of music tags, and an offline music recommendation method based on genres and tags. The experiments are divided into three parts: automatic classification of music genres, automatic annotation of music tags, and offline music recommendation based on audio content. For music genre classification, the audio is transformed into Mel spectrograms and MFCCs via the Fourier transform, and a ResNet + GRU network is used for classification. For automatic tag annotation, a ResNet + ViT structure is used for learning and labeling. For offline recommendation based on audio content, music features are modeled with the aforementioned classification and annotation methods, and the similarity between feature vectors serves as the basis for connecting the music platform with its users. The innovations of this paper are as follows. First, the music audio is transformed into three representations (Mel spectrogram, MFCC, and STFT) for genre classification and automatic tag annotation respectively, and the influence of the transformation method on both tasks is compared. Second, the gated recurrent unit and the visual attention mechanism are introduced into the global feature extraction of music audio. An improved ResNet is used to extract local features, and the ResNet + GRU and ResNet + ViT hybrid models combine global and local features, so that the acoustic features (including local and global correlations) and time-series features of the music are obtained for genre classification, tag annotation, and recommendation. Finally, the application prospects of audio similarity are briefly described: it can help overcome the cold-start problem and the excessive collection of users' private data in music recommendation systems, and it can also be used for music emotion recognition, copyright protection, and cover song recognition. |
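
The similarity-based recommendation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names (`cosine_similarity`, `recommend`), the track identifiers, and the 4-dimensional feature vectors are all hypothetical; in the paper's setting the vectors would instead come from the ResNet + GRU and ResNet + ViT models.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def recommend(
    query_id: str,
    catalog: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank every other track in the catalog by similarity to the query track."""
    query = catalog[query_id]
    scored = [
        (track_id, cosine_similarity(query, vector))
        for track_id, vector in catalog.items()
        if track_id != query_id
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors standing in for learned audio embeddings.
catalog = {
    "track_a": [0.9, 0.1, 0.0, 0.2],
    "track_b": [0.8, 0.2, 0.1, 0.3],
    "track_c": [0.0, 0.9, 0.8, 0.1],
    "track_d": [0.1, 0.8, 0.9, 0.0],
}

if __name__ == "__main__":
    for track_id, score in recommend("track_a", catalog, top_k=2):
        print(f"{track_id}: {score:.3f}")
```

Because the ranking depends only on audio-derived vectors, a newly added track can be recommended without any listening history, which is the cold-start advantage the abstract mentions.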