
Research On Music Audio Classification Based On Deep Learning

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Gao
Full Text: PDF
GTID: 2415330611467017
Subject: Software engineering

Abstract/Summary:
Music audio classification assigns category tags to music based on its content, which is of great significance for the efficient organization, retrieval, and recommendation of music resources. Traditional classification methods rely heavily on hand-crafted acoustic features: designing such features requires domain knowledge of music, and features designed for one classification task often do not transfer to another. Deep learning offers a new way to address the music classification problem, but existing deep-learning-based methods still fall short in music data processing and model design. This thesis studies deep-learning-based music audio classification in depth. The main contributions are as follows:

First, the music audio signal is converted into a spectrogram as a unified representation, which avoids manual feature selection. Because labeling music is difficult and the limited labeled data hinders the training of deep models, this thesis applies several data augmentation methods tailored to the characteristics of music signals.

Second, research on deep-learning-based music audio classification usually uses convolution to extract spectrogram features, but most existing studies do not design the convolutional structure around the characteristics of the spectrogram. This thesis proposes a new convolutional neural network model that combines 1-D convolution, a gating mechanism, residual connections, and an attention mechanism, so that it extracts spectrogram features more relevant to the music category. The model reaches 91.8% accuracy on the GTZAN music genre dataset, verifying the effectiveness of the method.

Third, classification models based solely on convolutional neural networks ignore the temporal characteristics of audio. This thesis combines the proposed convolutional structure with a bidirectional recurrent neural network to design a new convolutional recurrent neural network model, and uses an attention mechanism to assign different weights to the recurrent network's outputs at different time steps, yielding a better representation of the overall characteristics of the music. The model raises classification accuracy on GTZAN to 92.2% and reaches an AUC of 0.9122 on the multi-label MagnaTagATune dataset, surpassing the other comparison methods. An analysis of individual labels shows that the method tags most music genre labels well and also performs well on some instrument, vocal, and emotion labels.

Finally, building on the proposed classification methods, this thesis designs and implements an audio-based music tagging system that labels music by genre, emotion, and scene, providing data support for constructing a knowledge graph in the music domain.
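The pipeline the abstract describes, a gated, residual 1-D convolution front end, a bidirectional recurrent layer, and attention-weighted pooling over time steps, can be sketched as follows. This is a minimal NumPy illustration of the mechanism only; all shapes and weights, and the plain tanh RNN standing in for whatever recurrent cell the thesis actually uses, are illustrative assumptions rather than the thesis's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    """Same-padded 1-D convolution along time. x: (T, Cin), w: (K, Cin, Cout)."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0])])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_residual_block(x, w_f, w_g):
    """Gating: a tanh branch modulated by a sigmoid gate, plus a residual skip."""
    return x + np.tanh(conv1d(x, w_f)) * sigmoid(conv1d(x, w_g))

def rnn_pass(x, w_in, w_rec):
    """Simple tanh RNN over the time axis; returns the hidden state at every step."""
    h = np.zeros(w_rec.shape[0])
    outs = []
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ w_in + h @ w_rec)
        outs.append(h)
    return np.array(outs)

def attention_pool(h, w_a):
    """Score each time step, softmax the scores, return the weighted sum."""
    s = h @ w_a
    a = np.exp(s - s.max())
    a /= a.sum()
    return a @ h, a

T, C, H = 64, 16, 8                      # frames, conv channels, RNN hidden size
x = rng.standard_normal((T, C))          # stand-in for a log-mel spectrogram slice
w_f, w_g = (rng.standard_normal((3, C, C)) * 0.1 for _ in range(2))
w_in_f, w_in_b = (rng.standard_normal((C, H)) * 0.1 for _ in range(2))
w_rec_f, w_rec_b = (rng.standard_normal((H, H)) * 0.1 for _ in range(2))
w_a = rng.standard_normal(2 * H)

feat = gated_residual_block(x, w_f, w_g)               # convolutional front end
fwd = rnn_pass(feat, w_in_f, w_rec_f)                  # forward direction
bwd = rnn_pass(feat[::-1], w_in_b, w_rec_b)[::-1]      # backward direction
clip, weights = attention_pool(np.concatenate([fwd, bwd], axis=1), w_a)
print(clip.shape)  # (16,): one attention-pooled embedding per clip
```

In a trained model these weights would be learned end to end, and the pooled clip vector would feed a classification layer over the genre or tag labels.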
Keywords/Search Tags:music classification, deep learning, convolutional neural network, recurrent neural network, attention model