
Research On Music Genre Classification Based On Multi-Input Deep Learning Model

Posted on: 2023-09-14  Degree: Master  Type: Thesis
Country: China  Candidate: C H Wang  Full Text: PDF
GTID: 2555306851456334  Subject: Computer application technology
Abstract/Summary:
Music genre classification assigns category labels to music based on its content. In recent years, as more and more genres have emerged around the world, music genre classification has become a popular research topic. Methods based on traditional machine learning require additional feature engineering grounded in music domain knowledge, whereas deep learning can feed labeled data directly to a neural network without developing a new feature extractor for each problem. In addition, the interpretability of deep neural networks remains one of the main challenges in machine learning. Although deep learning techniques have been applied successfully in many fields, such as image classification, researchers remain concerned that many current deep learning models cannot be reasonably explained, which hinders real trust between people and models. This thesis studies the application of deep learning and pixel attribution interpretation to the task of music genre classification. The main work is as follows:

First, research on the interpretability of deep neural networks currently focuses mainly on traditional image classification, with little involvement in the audio domain. To better understand the decision-making of multi-input deep learning models in music genre classification, we validate the superiority of the parallel convolutional architecture using gradient-weighted class activation mapping (Grad-CAM), a pixel attribution method. We also propose combining Grad-CAM with other pixel attribution interpretation methods to verify the importance of texture information in the Mel spectrogram to the model's final decision.

Second, the traditional multi-input deep learning model is divided into modules that extract features from each type of input independently; the final learned features are then connected and sent to the classifier. However, this approach severs the internal connection between different types of features, and so cannot provide learned features with more discriminative information for classification. To create internal connections between different types of features, this thesis connects the different types of learned features at a middle layer of the model and feeds them into one of the branches of the multi-input model for further training. On this basis, a middle-level learned-feature interaction method for multi-input deep learning models is proposed, with three interaction modes defined according to the different input branches. Experimental results show that the designed method significantly improves the accuracy of music genre classification: the classification accuracy on the GTZAN and FMA-Small datasets reaches 93.92% and 76.80%, respectively, which is better than most current methods.
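
As a rough illustration of the Grad-CAM step mentioned above, the sketch below computes a heatmap from the feature maps of one convolutional layer and the gradients of a class score with respect to those maps. It follows the general Grad-CAM recipe (average the gradients per channel, take the weighted sum of the feature maps, apply ReLU) and is not taken from the thesis. The function name `grad_cam` and the toy inputs are hypothetical; in practice the activations and gradients would come from a deep learning framework's automatic differentiation, and the two spatial axes would correspond to the frequency and time axes of the Mel spectrogram.

```python
def grad_cam(activations, gradients):
    """Compute a normalised Grad-CAM heatmap.

    activations: list of K feature maps, each an H x W nested list.
    gradients:   gradients of the class score w.r.t. each feature map,
                 same shape as activations (assumed to be precomputed).
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of the gradients over each map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU so that only
    # regions with a positive influence on the class score remain.
    heatmap = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j]
                         for k in range(num_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; skip if the map is all zeros.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Toy example: two 2 x 2 feature maps with made-up gradients.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # channel 0 supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1 opposes the class
]
heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

In the toy example, the positions where the opposing channel dominates are zeroed by the ReLU, so the heatmap highlights only the regions that push the score toward the target class.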
Keywords/Search Tags:Music genre classification, Feature interaction, Neural networks, Pixel attribution, Deep learning, Interpretability