
Research On Music Emotion Classification Based On ViT-XGBoost

Posted on: 2024-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 2568307067997799
Subject: Library and Information Science
Abstract/Summary:
With the development of information technology, the scale of music data keeps growing, and music emotion classification plays an important role in music retrieval, music recommendation, music therapy, and other fields. Traditional music emotion classification first extracts low-level music features manually, normalizes them, and then feeds them into machine learning classifiers for emotion classification, which entails a heavy workload and limited model accuracy. With the development of deep learning, technologies such as convolutional neural networks and recurrent neural networks have gradually been applied to music emotion classification, allowing researchers to extract deep features directly from music spectrograms. Research in recent years shows that, compared with low-level features, music emotion classification based on deep learning and spectrograms not only improves accuracy to a certain extent but also greatly reduces the cost of manual feature extraction. However, such models still generalize poorly: a model trained on one dataset often performs badly on another. Model accuracy has also encountered a bottleneck, which limits progress in the field of music emotion classification.

This thesis aims to propose a music emotion classification model with high accuracy and strong generalization ability. First, a dataset is constructed containing real-world music of different genres and different vocal categories (vocal music and pure music). Then, with the XGBoost classifier fixed, three classification experiments are carried out on this dataset: one based on low-level features, one based on a CNN, and one based on ViT. Comparing their classification accuracies verifies both the advantage of deep learning and the superiority of the ViT model at image feature extraction. Ablation experiments are subsequently performed to verify the effectiveness of XGBoost within the ViT-XGBoost architecture. Finally, the hyperparameters of the XGBoost classifier are tuned on the music features extracted by the improved ViT model, which further improves performance. Through this series of experiments, a music emotion classification model based on ViT-XGBoost is proposed, in which ViT serves as the feature extractor and XGBoost as the classifier. The final model reaches 87.5% accuracy on the test set, outperforming the other models compared in this thesis.
Keywords/Search Tags: Musical emotion recognition, Convolutional neural networks, ViT, XGBoost