
Research On Music Automatic Annotation Based On Deep Neural Network

Posted on: 2019-06-14; Degree: Master; Type: Thesis
Country: China; Candidate: N Han; Full Text: PDF
GTID: 2348330542998851; Subject: Information and Communication Engineering
Abstract/Summary:
The rapid development of the digital music market has produced a vast amount of digital music, which makes music labels, as a form of structured information organization, increasingly important. Music tagging generally means generating labels that describe the semantics of a piece of music, enabling fast retrieval, efficient management, and personalized recommendation over large music collections. Widely used methods such as manual annotation and social annotation now face cost and quality problems, and one of the most effective ways to address them is to improve automatic annotation systems. Content-based automatic music annotation is therefore an important research topic. However, traditional automatic annotation algorithms still have many unsolved problems: hand-crafted feature design is sub-optimal and unsustainable, the power of shallow architectures is fundamentally limited, and short-time analysis cannot encode musically meaningful structure. Deep learning algorithms have recently attracted wide academic attention and made great progress in the image and speech fields, which suggests that they have great potential for the automatic music annotation task.

This thesis focuses mainly on Chinese music and implements convolutional neural networks and recurrent neural networks, as representative deep learning algorithms, to obtain high-level information by effectively capturing the time-related features of music. Because complete Chinese music annotation datasets for music information retrieval are rare, we also present two datasets containing audio, lyrics, and annotations, which allow us to carry out more experiments.

First, we propose an automatic annotation model based on a convolutional neural network that takes lyric character embeddings as input. We run several experiments to explore the effects of different input representations, network structures, and hyper-parameters; the experiments also verify the superior performance of the model. Next, the Mel-spectrogram of the audio signal is used as the input. We propose an automatic music annotation model based on a convolutional neural network and validate its effect on multiple datasets. We also propose a fusion network model that combines a convolutional neural network with a recurrent neural network. The characteristics of the two network structures are combined to extract deep-level representations and reconstructed sequences from the audio signal, which effectively improves automatic labeling performance.

Finally, building on the two preceding pieces of work, we propose a multimodal automatic annotation model based on deep neural networks. The audio signal and the lyric text of a song are used together: convolutional neural networks extract deep-level features from the audio and from the lyrics respectively, and the two types of deep-level features are then combined to train the model for the annotation task. Experiments show that the multimodal model outperforms models that take only audio signals or only lyric text as input.
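
The multimodal step described above, in which deep features from the audio and from the lyrics are combined before labels are predicted, can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the abstract does not say how the two feature types are combined, so concatenation is an assumption, as are the single dense output layer, the sigmoid outputs with a 0.5 threshold, and every name, tag, and number in the code.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fuse(audio_feat, lyric_feat):
    """Combine two feature vectors by concatenation (assumed fusion rule)."""
    return list(audio_feat) + list(lyric_feat)


def tag_probabilities(fused, weights, biases):
    """One dense layer with an independent sigmoid per tag (multi-label)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
        for row, b in zip(weights, biases)
    ]


def predict_tags(probs, tag_names, threshold=0.5):
    """Keep every tag whose probability reaches the threshold."""
    return [name for name, p in zip(tag_names, probs) if p >= threshold]


if __name__ == "__main__":
    # Hypothetical stand-ins for the outputs of the audio CNN and lyric CNN.
    audio_feat = [0.8, -0.3]
    lyric_feat = [0.1, 0.9, -0.5]
    fused = fuse(audio_feat, lyric_feat)

    # Hypothetical tags and hand-picked weights; a real model learns these.
    tags = ["ballad", "rock", "sad"]
    weights = [
        [0.5, 0.0, 0.0, 1.0, 0.0],
        [-1.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, -1.0],
    ]
    biases = [0.0, 0.0, -0.5]

    probs = tag_probabilities(fused, weights, biases)
    print(predict_tags(probs, tags))
```

Each tag gets its own sigmoid rather than sharing a softmax because a song can carry several labels at once, which is the usual framing for music tagging.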
Keywords/Search Tags:music automatic annotation, convolutional neural network, recurrent neural network, deep fusion network, multimodal approach