
Research On Automatic Vocal Transcription Of Chinese Popular Music

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Zheng
GTID: 2415330623969210
Subject: Computer technology
Abstract/Summary:
Vocal transcription, an important branch of music transcription, has gradually attracted the attention of researchers in music computing in recent years. However, because vocal pronunciation is unstable and no large music dataset with high-accuracy vocal onset annotations is available, onset detection (one of the critical steps in transcription) is much more difficult for vocals in popular music than for musical instruments. Vocal transcription is therefore limited and cannot be applied effectively in practice. In view of this, we study automatic vocal transcription of Chinese popular music. The main research contents and results are as follows:

1) We propose a sentence segmentation algorithm based on voice activity detection. The system uses this algorithm to cut the music into sentences so that each segment meets the length requirement of the onset detection model and no word is cut apart.

2) We train a high-precision vocal onset detection model on a speech dataset to accurately identify the note onset times required for the transcription task. The U-Net network is introduced into the vocal onset detection task for the first time. An input format optimization strategy converts the single-channel input spectrogram into a multi-channel spectrogram, and the extreme imbalance of the sequence data is resolved through positive sequence radiation and Dice loss.

3) We construct and open-source a small Chinese pop music test dataset with vocal onset annotations, and transfer the model trained on the speech dataset to real music for testing. We also propose a reliability filtering layer and a breath sound filtering layer to improve recognition after the transfer, so that the model achieves good onset recognition in real music.

4) We propose a note-block pitch selection algorithm based on nearest-neighbor name matching. After optimizing the Harvest pitch recognition algorithm to increase its running speed tenfold, we use the pitch selection algorithm to calculate the representative pitch of each note.

In summary, we propose a complete vocal transcription solution that identifies onset times and pitch information automatically and generates a MIDI file containing an accurate score without the help of lyrics.
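To illustrate why Dice loss suits the imbalance described in item 2, the following is a minimal sketch and not the thesis implementation. It assumes that "positive sequence radiation" means marking frames near each annotated onset as positive; this reading, the helper names `widen_onsets` and `dice_loss`, and the `radius` and `eps` parameters are all hypothetical. The Dice loss itself follows the standard soft form, 1 - 2·Σ(p·t) / (Σp + Σt).

```python
def widen_onsets(labels, radius):
    """Mark every frame within `radius` frames of an onset as positive.

    Assumed interpretation of "positive sequence radiation"; the thesis
    may define it differently.
    """
    widened = [0] * len(labels)
    for i, value in enumerate(labels):
        if value:
            lo = max(0, i - radius)
            hi = min(len(labels), i + radius + 1)
            for j in range(lo, hi):
                widened[j] = 1
    return widened


def dice_loss(probs, targets, eps=1e-7):
    """Soft Dice loss: 1 - 2*sum(p*t) / (sum(p) + sum(t))."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# One onset in ten frames: positives are rare, as in real onset sequences.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
targets = widen_onsets(labels, radius=1)

# A model that predicts "no onset" everywhere is right on most frames,
# yet Dice loss still penalizes it heavily because the overlap is zero.
all_negative = [0.0] * len(targets)
perfect = [float(t) for t in targets]

print("widened targets:", targets)
print("loss, all-negative prediction:", dice_loss(all_negative, targets))
print("loss, perfect prediction:", dice_loss(perfect, targets))
```

Because the loss depends only on the overlap between predicted and true positives, the many negative frames do not dilute it, which is the usual motivation for choosing Dice loss over a per-frame average on sparse labels.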
Keywords/Search Tags: Automatic Music Transcription, Vocal Onset Detection, U-Net, Pitch Recognition, Chinese Pop Music