
Research on an Automatic Singing Transcription System: From Singing Signal to MIDI Files

Posted on: 2021-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: M T Lin
GTID: 2505306017973709
Subject: Computer Science and Technology

Abstract/Summary:
In computer music research, automatic music notation systems aim to extract score information from a piece of music audio. In particular, when the input is a singing signal, the system should output a symbolic score that matches the hummed melody as closely as possible. We call this an automatic singing transcription system (or automatic singing score recording system). The system outputs a sequence of musical notes as its symbolic score, and it is designed to help musicians create and record musical motifs, or to serve as the note pattern recognition component of a music information retrieval system. Its main function is to recognize a hummed melody and convert it into a MIDI data structure that can be saved or played back. Improvised humming without lyrics is the more common kind of humming signal, so, unlike speech recognition, the system cannot rely on phonemes or other basic units to form a symbolic sequence during recognition.

A singing transcription system consists primarily of note event detection and note pitch estimation. Note pitch estimation depends on the frequency of the framed signal, and several mature techniques are available for it. There is as yet no convincing method for detecting note events, however, because human humming does not necessarily change in a stable and clear way, which makes it difficult to describe note onsets with a single context or grammar. Note onset detection can itself be divided into two parts: a detection function, and a peak-picking algorithm that takes the times of the detection function's peaks as the note onset times. Unfortunately, human voice signals are often noisy, which leads to false detections. If a note onset is treated as a positive sample, the automatic singing transcription system must detect as many true positives as possible while eliminating false positives.

Unlike conventional deep learning approaches, which use deep learning to construct the detection function, this dissertation holds that the core of the problem lies not in the detection function but in the peak-picking algorithm, and it introduces deep learning in the post-processing step to eliminate false detections. On a public dataset, this method is simpler than existing algorithms and outperforms them. Because of the state-of-the-art note onset detection result, note recognition in the singing transcription system also improves, and the correct note detection rate is substantially higher than that of other traditional transcription systems.
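
The two stages named in the abstract, peak picking on an onset detection function and mapping estimated pitch to MIDI notes, can be illustrated with a minimal sketch. It is not the dissertation's method: a fixed threshold and a minimum-gap rule stand in for the deep-learning post-processing step, and the function names (`pick_peaks`, `hz_to_midi`, `frames_to_notes`), the parameters, and the toy data are all assumptions made for illustration. The only standard fact used is the MIDI convention that A4 (440 Hz) is note number 69.

```python
import math


def hz_to_midi(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def pick_peaks(odf, threshold, min_gap):
    """Return frame indices of local maxima of an onset detection function.

    A peak must reach `threshold`. When two peaks are closer than `min_gap`
    frames, only the stronger one is kept. This simple rule is a stand-in
    for a learned false-positive filter (assumption, not the thesis method).
    """
    peaks = []
    for i in range(1, len(odf) - 1):
        is_local_max = odf[i] > odf[i - 1] and odf[i] >= odf[i + 1]
        if not is_local_max or odf[i] < threshold:
            continue
        if peaks and i - peaks[-1] < min_gap:
            if odf[i] > odf[peaks[-1]]:
                peaks[-1] = i  # replace the weaker neighbouring peak
        else:
            peaks.append(i)
    return peaks


def frames_to_notes(onset_frames, frame_pitches_hz, hop_seconds):
    """Turn onset frames and per-frame pitch estimates into note events.

    Each note spans from one onset to the next (or to the end of the signal).
    Its pitch is the median of the voiced frames in that span; frames with
    pitch 0 are treated as unvoiced. Returns (onset_s, offset_s, midi_note).
    """
    notes = []
    bounds = list(onset_frames) + [len(frame_pitches_hz)]
    for start, end in zip(bounds[:-1], bounds[1:]):
        voiced = sorted(f for f in frame_pitches_hz[start:end] if f > 0)
        if not voiced:
            continue  # no pitched frames, so no note is emitted
        median_hz = voiced[len(voiced) // 2]
        notes.append((start * hop_seconds, end * hop_seconds, hz_to_midi(median_hz)))
    return notes


# Toy data (hypothetical): a detection function and pitch track over 10 frames.
odf = [0.0, 0.1, 0.9, 0.2, 0.1, 0.3, 0.1, 0.8, 0.2, 0.0]
pitches_hz = [0, 0, 262, 261, 262, 263, 0, 330, 329, 331]

onsets = pick_peaks(odf, threshold=0.5, min_gap=3)
notes = frames_to_notes(onsets, pitches_hz, hop_seconds=0.01)
for onset_s, offset_s, midi_note in notes:
    print(f"{onset_s:.2f}s to {offset_s:.2f}s: MIDI note {midi_note}")
```

The resulting list of `(onset, offset, MIDI note)` tuples is the kind of note sequence that could then be written to a MIDI file with a MIDI library of choice.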
Keywords/Search Tags:Automatic Singing Transcription, Note Onset Detection, Deep Learning