Research On Piano Transcription Algorithm Based On Music Characteristics

Posted on: 2024-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Shan
Full Text: PDF
GTID: 2545306932456154
Subject: Information and Communication Engineering
Abstract/Summary:
Music transcription converts music audio into the corresponding notes and is the musical counterpart of speech recognition. Automatic music transcription matters for music education and popularization, derivative music creation, music retrieval, and copyright identification, and it can greatly reduce the difficulty of such work. Music played on keyboard instruments such as the piano is a suitable entry point for music transcription because of its fixed pitch, stable annotation, and ease of recording. Current piano transcription algorithms perform well and already have practical applications, but perceptible errors remain in actual recognition results. Moreover, most existing music transcription algorithms are adapted from speech recognition, and the characteristics specific to music have not been thoroughly explored. In this study, we build a piano music transcription network based on the Transformer structure, use it to examine how critical several musical features are to the transcription task, and design a transcription network tailored to note features. The main musical characteristics studied are note duration, musical language performance, and note event characteristics. In addition, synthesized music is common in real-world music production; the underlying technology is mature, and it is applied more widely and with better results than speech synthesis. The study therefore also uses synthesized music to augment the original data without an additional dataset, and attempts self-feature extraction on the original music audio. The results show that piano music transcription is a strongly local task: inter-note correlation is weaker than the corresponding correlation in speech, and local feature extraction has the greater impact on final recognition performance. A decoding structure designed around note event characteristics, combined with a targeted loss function, substantially improves note-level recognition results. The synthetic music pre-training scheme expands the effective amount of training data and also improves piano transcription performance by a small margin. With the short-time Fourier amplitude spectrum plus phase information as input, a network structure targeting note events, and synthetic music pre-training, the final transcription network achieves a note-onset F1 of 97.40% and a note-with-offset F1 of 88.81%.
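For readers unfamiliar with the note-onset F1 figure quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed. It is not the thesis's evaluation code. The function name `onset_f1`, the 50 ms tolerance, and the greedy one-to-one matching are all assumptions made for illustration; standard evaluation toolkits typically use an optimal matching instead.

```python
def onset_f1(reference, estimated, tolerance=0.05):
    """Note-onset F1 between two note lists.

    Each note is a tuple (onset_seconds, midi_pitch). An estimated note counts
    as correct if an unmatched reference note has the same pitch and an onset
    within `tolerance` seconds (0.05 s is an assumed, commonly used value).
    Matching is greedy, which is a simplification.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    matched_ref = set()  # indices of reference notes already paired
    true_positives = 0

    for est_onset, est_pitch in est:
        for i, (ref_onset, ref_pitch) in enumerate(ref):
            if i in matched_ref:
                continue
            if ref_pitch == est_pitch and abs(ref_onset - est_onset) <= tolerance:
                matched_ref.add(i)
                true_positives += 1
                break

    precision = true_positives / len(est) if est else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: three reference notes, four estimated notes.
ref_notes = [(0.00, 60), (0.50, 64), (1.00, 67)]
est_notes = [(0.01, 60), (0.52, 64), (1.20, 67), (1.50, 72)]
score = onset_f1(ref_notes, est_notes)
```

In this toy case two estimated notes match, so precision is 2/4 and recall is 2/3. The note-with-offset variant additionally requires the estimated note end to agree with the reference, which is why that score is lower in the reported results.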
Keywords/Search Tags:piano music transcription, Transformer, music characteristic, synthetic music augmentation