Research And Implementation Of A Vision-based Piano Transcription System

Posted on: 2021-11-02

Degree: Master

Type: Thesis

Country: China

Candidate: J Li

Full Text: PDF

GTID: 2505306104986329

Subject: Information and Communication Engineering

Abstract/Summary:

Automatic Music Transcription (AMT) is the process of converting acoustic music signals into symbolic notation, and it is usually based on analysis of the audio signal. However, because multiple pitches can sound and overlap at the same time, it is difficult to obtain accurate recognition results from audio analysis alone. To address this problem, computer vision-based approaches have been adopted for transcription. In existing research, a vision-based piano transcription system mainly comprises two essential algorithms: piano keyboard detection based on the Hough transform and pressed-key detection based on a weak classifier. The accuracy and robustness of both algorithms need to be improved in complex environments.

In this thesis, we implement a more robust, higher-performance visual piano transcription system. The system contains four modules: piano keyboard registration, hand detection, automatic background update, and pitch detection. It takes a video of a piano performance as input. First, the piano keyboard registration module determines the background image and the key positions. Then, for each frame, the hand detection module obtains the region occupied by the hands, and the automatic background update module updates the background image so that changes in lighting do not affect the result. Next, the difference image between each frame and the background image is computed. Finally, the pitch detection module detects the pressed keys to produce the transcription result.

The main contributions of this thesis are fourfold: (1) for piano keyboard detection, to address the limited detection ability of the Hough transform, semantic segmentation is used, which achieves the most accurate results reported so far; (2) for pressed-key detection, to address the insufficient performance of current pressed-key classifiers, we design and implement a CNN model suited to detecting pressed keys, which outperforms the state-of-the-art approaches in experimental verification; (3) the impact of different environments (light position, camera position, and light intensity) on pressed-key detection is discussed, and recommendations for deploying the system are given; (4) because public datasets are lacking in the field of visual piano transcription, we propose a new dataset for visual transcription of piano music (Vision Piano), which includes data recorded in the laboratory (Piano Dataset2) and video data downloaded from the internet (Piano Dataset3).

The piano transcription datasets used in this thesis are the dataset proposed by Akbari (Piano Dataset1) and Vision Piano. The system's F1-measure is 96.5% on Piano Dataset1, and 95% and 93% on Piano Dataset2 and Piano Dataset3, respectively, which is state-of-the-art.
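The background update, difference image, and F1-measure steps mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the update rule or the evaluation protocol, so everything here is an assumption made for illustration: the running-average update with blending factor `alpha`, the representation of images as nested lists of grayscale values, the function names, and the scoring of predictions as `(frame_index, key_number)` pairs. It is not the thesis's implementation.

```python
from typing import List, Set, Tuple

Image = List[List[float]]  # grayscale image: rows of pixel intensities
Mask = List[List[bool]]    # True where a hand covers the pixel (assumed input)


def update_background(background: Image, frame: Image,
                      hand_mask: Mask, alpha: float = 0.05) -> Image:
    """Blend the current frame into the background outside the hand region.

    Assumption: a simple running average; pixels under the hand are kept
    unchanged so the hand never leaks into the background.
    """
    return [
        [b if m else (1.0 - alpha) * b + alpha * f
         for b, f, m in zip(b_row, f_row, m_row)]
        for b_row, f_row, m_row in zip(background, frame, hand_mask)
    ]


def difference_image(background: Image, frame: Image) -> Image:
    """Absolute per-pixel difference between a frame and the background."""
    return [
        [abs(f - b) for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def f1_measure(predicted: Set[Tuple[int, int]],
               reference: Set[Tuple[int, int]]) -> float:
    """F1 over (frame_index, key_number) pairs; 0.0 if nothing matches."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, the difference image would be passed, key region by key region, to a pressed-key classifier such as the CNN described in contribution (2). A production version would more likely operate on arrays from an image-processing library than on nested lists.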

Keywords/Search Tags:

Automatic Music Transcription, Multi-pitch Estimation, Convolutional Neural Network

Related items

1	Research And Implementation Of A CNN-based Piano Music Transcription Algorithm
2	Research Of Audio-visual Fusion Piano Transcription Technology And System Realization
3	Research On Automatic Transcription Algorithm Of Piano Music Based On CNN-HMM
4	Research And Implementation Of A CNN-based Polyphonic Piano Transcription Algorithm
5	Research On Polyphonic Multi-Instrument Recognition Method Based On Dilated Convolutional Recurrent Neural Network
6	Research On Stereoscopic Reconstruction And Transcription Of Monophonic Music
7	Research On The Classification Algorithm Of Terracotta Warrior Fragments Based On The Optimization Model Of Convolutional Neural Network
8	Research On The Extraction Method Of Music Melody And Its Application
9	Research On Movie Recommendation Algorithm Based On Convolutional Neural Network And Recurrent Neural Network
10	Research On Automatic Vocal Transcription Of Chinese Popular Music