| Content-based music analysis plays a critical role in computer music understanding and music information retrieval, and also offers great flexibility for computer-assisted music education. This dissertation focused on music analysis at the levels of both music representation and semantic content. In terms of music representation, automatic music annotation was addressed together with intelligent editing and singing evaluation. As for content-based analysis, a key clustering approach and a cultural style classification approach were investigated, based on a novel set of mid-level musicology-based features. The main research work and innovative contributions are summarized as follows: First, this dissertation presented an automatic annotation and intelligent editing model for multi-track instrumental performance. In this model, note segmentation was achieved by audio-score alignment, which located the notes in the recording, followed by a bootstrap-learning-based onset classifier that refined the note boundaries. To avoid interference from pitch variation caused by transitions between notes or by expressive performance, the most stable component was extracted from the pitch curve and taken as the perceptual pitch of each note. A scheduling model tracked the instantaneous tempo of the recorded performance and determined adjusted timings for the output tracks. A time-domain pitch modification and time stretching algorithm performed pitch correction and time adjustment. An empirical evaluation showed that the proposed algorithm achieved high onset detection accuracy and that the intelligent editor improved pitch and timing accuracy while retaining the expressive nuance of the original recording. Second, this dissertation proposed a singing evaluation approach based on pitch and rhythm accuracy. A phoneme boundary detector was adopted to perform note segmentation at a high time resolution by using the characteristics of consonants and vowels. 
As for evaluation, the relative interval between the reference score and the actual pitch was adopted to estimate pitch deviation, while the difference between the absolute position and the expected position of beats served as a rhythm consistency measure. Experiments on a solfege corpus demonstrated a noticeable improvement in the performance of onset detection and pitch estimation. A detailed subjective evaluation showed great consistency between the proposed evaluation approach and expert judgment. Third, to bridge the semantic gap between the audio waveform and high-level semantic content, a set of musicology-based features that characterized musical elements such as pitch class, chord, and pitch interval was presented. Based on a key-oriented inter-recording similarity measure, an agglomerative hierarchical clustering model was set up to divide recordings into categories in an unsupervised manner. The dissertation also proposed a cultural-style-based music classification approach for audio signals, in which timbre, rhythm, and wavelet coefficients as well as musicology-based features were used as input features for supervised classifiers, and their effectiveness was evaluated based on the classification results. Experimental results indicated that musicology-based features helped combine the advantages of computational statistical approaches with musical prior knowledge. Thus, pattern recognition approaches achieved satisfactory results in both key clustering and cultural style classification. |
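
The pitch and rhythm measures described for the singing evaluation approach can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names are hypothetical, pitch deviation is expressed in cents per interval between consecutive notes, and rhythm consistency is taken as the mean absolute difference between detected and expected onset times.

```python
import math


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def interval_deviation_cents(score_midi, sung_hz):
    """Deviation, in cents, of each sung interval from the corresponding score interval.

    Assumption: comparing intervals between consecutive notes, rather than
    absolute pitches, so that a uniformly transposed performance is not penalized.
    """
    if len(score_midi) != len(sung_hz):
        raise ValueError("score and performance must contain the same number of notes")
    sung_midi = [hz_to_midi(f) for f in sung_hz]
    deviations = []
    for i in range(1, len(score_midi)):
        expected = score_midi[i] - score_midi[i - 1]  # interval in semitones (score)
        actual = sung_midi[i] - sung_midi[i - 1]      # interval in semitones (sung)
        deviations.append(100.0 * (actual - expected))  # 1 semitone = 100 cents
    return deviations


def rhythm_deviation(onsets_sec, expected_sec):
    """Mean absolute difference (seconds) between detected and expected onsets."""
    if len(onsets_sec) != len(expected_sec) or not onsets_sec:
        raise ValueError("onset lists must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(onsets_sec, expected_sec))
    return total / len(onsets_sec)
```

Under these assumptions, a melody sung one semitone sharp throughout yields interval deviations close to zero, while a single late note raises the rhythm deviation in proportion to its delay.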