
Research on an Improved ZCPA Speech Recognition Feature Extraction Algorithm

Posted on: 2006-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Jiao
Full Text: PDF
GTID: 2168360155974205
Subject: Signal and Information Processing
Abstract/Summary:
Most current speech recognition systems achieve high recognition rates in clean environments, but their performance degrades severely in noise. Making speech recognition practical therefore requires studying its robustness. The human ear recognizes speech very well even in noisy environments, so some researchers have turned to auditory models to extract speech feature parameters that improve system robustness. This thesis focuses on noise-robust speech recognition and completes the following work.

First, the thesis implements speech recognition with Zero-Crossings with Peak Amplitudes (ZCPA) features, which are based on an auditory model of the human ear. The model captures the frequency information of the speech signal by measuring the intervals between adjacent upward-going zero crossings and assigning them to the corresponding frequency bins. It then detects the peak amplitude within each interval between successive upward-going zero crossings. Finally, it applies a compressive nonlinearity to the peaks and uses the result to weight the frequency bins. The thesis also analyzes the noise robustness of the system. Many experiments showed that it is more robust to noise than recognition systems that use LPCC or MFCC features.

Second, building on this system, the thesis presents improved ZCPA features, namely combined difference ZCPA features. These features exploit the characteristics of the difference signal by adding difference information to the ZCPA features.
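The three ZCPA steps described above (interval measurement, peak detection, compressed weighting of frequency bins) can be illustrated for a single channel and a single frame. This is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the bin edges, and the `log(1 + peak)` form of the compressive nonlinearity are illustrative choices, and the band-pass filter bank that normally precedes this step is omitted.

```python
import math

def upward_zero_crossings(x):
    """Indices n where the signal goes from negative to non-negative."""
    return [n for n in range(1, len(x)) if x[n - 1] < 0.0 <= x[n]]

def zcpa_histogram(x, fs, bin_edges):
    """Single-channel ZCPA histogram for one frame.

    bin_edges are ascending frequency edges in Hz (a hypothetical choice).
    """
    hist = [0.0] * (len(bin_edges) - 1)
    zc = upward_zero_crossings(x)
    for start, end in zip(zc, zc[1:]):
        freq = fs / (end - start)      # inverse of the crossing interval
        peak = max(x[start:end])       # peak amplitude inside the interval
        for b in range(len(hist)):
            if bin_edges[b] <= freq < bin_edges[b + 1]:
                # Assumed compressive nonlinearity: log(1 + peak).
                hist[b] += math.log(1.0 + peak)
                break
    return hist

def first_difference(x):
    """Difference signal d[n] = x[n] - x[n-1]."""
    return [x[n] - x[n - 1] for n in range(1, len(x))]

# Example: a 100 Hz tone sampled at 8 kHz falls into the 50-150 Hz bin.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs + 0.3) for n in range(800)]
hist = zcpa_histogram(frame, fs, [0, 50, 150, 400])
```

The abstract does not say how the difference information is combined with the ZCPA features. One possibility, offered only as an assumption, is to compute `zcpa_histogram(first_difference(frame), fs, bin_edges)` and concatenate it with the histogram of the original frame.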
The new features capture high-frequency information that is mixed in with the low-frequency information, which compensates for a deficiency of the ZCPA features and yields improved recognition results.

The thesis also studies the front-end filters of the recognition system and introduces Bark wavelet filters in place of FIR filters. Most wavelet transforms, whether dyadic wavelets, wavelet packets, or M-band wavelet transforms, allocate frequency according to an octave relation, which differs considerably from the critical-band allocation of human hearing. A wavelet that allocates frequency according to the critical bands should therefore match human speech perception more closely and improve system performance. The basic idea behind constructing the Bark wavelet is to choose a mother wavelet that minimizes the time-bandwidth product, namely a Gaussian function in the Bark domain, so that the wavelets have equal bandwidth in the Bark domain. The thesis analyzes the decomposition and reconstruction of this wavelet, presents its time-domain and frequency-domain characteristics, and describes the theory of its use in front-end preprocessing.

Finally, the thesis simulates speech recognition based on Bark wavelet filters and ZCPA features and obtains improved recognition rates.
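The idea of a Gaussian mother wavelet with equal bandwidth in the Bark domain can be sketched as a set of frequency responses. The code below is a hedged illustration, not the thesis construction: it assumes the commonly used Zwicker-style Hz-to-Bark approximation, a Gaussian constant of `4 ln 2` (which makes each response fall to one half at 0.5 Bark from its center), and hypothetical function names and band limits.

```python
import math

def hz_to_bark(f_hz):
    """Approximate Bark value of a frequency in Hz (assumed formula)."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

def bark_wavelet_response(f_hz, center_bark, c=4.0 * math.log(2.0)):
    """Gaussian response in the Bark domain, centered at center_bark."""
    d = hz_to_bark(f_hz) - center_bark
    return math.exp(-c * d * d)

def bark_filter_centers(f_low, f_high, num_filters):
    """Filter centers spaced uniformly in Bark between two frequencies."""
    b_lo, b_hi = hz_to_bark(f_low), hz_to_bark(f_high)
    step = (b_hi - b_lo) / (num_filters - 1)
    return [b_lo + k * step for k in range(num_filters)]

# Example: 16 equal-Bark-width filters between 100 Hz and 4 kHz
# (band limits and filter count are illustrative).
centers = bark_filter_centers(100.0, 4000.0, 16)
weights_at_1khz = [bark_wavelet_response(1000.0, c) for c in centers]
```

Because the centers are uniform in Bark rather than in octaves, the filters are narrow in Hz at low frequencies and wide at high frequencies, which is the critical-band behavior the abstract contrasts with octave-based wavelets. Multiplying a frame's spectrum by each response would give one band-limited channel per filter.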
Keywords/Search Tags:speech recognition, feature extraction, wavelet transform, auditory model