Research On English Accent Recognition Based On CNN-BiRNN-Attention

Posted on: 2020-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: M Feng
GTID: 2415330623967003
Subject: Computer Science and Technology
Abstract/Summary:
With the development of artificial intelligence technology, human-machine interaction in the future intelligent era is expected to resemble human-human interaction. To achieve this mode of interaction, speech recognition is a necessary capability for the machine, so research on speech recognition is particularly important in the development of artificial intelligence. Research has found that different accents often lead to abnormal pronunciation, which in turn greatly reduces the accuracy of speech recognition. The English accent recognition studied in this thesis is a language identification technique that uses a computer to automatically recognize the country to which a piece of English speech belongs. At present, an effective approach to English accent recognition is to extract speech features and use them to train a machine learning model that determines the accent category. However, the acoustic and prosodic features used in English accent recognition are often not fine-grained enough, or their dimensionality is too high. Although recurrent neural networks have achieved some results in building recognition models, there is still a large gap before they can be applied in practice. This thesis studies these problems. The main work is as follows:

(1) A CNN-based method for English accent recognition is proposed. Because the main features of an accent tend to appear in certain segments of speech, the CNN uses its convolution properties to summarize the local information around these segments, and finally summarizes the types of sounds at the top level of the model. The experimental results show that this method extracts the speech features that represent accent information very well.

(2) A CNN-BiRNN-based method for English accent recognition is proposed. In the CNN-based method above, the limitations of convolution leave the CNN insufficient for long-sequence tasks. Because speech is a kind of data with a "Streaming Property", the classification process should consider both the local feature information and the sequential relationship between the data at each point in the time series. Therefore, after the CNN extracts the local features of the accent, a BiRNN extracts the ordering relations between these features in order to recognize the accent category. The experimental results show that aggregating the CNN-extracted accent features as sequence data through the BiRNN distinguishes English accent categories very well.

(3) A CNN-BiRNN-Attention-based method for English accent recognition is proposed. In the CNN-BiRNN method, the BiRNN cannot distinguish the importance of each part of the sequence when it summarizes the sequence features. In this method, the CNN extracts the local features that characterize the accent and the BiRNN encodes them sequentially, but each part of the resulting sequence features contributes differently to determining the accent category. The attention mechanism is therefore used to assign different weights to these features so that they characterize the accent better. The experimental results show that the attention mechanism accomplishes the English accent recognition task better by assigning different attention to each part of the accent sequence features.
Keywords/Search Tags: English Accent Recognition, CNN, BiRNN, Attention Mechanism
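
The three stages named in the abstract (local feature extraction by a CNN, sequential encoding by a BiRNN, and attention weighting before classification) can be sketched roughly as follows. This is only an illustrative forward pass in plain Python, not the thesis's implementation: the abstract does not give layer sizes, the RNN cell type, the attention scoring function, or the input features, so the 13-dimensional frames, the tanh RNN cell, the dot-product scoring vector `attn_v`, the three accent classes, and every function name here are assumptions. The weights are random and untrained.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def conv1d_relu(frames, kernels, bias):
    """Valid 1-D convolution over time, then ReLU.
    frames: T x D, kernels: C x K x D, result: (T-K+1) x C."""
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        row = []
        for c, kern in enumerate(kernels):
            s = bias[c] + sum(dot(kern[i], frames[t + i]) for i in range(k))
            row.append(max(0.0, s))
        out.append(row)
    return out

def rnn_pass(seq, w_in, w_rec):
    """Simple tanh RNN (assumed cell type); returns the state at every step."""
    h = [0.0] * len(w_rec)
    states = []
    for x in seq:
        h = [math.tanh(dot(w_in[j], x) + dot(w_rec[j], h))
             for j in range(len(w_rec))]
        states.append(h)
    return states

def birnn(seq, fwd, bwd):
    """Run one RNN forward and one backward, concatenating states per step."""
    f = rnn_pass(seq, *fwd)
    b = rnn_pass(seq[::-1], *bwd)[::-1]
    return [hf + hb for hf, hb in zip(f, b)]

def attention_pool(states, v):
    """Score each time step against v (assumed scoring form), normalise the
    scores with softmax, and return the weighted sum of the states."""
    weights = softmax([dot(v, h) for h in states])
    dim = len(states[0])
    context = [sum(w * h[d] for w, h in zip(weights, states))
               for d in range(dim)]
    return context, weights

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, feat_dim=13, channels=4, kernel=3, hidden=5, classes=3):
    """Hypothetical sizes chosen only to keep the example small."""
    return {
        "kernels": [rand_matrix(rng, kernel, feat_dim) for _ in range(channels)],
        "conv_bias": [0.0] * channels,
        "fwd": (rand_matrix(rng, hidden, channels), rand_matrix(rng, hidden, hidden)),
        "bwd": (rand_matrix(rng, hidden, channels), rand_matrix(rng, hidden, hidden)),
        "attn_v": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)],
        "out": rand_matrix(rng, classes, 2 * hidden),
    }

def accent_probs(frames, params):
    local = conv1d_relu(frames, params["kernels"], params["conv_bias"])  # CNN
    states = birnn(local, params["fwd"], params["bwd"])                  # BiRNN
    context, weights = attention_pool(states, params["attn_v"])         # Attention
    logits = [dot(row, context) for row in params["out"]]
    return softmax(logits), weights

rng = random.Random(0)
params = init_params(rng)
frames = rand_matrix(rng, 20, 13)  # 20 random frames standing in for speech features
probs, weights = accent_probs(frames, params)
print("class probabilities:", probs)
print("attention weights per time step:", weights)
```

The attention weights sum to one across time steps, so a trained model of this shape could emphasise the segments that carry the most accent information, which is the motivation given in item (3). Dropping `attention_pool` and using, for example, the mean of `states` would correspond loosely to the CNN-BiRNN variant in item (2).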