
Research On Language Recognition Based On Multi-task Neural Network

Posted on: 2021-01-10 | Degree: Master | Type: Thesis
Country: China | Candidate: C G Qin | Full Text: PDF
GTID: 2415330611981931 | Subject: Software engineering
Abstract/Summary:
With the rapid development of deep learning, research in the field of speech has made major breakthroughs. Language recognition, as the first step in the speech processing pipeline, is key to the effectiveness of speech back ends. Most current language recognition methods are based on speech phonemes, which requires phoneme tagging of corpora. Language recognition based on deep neural networks needs only a combination of acoustic features and can achieve high-precision recognition without corpus phoneme information. In practical application scenarios, neural network-based language recognition models are small and effective, and can be applied to other speech front ends, which improves the practicality of language recognition. At the same time, today's multilingual data is very large, and deep neural networks have advantages when training on large-scale data.

This paper therefore proposes an end-to-end language recognition model based on speech rate features and a multi-task dialect language recognition model. Both use the computing power and feature extraction capabilities of neural networks to achieve language recognition. The approach is more efficient and practical, improving the integrity of the model and the accuracy of language recognition, with the aim of protecting language culture and promoting the development of speech research. The paper first addresses international language recognition by constructing a deep neural network model based on speech rate characteristics. Then, for highly similar and easily confused languages, represented by dialects, it proposes a multi-task learning method that learns implicit features for dialect language recognition. The main work is as follows:

1. Existing phoneme-based language recognition methods extract low-level acoustic features from the raw audio and use a GMM-HMM model (Gaussian mixture model and hidden Markov model) combined with a phoneme discriminator to perform language recognition. Because phoneme features are involved, these methods are too complex to implement easily in practical applications. In response, this paper first proposes an end-to-end language recognition model based on deep neural networks. The model extracts two sets of low-level acoustic features from the raw audio, Mel-frequency cepstral coefficients (MFCC) and Fbank. Because speech rate characteristics differ between languages, the original features are extended into a new combined feature consisting of MFCC, Fbank, and speech rate features. A CLSTM model that combines convolutional and recurrent neural networks is then trained to identify languages, and five languages are drawn from the publicly available Common Voice dataset for experiments. The experimental results show that the proposed end-to-end deep neural network model reaches a recognition accuracy of 90%.

2. Dialect language identification is a sub-problem of language identification. Dialects have regional characteristics and are more similar to one another, so they are easily confused. To address the difficulty of distinguishing similar languages and of classifying sub-language families within the division of international languages, this paper studies dialect identification and proposes a dialect language recognition method based on multi-task learning. Using multi-task learning to mine the implicit characteristics of related tasks, it establishes a multi-language dialect recognition model based on hard parameter sharing and a sub-task language recognition model based on soft parameter sharing. Single-task and multi-task comparison experiments are carried out on the IFLYTEK dialect dataset. The experimental results show that the proposed multi-task dialect language recognition model reaches a recognition rate of 82% for ten dialects.
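
As an illustration of hard parameter sharing, the following is a minimal sketch in plain Python and not the thesis implementation. It assumes one shared hidden layer feeding two task-specific output heads: a ten-way dialect head (the ten dialects come from the abstract) and a hypothetical three-way auxiliary head. The abstract does not name the related tasks, so the auxiliary task, the layer sizes, the loss weighting, and all identifiers here are assumptions made for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)]
               for _ in range(n_in)]
    return weights, [0.0] * n_out


class HardSharedModel:
    """Hypothetical two-task model: one shared layer, two output heads."""

    def __init__(self, n_features, n_hidden, n_dialects, n_aux, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_features, n_hidden, rng)      # used by both tasks
        self.dialect_head = init_layer(n_hidden, n_dialects, rng)  # main task only
        self.aux_head = init_layer(n_hidden, n_aux, rng)          # assumed auxiliary task

    def forward(self, x):
        h = relu(linear(x, *self.shared))  # shared representation
        return (softmax(linear(h, *self.dialect_head)),
                softmax(linear(h, *self.aux_head)))


def multitask_loss(p_main, y_main, p_aux, y_aux, aux_weight=0.5):
    """Weighted sum of the two cross-entropy losses (the weight is an assumption)."""
    return -math.log(p_main[y_main]) - aux_weight * math.log(p_aux[y_aux])


model = HardSharedModel(n_features=8, n_hidden=4, n_dialects=10, n_aux=3)
frame = [0.1] * 8  # stand-in for one combined acoustic feature vector
p_dialect, p_aux = model.forward(frame)
loss = multitask_loss(p_dialect, 3, p_aux, 1)
```

Because both heads read the same hidden vector, gradients of the combined loss from either task would update the shared layer, which is the sense in which the parameters are "hard" shared. A soft-sharing variant would instead give each task its own layers and add a penalty that keeps the corresponding weights close.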
Keywords/Search Tags: deep learning, language recognition, dialect language recognition, multi-task learning