
Research On Key Technologies Of Pathological Speech Feature Analysis Based On Deep Learning

Posted on: 2021-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Li
GTID: 2504306470462474
Subject: Information and Communication Engineering
Abstract/Summary:
Pathological speech is speech produced by an abnormal vocal system, and it can be caused by a variety of diseases. Among them, dysarthria is a disorder in which the muscles of the articulatory organs are weakened or poorly coordinated because of neuropathy. As the generation and transmission of speech signals have become better understood, the analysis and recognition of pathological speech have come to play a significant role in diagnosing and treating patients, and feature analysis deepens the understanding of how dysarthric speech differs from normal speech. Research on the key technologies of pathological speech feature analysis therefore has considerable social significance. At present, the diagnosis and evaluation of articulation disorders caused by various diseases rely mainly on manual examination assisted by computer technology, but the process is complicated and cumbersome, and the results are highly subjective. To overcome these problems, many researchers have turned to deep learning, using artificial neural networks for feature learning, which has opened a new path for the analysis of pathological speech. This thesis studies deep learning and the techniques related to pathological speech feature analysis, and proposes a deep learning model for pathological speech feature analysis. Two features, Mel-Frequency Cepstral Coefficients (MFCC) and the spectrogram, are taken as the research objects to explore how the features of pathological speech differ from those of normal speech. The main work is as follows.

(1) The spectrogram and MFCC features are extracted, and a preliminary analysis of both is made by observation and by calculating relevant evaluation indicators, in order to describe the differences between the speech features of patients and of healthy speakers.

(2) With the spectrogram as input, different convolutional neural networks (CNN) are trained, and the experimental results are evaluated with several classification metrics. Two dataset partitioning methods are used, completely random and by speaker. The pathological classification results are analyzed and collated at the syllable level and then raised to speaker-level pathological classification for comparison.

(3) With MFCC features as input, a one-dimensional CNN, a Long Short-Term Memory (LSTM) network, and hybrid models of the two are trained to classify pathological versus healthy speech. The data are again partitioned in the different ways, each individual speaker's disease probability is computed, and the results are analyzed.

(4) Finally, the pathological speech models are comprehensively analyzed and compared. Three models are built, a three-layer CNN, CNN+LSTM-2, and CNN+SVM, and experiments are run on the speaker-partitioned dataset. The pathological classification results of different features under the same model are analyzed, and the differences between the models are compared to draw conclusions.

The experimental results show that when the spectrogram is used as CNN input, the Xception network achieves the best classification metrics. When MFCC is used as input, the model combining CNN with LSTM achieves the best classification performance. A dataset of randomly partitioned syllables gives good pathological classification results, but it does not allow any conclusion about whether an individual speaker is ill. With partitioning by speaker, syllable-level performance is poorer than with random partitioning, but once the results are raised to the level of the individual speaker, the classification improves and is more practical.
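The speaker-based partitioning and the step from syllable-level to speaker-level results described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the sample layout (dictionaries with a `speaker` key), the use of the mean syllable probability as the speaker's disease probability, and the 0.5 threshold are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_speaker(samples, test_fraction=0.25, seed=0):
    """Hold out whole speakers so no speaker appears in both sets.

    samples: list of dicts that each carry a "speaker" key (assumed layout).
    """
    speakers = sorted({s["speaker"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train = [s for s in samples if s["speaker"] not in test_speakers]
    test = [s for s in samples if s["speaker"] in test_speakers]
    return train, test


def speaker_level_scores(predictions, threshold=0.5):
    """Average syllable-level probabilities per speaker, then threshold.

    predictions: iterable of (speaker_id, probability_pathological) pairs.
    Averaging is an assumed aggregation rule, not taken from the thesis.
    """
    per_speaker = defaultdict(list)
    for speaker, prob in predictions:
        per_speaker[speaker].append(prob)
    scores = {}
    for speaker, probs in per_speaker.items():
        mean_prob = sum(probs) / len(probs)
        scores[speaker] = (mean_prob, mean_prob >= threshold)
    return scores


if __name__ == "__main__":
    # Hypothetical data: four speakers with three syllables each.
    data = [{"speaker": f"S{i}", "syllable": j}
            for i in range(1, 5) for j in range(3)]
    train, test = split_by_speaker(data)
    print(len(train), len(test))

    # Hypothetical syllable-level outputs of some classifier.
    preds = [("P1", 0.9), ("P1", 0.4), ("P1", 0.8),
             ("H1", 0.2), ("H1", 0.6), ("H1", 0.1)]
    print(speaker_level_scores(preds))
```

In the hypothetical predictions, one syllable from each speaker falls on the wrong side of the threshold, yet both speaker-level decisions come out as intended. This is one way to see why speaker-level results can be better than syllable-level results under a speaker-based split.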
Keywords/Search Tags:Deep Learning, Spectrogram, Mel-Frequency Cepstral Coefficients, Convolutional Neural Networks, Long Short-Term Memory