
Research On Infant Cry Classification Method Based On Graph Convolutional Neural Network And Transformer Representation

Posted on: 2023-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
GTID: 2530307043988339
Subject: Computer Science and Technology
Abstract/Summary:
Infant cry classification is an important research topic in speech recognition and has a wide range of applications in bioengineering. Although recent work has improved classification accuracy, public data remain limited because infant cry data are sensitive and difficult to collect, and the information contained in infant cries is complex, which makes accurate labelling difficult. As a result, the classification accuracy of infant cries is still low. To address these issues, this thesis improves on existing methods and designs two methods for classifying infant cries.

First, an infant cry classification method based on a graph convolutional neural network is designed. A graph convolutional network learns by aggregating neighbourhood information, which helps to extract the acoustic and rhythmic features contained in infant cries, and it can be applied effectively in many scenarios with limited labelled data. This thesis therefore uses a two-layer graph convolutional network to perform aggregated learning on infant cry data and uses softmax to classify them. Five-fold cross-validation results show that the model substantially improves accuracy over the latest classification methods on both the Chillanto and Baby2020 baby cry datasets.

Second, a bilinear end-to-end Transformer-based fusion method is designed for robust baby cry classification. This thesis uses a Transformer to fuse the spectrogram enhancement and attention mechanism modules for joint learning. By combining the strength of the spectrogram enhancement module in handling limited labelled infant cry data with that of the attention mechanism module in learning image channel features, a better method for infant cry classification is presented. Finally, the Transformer bilinear network model that incorporates the spectrogram enhancement and attention mechanism modules performs well not only on the infant cry datasets but also on complex environmental sound event datasets.
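The two-layer graph convolutional network with a softmax output mentioned in the abstract can be illustrated with a minimal forward-pass sketch. This is not the thesis's implementation: the symmetric adjacency normalization follows the commonly used GCN formulation, and the graph construction (one node per cry recording, edges between neighbouring recordings), the feature size, the hidden size, the class count, and the random weights are all assumptions chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[a_hat[i][j] * d_inv_sqrt[i] * d_inv_sqrt[j] for j in range(n)] for i in range(n)]


def softmax_rows(z):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in z:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def gcn_forward(adj, x, w0, w1):
    """Two-layer GCN: softmax(A_norm @ relu(A_norm @ X @ W0) @ W1)."""
    a_norm = normalize_adjacency(adj)
    # Layer 1: aggregate neighbour features, project, apply ReLU.
    hidden = [[max(v, 0.0) for v in row] for row in matmul(matmul(a_norm, x), w0)]
    # Layer 2: aggregate again, project to class scores, apply softmax.
    return softmax_rows(matmul(matmul(a_norm, hidden), w1))


# Hypothetical toy setup (assumed sizes, random data, untrained weights).
random.seed(0)
n_nodes, n_feats, n_hidden, n_classes = 6, 13, 8, 3
x = [[random.gauss(0, 1) for _ in range(n_feats)] for _ in range(n_nodes)]

# Assumed graph: a simple chain linking each recording to its neighbours.
adj = [[0.0] * n_nodes for _ in range(n_nodes)]
for i in range(n_nodes - 1):
    adj[i][i + 1] = adj[i + 1][i] = 1.0

w0 = [[random.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_feats)]
w1 = [[random.gauss(0, 0.1) for _ in range(n_classes)] for _ in range(n_hidden)]

probs = gcn_forward(adj, x, w0, w1)
predicted = [row.index(max(row)) for row in probs]
print(predicted)
```

Each row of `probs` is a class distribution for one node. In a real setting the weights would be trained with a cross-entropy loss on the labelled nodes, and accuracy would be estimated with five-fold cross-validation as the abstract describes.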
Keywords/Search Tags: Sound Event Detection, Baby Cry Classification, Graph Convolution Network, Transformer