Automatic detection of infant crying (ADIC) can alert caregivers promptly and help avoid possible dangers. ADIC can also reduce the labour of infant care and improve the reliability and comfort of current infant monitoring practices. Infant crying may carry information about states such as pain and hunger. Exploiting this information may benefit psychological and biological research on infants, especially the diagnosis of some diseases, and ADIC can make such research more efficient. In this dissertation, two infant crying detection algorithms are designed, one based on a modified K-means framework and the other on a CNN (convolutional neural network) framework. In the pre-processing stage, pre-emphasis increases the SNR (signal-to-noise ratio) of the audio signal, and an improved Voice Activity Detection (VAD) approach is proposed that effectively suppresses noise components, which significantly reduces system complexity. Furthermore, MFCCs (Mel-Frequency Cepstral Coefficients) are used as features to detect and classify the target audio signals. In the modified K-means-based algorithm, DTW (Dynamic Time Warping) is used to calculate the distance between a template and samples of variable length. The proposed CNN-based algorithm uses a five-layer neural network, in which the convolution layer extracts deep features from the sound parameters, the pooling layer reduces the data size, and the fully connected layer maps the learned features into the sample label space to classify the sound samples. Experiments verify the effectiveness of both proposed algorithms. In the experimental results, because it extracts more and deeper hidden features (such as cepstrum, formant, and pitch), the CNN-based method shows better recognition accuracy and robustness.
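To illustrate how DTW can compare a template with a sample of a different length, the following is a minimal sketch and not the dissertation's implementation. It assumes each audio clip has already been converted to a sequence of per-frame feature vectors (for example MFCC frames), uses the Euclidean distance between frames as the local cost, and applies the classic unconstrained step pattern. The function name `dtw_distance` and the toy inputs are hypothetical.

```python
import math


def dtw_distance(template, sample):
    """Return the DTW distance between two sequences of feature vectors.

    Assumptions (not taken from the source): Euclidean local cost,
    no warping-window constraint, no path-length normalisation.
    """
    n, m = len(template), len(sample)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] = minimal accumulated cost of aligning the first i
    # template frames with the first j sample frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(template[i - 1], sample[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame advances alone
                cost[i][j - 1],      # sample frame advances alone
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Toy one-dimensional "features": the sample repeats one frame of the
# template, so a warped alignment matches the two exactly.
template = [[0.0], [1.0], [2.0]]
sample = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(template, sample))  # 0.0
```

Because the accumulated cost is defined for any pair of lengths, a distance of this kind can stand in for the fixed-length Euclidean distance that standard K-means assumes. How the modified K-means algorithm updates its templates is not described here and is left out of the sketch.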