
Technical Research And System Implementation On Violence Audio Scene Classification

Posted on: 2017-02-17    Degree: Master    Type: Thesis
Country: China    Candidate: J J Feng    Full Text: PDF
GTID: 2308330503487187    Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet and the film industry, multimedia files such as audio and video have increased sharply, and many of them contain violent content. Because audio can be processed much faster than video, audio-based violence scene recognition has attracted growing attention. Existing violence audio detection techniques, mostly built on traditional machine learning algorithms, are a clear improvement over manual review, but they still have the following problems. First, the generalization ability of such systems is weak: different scenarios typically require different audio features to be selected. Second, recognition performance needs to be improved, mainly because traditional machine learning algorithms are shallow models whose modeling capability is limited for complex signals such as audio. Third, most violence audio recognition methods perform poorly in real, noisy scenes. To address these problems, this thesis carries out the following work:

(1) To address the poor generalization across scenarios, we apply the Deep Neural Network (DNN) to the violence audio scene recognition task. As a deep learning model, the DNN learns and expresses features better than traditional shallow learning algorithms; in most scenarios there is no need to select features manually, because low-level features such as the logarithmic power spectrum and the spectrogram can be fed directly to the DNN as input.

(2) To improve recognition performance, on the one hand we feed the features learned by the DNN, together with other features such as MFCC, zero-crossing rate, and energy entropy, to the classifier; on the other hand we use discretization and feature selection to improve the expressive power of the features. In addition, during the recognition phase the K-Nearest Neighbor (KNN) method is applied to correct the classification results and further improve recognition performance (a minimal sketch of this feature-fusion and KNN-correction pipeline is given after this abstract).

(3) To address the low recognition rate against noisy backgrounds, we use a Deep Denoising Autoencoder (DDAE) to reduce noise, which narrows the mismatch between training and test data and thus improves the robustness of the audio features (see the DDAE sketch below).

(4) To improve the training speed and performance of the network, we propose the self-increment Restricted Boltzmann Machine (Incre-RBM) based on the Restricted Boltzmann Machine (RBM). Experiments show that the Incre-RBM trains faster and classifies better (a sketch of the standard RBM update that it builds on is given below).
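The following is a minimal, illustrative sketch of the pipeline described in (1) and (2), not the thesis's implementation: a small feed-forward DNN learns features from log-power-spectrum frames, the learned features are concatenated with hand-crafted ones (MFCC, zero-crossing rate, energy entropy), and a KNN classifier produces the corrected decision. All layer sizes, feature dimensions, and names (DNNFeatureLearner, fuse_and_classify) are assumptions; PyTorch and scikit-learn are used here only as convenient stand-ins.

import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

class DNNFeatureLearner(nn.Module):
    # Feed-forward DNN; the last hidden layer is taken as the learned feature.
    def __init__(self, in_dim=257, hidden=256, n_classes=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        h = self.body(x)              # learned deep feature
        return self.head(h), h

def fuse_and_classify(log_spec, handcrafted, labels):
    # log_spec:    (N, 257) log-power-spectrum frames (dimension is an assumption)
    # handcrafted: (N, d) MFCC / zero-crossing rate / energy-entropy features
    # labels:      (N,) 0 = non-violent, 1 = violent
    model = DNNFeatureLearner(in_dim=log_spec.shape[1])
    with torch.no_grad():             # assume the DNN has already been trained
        _, deep_feat = model(torch.as_tensor(log_spec, dtype=torch.float32))
    fused = torch.cat(
        [deep_feat, torch.as_tensor(handcrafted, dtype=torch.float32)], dim=1)
    knn = KNeighborsClassifier(n_neighbors=5)   # KNN step corrects the frame decisions
    knn.fit(fused.numpy(), labels)
    return knn.predict(fused.numpy())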
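Below is a minimal sketch of the deep denoising autoencoder idea in (3): a network is trained to map noisy spectral frames back to their clean versions, so that test-time features better match the training data. The architecture, layer sizes, and the name DDAE as used here are assumptions, not the thesis's exact configuration.

import torch
import torch.nn as nn

class DDAE(nn.Module):
    # Symmetric encoder/decoder trained to reconstruct clean frames from noisy ones.
    def __init__(self, dim=257, hidden=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden // 2), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden // 2, hidden), nn.ReLU(),
                                     nn.Linear(hidden, dim))

    def forward(self, noisy):
        return self.decoder(self.encoder(noisy))

def train_step(model, optimizer, noisy_batch, clean_batch):
    # One reconstruction step: minimise the MSE between denoised and clean frames.
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(noisy_batch), clean_batch)
    loss.backward()
    optimizer.step()
    return loss.item()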
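For context on (4), the sketch below shows one contrastive-divergence (CD-1) update of a standard binary RBM, the base model that the proposed Incre-RBM extends; the self-increment mechanism itself is not reproduced, since the abstract does not specify it. Variable names and the learning rate are illustrative.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_v, b_h, v0, lr=0.01, rng=None):
    # W: (n_visible, n_hidden) weights; b_v, b_h: visible/hidden biases
    # v0: (batch, n_visible) binary training frames
    rng = rng or np.random.default_rng()
    h0_prob = sigmoid(v0 @ W + b_h)                       # positive phase
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T + b_v)                     # one Gibbs step back
    h1_prob = sigmoid(v1_prob @ W + b_h)                  # negative phase
    batch = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
    b_v += lr * (v0 - v1_prob).mean(axis=0)
    b_h += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b_v, b_h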
Keywords/Search Tags: violence audio recognition, deep learning, RBM, feature extraction