Research On Classification And Recognition Of Visual Targets Based On Machine Learning

Posted on:2020-08-08

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z K Weng

GTID:1368330605972830

Subject:Signal and Information Processing

Abstract/Summary:

With the continuous development of computer technology and growing public safety awareness, intelligent image and video analysis is attracting widespread attention as a common security measure. Existing classification and recognition methods, both domestic and international, are reviewed, and current machine-learning-based methods for visual object classification and recognition are found to have limitations. This dissertation studies visual object classification, moving object representation and recognition, and video activity prediction. The main contributions are as follows.

A non-destructive classification method is proposed for identifying ancient ceramics. The curvature of the edge contour is obtained using differential chain code, multi-channel color features of the glaze are extracted in HSI color space, and LBP features of the decorations are extracted. Ancient ceramics from different periods are dated based on these visual features. Experimental results show that the method performs better than single-feature recognition for non-destructive classification.

A spatio-temporal edge trajectory is proposed for representing moving targets. Edge trajectories with similar spatio-temporal and motion characteristics are grouped into skeletons to better describe how an action evolves under different motion modes. A coding method based on magnitude and direction information is introduced to cluster these trajectories. The resulting spatio-temporal motion skeleton descriptor takes full account of the motion similarity between spatio-temporal trajectories. Experimental results show that the method is better suited to representing and recognizing actions in real, unconstrained videos.

A stacked trajectory energy image method is proposed to represent the long-term dependence between frames. Global convolution operations ignore the scale and spatial position of the moving object. A trajectory-aware hierarchical convolution strategy is proposed that decomposes each frame into three levels (the complete video region, the foreground target region, and the fine motion region), so that a video action can be represented at multiple levels. Multi-frame trajectories are mapped onto a single image to construct the stacked trajectory energy image, which effectively describes the long-term motion characteristics of the video. Experimental results show that the method retains and enriches the temporal and spatial information of moving targets and improves recognition performance.

A multi-modality trajectory-aware CNN model is proposed for video action recognition. The two-stream convolutional neural network is extended to a three-stream network in which the spatial CNN, the temporal CNN, and the global motion CNN extract static, dynamic, and global motion characteristics of the action, respectively. The convolutional features of the three modalities are aggregated to form a trajectory-aware multimodal behavior descriptor for moving targets in video, and a linear support vector machine classifies this descriptor. Experimental results show that the framework effectively recognizes human actions in videos and achieves higher recognition accuracy than a single-modality network.

A dynamic enhancement method for foreground moving trajectories based on a boundary prior is proposed. An undirected weighted network is constructed, and the distance between two superpixels is defined as their geodesic distance. The saliency probability of a superpixel is defined by the cumulative weighted shortest path, that is, its geodesic distance to the boundary. A dynamic contrast segmentation strategy is also introduced to obtain the fine motion region, which achieves robust sampling of the moving target region in video.

An activity prediction method based on an attention-based temporal encoding network is proposed. A long short-term memory (LSTM) network is adopted, and the output features focus more on semantic key frames, which models long-term sequences effectively while suppressing temporal redundancy. Experimental results show that the framework predicts actions well in the early stage of a video and effectively improves prediction performance.
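
The boundary-prior saliency step described above (geodesic distance from each superpixel to the image boundary over an undirected weighted graph) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `boundary_geodesic_saliency`, the regular-grid layout of superpixels, the 4-neighbor adjacency, and the use of a single scalar feature per superpixel with absolute difference as edge weight are all assumptions made for illustration.

```python
import heapq


def boundary_geodesic_saliency(features):
    """Score each superpixel by its geodesic distance to the boundary.

    `features` is a 2D list holding one scalar feature (for example mean
    intensity) per superpixel. Assumption: superpixels lie on a regular
    grid and are connected to their 4 neighbors; a real superpixel graph
    would use the adjacency produced by the segmentation.
    """
    rows, cols = len(features), len(features[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    heap = []

    # Boundary prior: every superpixel on the border is a source at distance 0.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                dist[r][c] = 0.0
                heapq.heappush(heap, (0.0, r, c))

    # Multi-source Dijkstra: accumulate edge weights along the shortest path.
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed edge weight: absolute feature difference.
                nd = d + abs(features[r][c] - features[nr][nc])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))

    # Normalize to [0, 1] so the result reads as a saliency score.
    peak = max(max(row) for row in dist)
    if peak == 0:
        return [[0.0] * cols for _ in range(rows)]
    return [[d / peak for d in row] for row in dist]
```

Under these assumptions, a superpixel that resembles the background reaches the boundary along a cheap path and scores near 0, while a superpixel that differs from everything around it must cross at least one expensive edge and scores higher.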

Keywords/Search Tags:

Machine learning, Visual object classification, Convolutional neural network, Motion representation, Activity prediction

Related items

1	Research Of Object Detection And Classification Algorithms Based On Deep Visual Representation
2	Research On Visual Object Tracking Based On Spatial And Temporal Context
3	Research On Machine Learning Classification Algorithm Based On Conformal Prediction
4	Research And Implementation Of News Classification System Based On Machine Learning
5	Research On Visual Ego-Motion Estimation With Convolutional Neural Network
6	Research On 3D Object Recognition Using Volumetric CNNs
7	Research On Object Tracking Under Complex Scene Based On Convolutional Neural Networks
8	Machine Learning for Neural Activity Video Analysis and for Object Tracking in Vide
9	Research On Representation Traffic Classification Based On Auto-ML
10	Research On Image Classification Algorithm Based On Circular Convolutional Neural Network