Research And Implementation Of Agricultural Video Caption Algorithm

Posted on:2018-04-22

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Zhang

GTID:2335330512986878

Subject:Computer application technology

Abstract/Summary:

To build a better semantic index for agricultural videos, we research and implement an agricultural video captioning algorithm that generates natural-language sentences describing a video's content. These sentences serve as the video's semantic index and synopsis, so farmers can retrieve agricultural videos by semantic keywords and filter the retrieval results with the help of the captions. This method can greatly reduce the time spent searching for a desired video in a large collection and contributes to the development of agriculture.

Generating captions for agricultural videos faces several difficulties: how to extract semantic key frames that represent a video's semantics, how to identify objects and their relative relationships in those key frames, and how to express the key frames in natural sentences. The problem involves both computer vision and natural language processing.

We propose to generate captions for agricultural videos as follows: divide each video into shots according to frame transitions; extract semantic key frames for the shots; extract image features for the semantic key frames and map them into a meaning space; extract text features from manually written captions of the semantic key frames and map them into the meaning space; and learn to generate captions for semantic key frames in the meaning space using recurrent neural networks.

The main work of this paper is as follows:

(1) Extract image features for semantic key frames. Extract the compressed-domain key frames of agricultural videos; divide the videos into shots using a shot boundary detection algorithm with fixed thresholds in the compressed domain based on histogram features; use the K-Means clustering algorithm to extract semantic key frames for the shots; train a deep image feature extractor on manually annotated bounding boxes; and extract deep image features for the semantic key frames.

(2) Extract text features for captions. Write captions for the semantic key frames manually; segment the captions into words using a word segmentation algorithm; build an initial Chinese vocabulary from all captions; merge synonyms in the initial vocabulary with the help of a word similarity measure to obtain the final Chinese vocabulary; and convert the words in each caption into an index array, according to the final vocabulary, which serves as the caption's text features.

(3) Learn to generate captions for semantic key frames. Map the image features of each semantic key frame to a meaning vector and encode it into the hidden layers of a recurrent neural network; map the text features of the captions corresponding to that key frame to a set of meaning vectors in the meaning space and feed them to the hidden layers to decode the captions. The encoding and decoding matrices of the network are learned from the semantic key frames and captions in the training dataset.

The main innovations of this paper are extracting image features based on regions rather than the whole image, and extracting text features based on synonyms rather than individual words. Experiments on agricultural videos show that the two innovations increase the agricultural video caption score by 5.1 and 1.7, respectively.
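
The shot-splitting step in item (1) can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis works on histogram features in the compressed domain, whereas this sketch assumes already-decoded grayscale frames given as flat lists of pixel values in 0..255. The function names, the bin count, and the threshold value are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: a frame is a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero on an empty frame
    return [c / total for c in counts]


def shot_boundaries(frames: Sequence[Frame],
                    threshold: float = 0.5,
                    bins: int = 16) -> List[int]:
    """Return every index i at which frame i starts a new shot.

    A boundary is declared when the L1 distance between the histograms of
    consecutive frames exceeds a fixed threshold. The distance lies in
    [0, 2] because both histograms are normalized.
    """
    boundaries = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame, bins)
        if prev is not None:
            diff = sum(abs(a - b) for a, b in zip(hist, prev))
            if diff > threshold:
                boundaries.append(i)
        prev = hist
    return boundaries


def split_into_shots(frames: Sequence[Frame],
                     threshold: float = 0.5,
                     bins: int = 16) -> List[Tuple[int, int]]:
    """Return shots as half-open (start, end) frame index ranges."""
    cuts = [0] + shot_boundaries(frames, threshold, bins) + [len(frames)]
    return [(s, e) for s, e in zip(cuts, cuts[1:]) if e > s]


# Three dark frames followed by two bright frames form two shots.
dark, bright = [10] * 64, [200] * 64
shots = split_into_shots([dark, dark, dark, bright, bright])
```

A fixed threshold is simple but sensitive to lighting changes and gradual transitions, so the value used here would need tuning on real footage.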

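The text-feature step in item (2) can be sketched in the same hedged way. The sketch assumes the captions are already segmented into words and that synonym merging is supplied as a hand-written lookup table; the thesis derives synonyms from a word similarity measure, which is not reproduced here. The names `build_vocabulary`, `encode_caption`, and the `<unk>` token are hypothetical, and English words stand in for the Chinese vocabulary.

```python
from typing import Dict, List

UNK = "<unk>"  # hypothetical placeholder for out-of-vocabulary words


def build_vocabulary(captions: List[List[str]],
                     synonyms: Dict[str, str]) -> Dict[str, int]:
    """Map each canonical word to an integer index.

    `synonyms` maps a word to its canonical form, so all members of a
    synonym group share a single vocabulary entry.
    """
    vocab = {UNK: 0}
    for words in captions:
        for word in words:
            canonical = synonyms.get(word, word)
            if canonical not in vocab:
                vocab[canonical] = len(vocab)
    return vocab


def encode_caption(words: List[str],
                   vocab: Dict[str, int],
                   synonyms: Dict[str, str]) -> List[int]:
    """Convert a segmented caption into its index array."""
    return [vocab.get(synonyms.get(w, w), vocab[UNK]) for w in words]


# Assumption: segmentation has already been done upstream.
captions = [["farmer", "waters", "maize"], ["grower", "waters", "corn"]]
synonyms = {"grower": "farmer", "corn": "maize"}

vocab = build_vocabulary(captions, synonyms)
indices = encode_caption(captions[1], vocab, synonyms)
```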
Keywords/Search Tags:

video retrieval, image caption, word segmentation, deep learning

Related items

1	Research On The Integrated Processing Technology Of Sentence Segmentation And Lexical Analysis Of Ancient Texts Based On Deep Learning
2	Research In Content-Based Movie Video Retrieval And Automated Video Abstract System
3	Research And System Implementation Of Thangka Image Retrieval Algorithm Based On Deep Learning
4	Research On Music Information Retrieval Algorithm Based On Deep Learning
5	Research On Automatic Recognition Of Braille Based On Deep Learning
6	Appliance Of Deep Learning In Music Information Retrieval
7	Vocabulary Learning Through Viewing L2 Videos:the Effects Of Two Enhancement Techniques
8	Research On The Methods Of Ancient Chinese Word Segmentation And Part-of-speech Tagging
9	Research On Portrait Relief Modeling From Single Image
10	Research On Automatic Texts Segmentation And Word Segmentation For Ancient Chinese Texts