Study On Semantics-embedded Deep Hashing For Multi-label Video Retrieval

Posted on:2022-11-03

Degree:Master

Type:Thesis

Country:China

Candidate:L Cao

Full Text:PDF

GTID:2568306500950469

Subject:Computer Science and Technology

Abstract/Summary:

With the popularity of portable mobile devices and the maturity of network transmission technology, the volume of video data has grown massively, and large-scale video retrieval is in heavy demand in the big-data era. Deep hashing is currently the most effective technique for this retrieval task because of its low storage and time cost. Most existing video hashing methods are derived from image hashing methods. They usually treat a video as a continuous image sequence and approximate the video features by fusing frame-level image features. However, an important difference between images and videos is temporal information, which has a great effect on retrieval performance. There are three main reasons for the unsatisfactory performance: 1) Some hashing methods ignore the temporal information of video, an important feature that distinguishes video from images, which leads to inadequate exploration of video features; 2) Most hashing methods give equal weight to all frames in their learning models, neglecting the fact that the content of a video is often determined by a few key frames; 3) Compared with a single label, multiple labels carry richer semantic information, and multi-label video retrieval is more challenging. Following the single-label definition of similarity ignores both the similarity ranking of video pairs with multiple labels and the influence of category association on similarity, which results in a hard measure that cannot reflect the real distance between two videos.

In this thesis, a novel semantics-embedded deep hashing method for multi-label video retrieval is proposed to solve these problems. First, a hybrid attention module is integrated into the basic CNN+LSTM hashing network for video feature extraction and hash code learning. The attention module consists of a self-attention block and a relation-attention block, which learn weights for different frames. Second, a semantics-embedding soft similarity is defined, which employs a GCN to learn both the instance and semantic associations. Experiments and comparisons conducted on multiple video datasets show that the proposed method achieves significantly higher performance than competing methods on the multi-label video retrieval task.
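
The contrast between a hard and a soft similarity for multi-label videos, and the Hamming ranking that hash-based retrieval relies on, can be illustrated with a minimal sketch. This is not the thesis's method: the abstract defines its soft similarity with a GCN, whereas the sketch below simply assumes cosine similarity between multi-hot label vectors as a stand-in. All function names are hypothetical and chosen for illustration only.

```python
from math import sqrt


def hard_similarity(labels_a, labels_b):
    """Single-label style rule: 1 if the two multi-hot vectors share any label, else 0."""
    return 1 if any(a and b for a, b in zip(labels_a, labels_b)) else 0


def soft_similarity(labels_a, labels_b):
    """Assumed stand-in for a soft similarity: cosine of the multi-hot label vectors.

    Returns a value in [0, 1] that grows with the fraction of shared labels.
    """
    dot = sum(a * b for a, b in zip(labels_a, labels_b))
    norm_a = sqrt(sum(a * a for a in labels_a))
    norm_b = sqrt(sum(b * b for b in labels_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a video without labels is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length hash codes differ."""
    return sum(x != y for x, y in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by ascending Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Two videos sharing one of two labels: the hard rule cannot tell this pair
# apart from an exact label match, while the soft score can.
video_a = [1, 1, 0]
video_b = [1, 0, 0]
print(hard_similarity(video_a, video_b), hard_similarity(video_a, video_a))
print(soft_similarity(video_a, video_b), soft_similarity(video_a, video_a))

# Retrieval over toy 4-bit codes with entries in {-1, +1}.
database = [[-1, -1, 1, 1], [1, 1, -1, -1], [1, -1, -1, -1]]
print(rank_by_hamming([1, 1, -1, -1], database))
```

Under the hard rule, any pair with one label in common counts as fully similar, which is the ranking problem the abstract describes. A graded score gives the hash-learning objective a target that reflects how many labels a pair actually shares.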

Keywords/Search Tags:

video retrieval, deep hashing, multi-label learning, soft similarity, semantic-embedding

Related items

1	Research Of Multi-label Cross-modal Semantic Hashing Image-text Retrieval
2	Deep Joint Semantic-embedding Hashing
3	Research On Cross-modal Retrieval Method Based On Deep Semantic Hashing
4	Research On Near-duplicate Video Retrieval And Cross-domain Sentiment Classification Based On Embedding Learning
5	Clothing Image Retrieval Based On Deep Learning For Extract Feature
6	Research On Deep Hashing Algorithms For Image/Crossmodal Retrieval
7	Research On CT Image Retrieval Method Of Pulmonary Nodule Based On Deep Learning
8	Cross-modal Retrieval Research Based On Correlation Analysis And Structure Preserving
9	Deep Hashing For Multi-Label Medical Image Retrieval
10	Coupled-hashing For Cross-modal Retrieval