Research On Visual Language Model For Behavior Analysis And Its Application In Instrument Field

Posted on: 2023-06-09  Degree: Master  Type: Thesis
Country: China  Candidate: J Z Gao  Full Text: PDF
GTID: 2542306917479124  Subject: Engineering
Abstract/Summary:
Rapid progress in deep learning has brought the convenience of artificial intelligence into everyday life. Now that computers can understand single-modal information such as natural language and images, people are eager for them to understand complex cross-modal information as well. As a result, a variety of cross-modal models have emerged and achieved excellent results on cross-modal tasks. In recent years, using deep learning for behavior research has become a new trend, and the success of cross-modal models has brought new opportunities and challenges to this research. Existing behavior research focuses mainly on behavior recognition, user behavior analysis, and behavior capture, and lacks analysis of complex behaviors. Meanwhile, behavior recognition in deep learning often uses only the single-modal information of images. To address this, this thesis studies the task of cross-modal behavior analysis and a visual language model for behavior analysis, implements a prototype system of the model using cross-modal deep learning, natural language processing, and related technologies, and applies it to the field of instrumentation. The main work of this thesis is as follows:

First, this thesis studies the task of cross-modal behavior analysis and constructs a behavior data set suitable for a visual language model. The characteristics of human behavior are studied, and behaviors are divided by complexity into broad standard, high standard, and comprehensive standard behaviors. Based on this classification and on specific cases, a cross-modal visual language behavior analysis data set is constructed, and the standard, redundant, erroneous, and offset sub-behaviors in the data set are analyzed and preset. A behavior library is then built to store the behaviors and their sub-behaviors, and several evaluation indicators are proposed to measure the quality of a behavior.

Second, this thesis designs and implements a visual language model for behavior analysis. Behavior analysis schemes are designed for the different types of behavior, and the model is developed according to these schemes. The model consists mainly of a behavior processing module based on contrastive learning and a behavior matching and evaluation module. The behavior processing module is built on the contrastive language-image pre-training (CLIP) model and is used mainly to discriminate behavior categories and to extract and screen sub-behaviors. Feature information is added to the input behavior text to increase the share of information carried by the language modality, and fine-tuning experiments are carried out on the module to further improve its performance. During sub-behavior screening, a non-maximum suppression strategy suppresses similar and redundant images detected for the same sub-behavior. The behavior matching and evaluation module applies a different matching strategy to each behavior category, then analyzes and evaluates the matched behavior. Finally, experiments illustrate the feasibility and effectiveness of the behavior analysis model in the general domain.

Third, this thesis applies the visual language model for behavior analysis to the field of instrumentation. It analyzes the importance of behavior analysis in this field and builds a behavior analysis data set and behavior library according to the characteristics of the field. On this basis, a prototype system of the visual language model for behavior analysis is implemented, providing real-time behavior detection and assisted behavior analysis in the instrument domain. Finally, a specific case demonstrates the effectiveness of the model.
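The abstract names two mechanisms in the behavior processing module without giving details: CLIP-style matching between images and behavior text, and non-maximum suppression over frames detected for the same sub-behavior. The following is a minimal sketch of how these two steps might look, not the thesis's implementation. The function names, the temperature value, the use of cosine similarity between frame embeddings as the redundancy criterion, and the 0.9 threshold are all assumptions made for illustration. The embeddings are plain lists of floats standing in for the output of a real image or text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_text(image_vec, text_vecs, temperature=0.07):
    """Score one image embedding against several text embeddings.

    Returns a softmax over temperature-scaled cosine similarities, which is
    the usual contrastive (CLIP-style) way to pick the best matching text.
    The temperature value is an assumption, not taken from the thesis.
    """
    logits = [cosine(image_vec, t) / temperature for t in text_vecs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def suppress_redundant_frames(frames, sim_threshold=0.9):
    """Greedy non-maximum suppression over frames of one sub-behavior.

    frames: list of (frame_id, score, embedding) tuples, where score is the
    frame's match score for the sub-behavior (hypothetical layout).
    Frames are visited from highest to lowest score; a frame is kept only if
    its embedding is not too similar to any frame already kept.
    """
    kept = []
    for frame in sorted(frames, key=lambda f: f[1], reverse=True):
        if all(cosine(frame[2], k[2]) < sim_threshold for k in kept):
            kept.append(frame)
    return [f[0] for f in kept]
```

For example, calling `suppress_redundant_frames` on three frames where the second is nearly identical to the first keeps only the first and third. Standard non-maximum suppression in object detection measures overlap between bounding boxes; embedding similarity is substituted here because the abstract describes suppressing similar images rather than overlapping boxes.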
Keywords/Search Tags:Behavior analysis, Cross-modality, Visual language, Deep learning, Instrumental domain