| Smart farming is a key development direction for large-scale farming and intensive management, and is crucial for promoting the upgrading of the livestock and poultry farming industry. Using visual perception technology to understand pig posture and behavior, and to automatically recognize breeder behavior, is key to improving the precision of pig feeding and the level of intelligent epidemic prevention. This article studies visual perception technology for smart agriculture, and its main content is as follows. Firstly, a multi-modal semantic learning method for pig pose and action understanding is proposed. This method consists of a global feature learning module based on YOLOX, a local feature learning module based on an improved HRNet, and a topology semantic learning module based on GCN. The global feature learning module extracts the global visual features of individual pigs from multi-view pig images. The local feature learning module performs multi-granularity visual semantic learning and fusion for individual pigs. The topology semantic learning module analyzes fine-grained topology semantics in pig posture, including the correlation semantics between key points. These three modules can be used independently or connected in series to form end-to-end pig pose estimation and action recognition methods. Secondly, a discriminative semantic learning method for human action recognition is introduced. This skeleton-based human action recognition method consists of three core units and two learning modules: the global topology graph convolution unit, the discriminative topology graph convolution unit, the multi-scale temporal convolution unit, the multi-topology semantic learning module, and the temporal semantic reinforcement learning module. Experimental results show that the global topology graph convolution can extract the global semantics of human skeletons. The discriminative topology graph convolution can further 
learn the discriminative semantics of each human skeleton. The multi-scale temporal convolution can learn the spatio-temporal association semantics of human skeleton sequences. Alternating the multi-topology semantic learning module and the temporal semantic reinforcement learning module further improves the method's ability to learn high-level topology semantics. Finally, results with ensemble learning and multi-stream skeleton features verify the robustness and generalization of the presented model. The experimental results show that the technology proposed in this article has certain advantages in applications such as pig pose and action understanding and human action recognition, and has important theoretical and practical value for intelligent breeding and related fields. |
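
To make the idea of learning topology semantics over key points more concrete, the following is a minimal sketch of a single generic graph convolution layer applied to a keypoint skeleton. It is not the method proposed in this article: the five-joint chain, the coordinates, the identity weight matrix, and the helper names `normalized_adjacency`, `matmul`, and `graph_conv` are all hypothetical and chosen only for illustration. The sketch assumes the commonly used symmetric normalization of the adjacency matrix with self-loops.

```python
import math


def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected keypoint graph."""
    # Start from the identity so every joint keeps its own features (self-loop).
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(p, q):
    """Plain nested-list matrix product, to keep the sketch dependency-free."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


def graph_conv(x, a_hat, w):
    """One graph convolution layer: ReLU(A_hat @ X @ W)."""
    z = matmul(matmul(a_hat, x), w)
    return [[max(v, 0.0) for v in row] for row in z]


# Hypothetical skeleton: five key points connected in a chain.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(5, edges)

# Hypothetical 2-D keypoint coordinates used as input features.
x = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.5], [3.0, 1.0], [4.0, 1.0]]

# Identity weights, so the output shows only the neighborhood aggregation.
w = [[1.0, 0.0], [0.0, 1.0]]

h = graph_conv(x, a_hat, w)
```

Each output row mixes a key point's features with those of its directly connected neighbors, which is the basic mechanism by which a graph convolution captures correlations between key points. A real model would learn `w` and stack several such layers.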