| Fish counting is one of the key issues in aquaculture and trading. Based on three purpose-built grass carp counting datasets, this paper establishes deep learning fish counting models that estimate the number of fish in images and videos, laying a technical foundation for automatic fish counting in the aquaculture industry. Taking grass carp as an example, the main research work in this paper covers the following four aspects: (1) Grass carp datasets were created. We successively collected grass carp data and constructed three datasets: the Grass Carp Counting Dataset (GCD), the Dense Grass Carp Counting Dataset (DGCD), and the Grass Carp Video Counting Dataset (GVCD). GCD and DGCD each contain 2000 grass carp images, while GVCD consists of 8 video clips of different lengths. The methods proposed in this paper were trained and validated on these three datasets. (2) A grass carp counting model, VSPNet, based on an improved VGGNet model is proposed to address the low counting accuracy caused by changes in the morphology and position of the fish. Firstly, an EESP module is added after the 10th convolutional layer of the VGGNet model; building on the features extracted by the first 10 convolutional layers, it further enhances deep semantic information, enlarges the receptive field, and improves recognition accuracy under morphological and positional changes. Secondly, the last three fully connected layers, the last two pooling layers, and the last three convolutional layers of the VGGNet model are removed to improve counting efficiency while maintaining model performance. Finally, a mixed loss function combining a mean squared error loss and a correlation loss is proposed to improve the quality of the generated density maps and the counting accuracy. Tested on GCD, the VSPNet model achieved an MAE of 5.36 and an RMSE of 6.57, outperforming the RCV model and the DSNet model. The experiment confirmed that this method effectively improves counting accuracy when the position and morphology of grass carp change. (3) A network model for dense grass carp counting, FM-P2PNet, based on an improved P2PNet network is proposed to address the serious loss of counting accuracy caused by occlusion among densely packed grass carp. Firstly, a feature alignment module is added before the upsampled features are combined with the local features, so that the information in the local feature map is used to adjust the upsampled feature map and improve recognition accuracy in densely occluded scenes. Secondly, an FCM module is added after feature fusion to focus on the dense regions of grass carp in the image, extract the features of grass carp in the dense state, and further improve counting accuracy. Finally, the improved model was evaluated on DGCD, where it achieved an MAE of 10.97 and an RMSE of 14.64, outperforming the RCV model and the DSNet model. The experiment confirmed that this method effectively improves counting accuracy under dense and occluded conditions. (4) A grass carp counting model based on YOLOv5 and StrongSORT multi-target tracking is proposed. To improve the recognition accuracy of grass carp in video, this work builds on the YOLOv5 detection algorithm and replaces the DeepSORT tracking algorithm with StrongSORT to better extract the appearance and morphological features of grass carp. Firstly, the YOLOv5 algorithm detects the target objects in each video frame and outputs the positions of the corresponding detection boxes along with other information. Secondly, the NSA Kalman filter in StrongSORT predicts the position of each grass carp in the next frame, the association between predicted boxes and detection boxes is computed, the target trajectory states are updated, and an ID is assigned to each tracked target. Finally, the number of grass carp is counted according to the assigned IDs. On GVCD, the method achieves an mAP of 96.34%, a recall of 91.54%, and a MOTA of 81.38%, and it counts grass carp in video in real time. |
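
The MAE and RMSE figures reported for GCD and DGCD compare predicted counts with ground-truth counts over a set of test images. As a minimal sketch, assuming the standard definitions of both metrics and the usual convention that the predicted count of a density-map model is the sum of its density map (the abstract does not spell out either), they could be computed as follows. The function names and toy numbers are hypothetical and are not taken from the paper.

```python
import math


def count_from_density_map(density_map):
    """Estimated count = sum of all values in a 2-D density map (assumed convention)."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(pred_counts, true_counts):
    """MAE: average absolute difference between predicted and true counts."""
    n = len(true_counts)
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n


def root_mean_squared_error(pred_counts, true_counts):
    """RMSE: square root of the average squared count difference."""
    n = len(true_counts)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts)) / n)


# Toy 2x2 density map (illustrative values only): it integrates to a count of 2.0.
toy_map = [[0.25, 0.75],
           [0.5, 0.5]]
toy_count = count_from_density_map(toy_map)

# Hypothetical per-image counts for three test images.
predicted = [52.0, 47.5, 61.0]
ground_truth = [50, 50, 60]

mae = mean_absolute_error(predicted, ground_truth)
rmse = root_mean_squared_error(predicted, ground_truth)
print(f"count={toy_count}, MAE={mae:.3f}, RMSE={rmse:.3f}")
```

Because RMSE squares each error before averaging, it penalizes occasional large miscounts more heavily than MAE, which is one reason the two metrics are usually reported together.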