| Wheat is the world's most widely grown food crop, with the largest planting area and the largest yield, and global wheat consumption reached 754 million tons in 2021. Timely estimation of wheat yield has a significant impact on crop production, grain prices, and food security. Counting the number of ears per unit area is both the main difficulty and the top priority in wheat yield estimation research. At present, manual yield estimation relies on visual assessment by experts, so its accuracy cannot be guaranteed. Sampling-based estimation, in which part of the area is sampled and then counted and weighed by hand, is time-consuming and labor-intensive. With the development of computer vision, many studies have focused on counting the wheat ears in a single image as a basis for yield estimation. Such studies use the feature-learning ability of convolutional neural networks to extract wheat ear features, train a model on a large amount of data, and then count the wheat ears in an image, providing reference data for subsequent yield estimation. However, some existing studies on wheat ear counting use general-purpose counting networks without modification and do not optimize for the varying scales and densities of wheat ears, so their accuracy needs to be improved. Therefore, this thesis takes field images of wheat ears as its research object and makes targeted improvements to the original convolutional neural network models to achieve accurate wheat ear counting. The specific research contents are as follows: (1) The impact of data preprocessing strategies on model training and final performance is studied. For the public Global Wheat Head Detection dataset (GWHD dataset) and the high-resolution wheat dataset (WED dataset), the study applies not only conventional data augmentation methods (rotation, mirroring, contrast adjustment, etc.) but also targeted enhancement of images from
different sources. For the GWHD dataset, linear filtering with a convolution kernel chosen to suit wheat images is applied to the compressed, poor-quality originals, suppressing image noise and improving image quality. For the WED dataset, Gaussian blur is applied to the high-resolution (4000×6000) original images to remove redundant detail and reduce the difference from real images, which addresses premature overfitting during training and effectively improves the generalization ability of the model. (2) An optimized object detection algorithm is studied. Most existing object detection networks are designed for general-purpose object detection and extract features of multi-scale wheat ears poorly. They also use a single threshold to separate positive and negative samples, which makes the predicted bounding boxes for dense wheat ears inaccurate. Therefore, this study proposes an object detection method based on Aug-FPN and cascaded IoU thresholds. Aug-FPN adaptively pools the features before dimension reduction, retaining low-level information without loss and extracting features of wheat ears at different scales. With cascaded IoU training, the bounding boxes output by a low-threshold detector serve as the input to a higher-threshold detector, which gradually improves the localization of wheat ears and addresses the missed detections caused by occlusion among dense wheat ears. Compared with the original FPN network, the improved wheat ear detection model increases precision (P), recall (R), and average precision (AP) by 7.7, 5.5, and 6.8 percentage points, respectively, and outperforms other mainstream object detection models. (3) An optimized algorithm for counting wheat ears by density map regression is studied. The original crowd counting algorithm treats the center point of the human head as the target. The head is round and clearly distinct from other parts of the body, while the
wheat ears and leaves are similar in color and irregular in shape, which makes the density regression task more challenging. Therefore, at the front end of the network, this study uses VGG19 to extract deeper information, combines contextual semantic information, and applies pyramid pooling to fuse the features obtained with receptive fields of different sizes; this captures the relationship between key points and their surroundings, and the result is passed to the back end of the network. At the back end, stacked convolutions with multiple dilation rates ensure that each convolution covers wheat ear information at multiple scales, producing a high-quality density map from which the number of wheat ears is accurately regressed. The coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) of the improved model reach 0.95, 6.1, and 4.78, respectively. For the actual wheat images taken by drone, the manual count is 3880 wheat ears and the model estimate is 3871, an overall error rate of only 0.23%, with a coefficient of determination of 0.89. In summary, the improved convolutional neural network methods proposed in this thesis can effectively count wheat ears in field environments. The improved counting methods based on object detection and density map regression maintain high accuracy and strong robustness, lay a theoretical foundation for future wheat yield estimation, and have both research significance and practical application value. |
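
The evaluation metrics quoted above can be illustrated with a minimal Python sketch. It assumes that R² is computed as 1 - SS_res/SS_tot over per-image counts; the abstract does not state which definition the thesis uses. The function names `counting_metrics` and `overall_error_rate` and the small per-image example lists are hypothetical; only the drone totals 3880 and 3871 come from the text.

```python
import math


def counting_metrics(true_counts, pred_counts):
    """Return (R^2, RMSE, MAE) for per-image wheat ear counts.

    Assumption: R^2 = 1 - SS_res / SS_tot, which may differ from the
    definition used in the thesis (e.g. squared Pearson correlation).
    """
    n = len(true_counts)
    if n == 0 or n != len(pred_counts):
        raise ValueError("count lists must be non-empty and of equal length")

    errors = [p - t for t, p in zip(true_counts, pred_counts)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(true_counts) / n
    ss_tot = sum((t - mean_true) ** 2 for t in true_counts)
    # R^2 is undefined when every true count is identical
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, rmse, mae


def overall_error_rate(manual_total, predicted_total):
    """Absolute error of the total count, as a percentage of the manual total."""
    return abs(predicted_total - manual_total) / manual_total * 100.0


# Drone-image totals reported in the abstract
print(f"{overall_error_rate(3880, 3871):.2f}%")

# Hypothetical per-image counts, for illustration only
r2, rmse, mae = counting_metrics([10, 20, 30], [12, 18, 33])
print(r2, rmse, mae)
```

Because per-image overcounts and undercounts partly cancel in the total, a very small overall error rate can coexist with a lower per-image R², which is consistent with the 0.23% and 0.89 figures reported for the drone images.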