| Tomato is one of the most widely cultivated vegetables in the world. Most countries in Europe and America, as well as China and Japan, use facility cultivation. At present, identifying the flowering and fruiting stages of tomato relies mainly on manual observation, which cannot meet the need for rapid, real-time detection. Tomato picking is also largely manual, which is labor-intensive and inefficient. Tomato picking robots can save labor and improve production efficiency, which is of great significance for factory-scale tomato production. This paper takes tomato flowers and fruits at different growth stages in a glass greenhouse as the research object and proposes an image-based method for identifying and detecting them. The main research contents are as follows: (1) A detection method for tomato flowers and fruits in a greenhouse environment based on an improved YOLOv4 was studied. To improve detection accuracy while preserving real-time efficiency, the Convolutional Block Attention Module (CBAM) was integrated into the YOLOv4 network to integrate the target feature information of tomato flowers and fruits at different growth stages and reduce the interference of complex backgrounds. On the test set of tomato flowers and fruits, the average precision of the YOLOv4-CBAM model was 90.12%, 95.11%, 92.60%, 97.34%, 91.13% and 97.49% for the bud, flowering, fruit expansion, green ripening, half-ripening and mature stages, respectively; the mean average precision was 93.97% and the average detection time was 16.54 ms. Compared with the YOLOv4 and YOLOv4-SE models, the mean average precision of the YOLOv4-CBAM model in detecting tomato flowers and fruits at different growth stages was 4.46% and 2.26% higher, respectively. (2) A segmentation method for tomato fruits of different maturities in a greenhouse environment based on an improved Mask
R-CNN was studied. To reduce network computation and improve accuracy, the Cross Stage Partial Network (CSPNet) was integrated with the Residual Network (ResNet) in the Mask R-CNN network, so that cross-stage splitting and cascading strategies reduce the repeated feature information in back propagation. On a test set of tomato fruits of different maturities, the improved Mask R-CNN model with CSP-ResNet50 as the backbone network achieved a mean average precision of 95.45%, an F1-score of 0.912, and an average segmentation time of 0.658 s. Its mean average precision was 16.44%, 14.95%, and 2.29% higher than that of the Pyramid Scene Parsing Network (PSPNet), DeepLab v3+, and Mask R-CNN with ResNet50 as the backbone network, respectively, and its average segmentation time was 1.98% shorter than that of Mask R-CNN with ResNet50 as the backbone network. (3) Finally, the improved Mask R-CNN model with CSP-ResNet50 as the backbone network was deployed on a picking robot to verify its recognition and segmentation of tomato fruits of different ripeness in a large glass greenhouse. The recognition accuracy of the model was 90%. The model recognized tomato fruits of different maturities well in the greenhouse environment and can provide a basis for the precise operation of tomato picking robots. |
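
The detection figures reported in part (1) can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' evaluation code: it assumes the mean average precision is the unweighted mean of the six per-stage average precision values, and that the average detection time is per image. The names `ap_by_stage`, `mean_average_precision` and `images_per_second` are hypothetical.

```python
# Per-stage average precision (%) of the YOLOv4-CBAM model, as reported in the abstract.
ap_by_stage = {
    "bud": 90.12,
    "flowering": 95.11,
    "fruit expansion": 92.60,
    "green ripening": 97.34,
    "half-ripening": 91.13,
    "mature": 97.49,
}


def mean_average_precision(ap_values):
    """Unweighted mean of per-class AP values (assumed definition of mAP here)."""
    values = list(ap_values)
    if not values:
        raise ValueError("at least one AP value is required")
    return sum(values) / len(values)


def images_per_second(ms_per_image):
    """Convert an average per-image latency in milliseconds to throughput."""
    if ms_per_image <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / ms_per_image


map_percent = mean_average_precision(ap_by_stage.values())
throughput = images_per_second(16.54)  # 16.54 ms is the reported average detection time

print(f"mean of per-stage AP: {map_percent:.3f}%")
print(f"approximate throughput: {throughput:.1f} images/s")
```

Under these assumptions the six values average to about 93.965%, which agrees with the reported 93.97% after rounding, and a 16.54 ms detection time corresponds to roughly 60 images per second.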