As the area under fruit cultivation in China continues to expand and output continues to rise, the labor required for fruit production is increasing. At present, fruit operations are carried out mainly by hand, and manual work is time-consuming, labor-intensive, subjective, and error-prone. In recent years, deep learning has brought fruit recognition and segmentation technology into a new era and has overcome some shortcomings of traditional methods, but problems such as low recognition efficiency and poor robustness remain and make engineering application difficult. To address these problems, this paper analyzes deep learning networks, studies fruit recognition and segmentation algorithms, and improves the network models for different tasks to achieve better fruit recognition and segmentation performance. The main research work is as follows:

1) Some traditional object detection algorithms have a large number of model parameters, which leads to high training time overhead and slow detection speed. At the same time, traditional detection algorithms have low accuracy on small fruit targets and lack robustness under different interference conditions (e.g., branch and leaf occlusion, fruit overlap, and changes in light intensity). To this end, the YOLOX algorithm is chosen as the baseline network, and its backbone feature extraction network is replaced with a DenseNet network to enhance feature reuse and reduce the computational cost of the network; an attention mechanism is then added to enhance deep feature fusion. Comparative experiments show that the proposed YOLOX-Dense-CT method achieves higher detection accuracy, faster detection speed, and stronger robustness.

2) Traditional machine learning-based fruit classification algorithms mainly extract single features, and the extracted features are
input to machine learning algorithms for classification, but classification efficiency is generally low. In addition, although classification models based on convolutional neural networks achieve better classification performance, higher accuracy requires significant training time overhead, and such models are limited in modeling global information. To address these problems, this paper studies the Vision Transformer and Swin Transformer networks, directly extracting deep features with pretrained Transformer models and then feeding those features into a selected classifier for recognition. The proposed classification method, which combines Transformer networks with support vector machine and multilayer perceptron classifiers, achieves higher accuracy with lower time overhead. Experimental results show that the proposed method outperforms traditional methods on several metrics.

3) To address the poor generalization ability and low segmentation efficiency of traditional segmentation algorithms, this paper uses Mask R-CNN as the baseline network for the fruit segmentation task and selects ResNet-50 as the backbone feature extraction network. Some standard convolutions in ResNet-50 are replaced with deformable convolutions to adapt to fruit targets with different morphologies, and the number of layers of the feature extraction network is modified to strengthen feature extraction. The experimental results show that the improved segmentation model achieves better generalization ability and higher segmentation accuracy.

This study not only addresses shortcomings of existing fruit detection, classification, and segmentation algorithms, but also improves the overall performance of fruit recognition and segmentation algorithms. It provides a theoretical reference for developing fruit operation robots with recognition and segmentation functions, and is also of great significance for promoting the development
of the fruit industry toward intelligent production.
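
The feature reuse attributed to the DenseNet backbone in item 1) comes from dense connectivity: each layer in a dense block receives the concatenation of the block input and the outputs of all earlier layers. The minimal sketch below illustrates only the resulting channel arithmetic; it is not the YOLOX-Dense-CT implementation. The function names are hypothetical, and the example values (64 input channels, growth rate 32, 6 layers) are assumptions chosen for illustration rather than settings reported in this work.

```python
from typing import List


def dense_block_input_channels(in_channels: int, growth_rate: int, num_layers: int) -> List[int]:
    """Channels seen by each layer of a dense block (hypothetical helper).

    Layer i (0-based) receives the block input concatenated with the
    outputs of the i earlier layers; each earlier layer contributes
    `growth_rate` feature maps.
    """
    return [in_channels + i * growth_rate for i in range(num_layers)]


def dense_block_output_channels(in_channels: int, growth_rate: int, num_layers: int) -> int:
    """Channels leaving the block: the input plus every layer's output."""
    return in_channels + num_layers * growth_rate


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    print(dense_block_input_channels(64, 32, 6))
    print(dense_block_output_channels(64, 32, 6))
```

Because every layer adds only `growth_rate` new feature maps and reuses the earlier ones, each layer can stay narrow, which is the usual explanation for the lower parameter count of densely connected backbones.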