
Research On Recognition Method Of Vegetable Diseases In Open Environment Based On Multimodal Data And Knowledge Fusion

Posted on: 2024-01-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C S Wang
Full Text: PDF
GTID: 1523306935987809
Subject: Agricultural information technology
Abstract/Summary:
Sustainable development of the vegetable planting industry plays an indispensable role in improving people's livelihoods, maintaining social stability, and promoting public harmony. Outbreaks of vegetable diseases can cause large-scale losses in yield and quality, resulting in irreversible economic damage. Accurate disease identification is therefore the key to effective prevention and control: it minimizes yield losses, reduces pesticide use, safeguards the quality and safety of agricultural products, and supports the healthy and sustainable development of the vegetable industry when outbreaks occur. Traditional identification methods suffer from low efficiency, high subjectivity, and high misdiagnosis rates, and cannot meet the needs of modern agricultural production. Methods that combine image processing and machine learning offer clear advantages over traditional approaches, including rapid detection, high accuracy, and real-time feedback, and have become the development trend of the modern vegetable industry. Existing methods have achieved relative success in limited, restricted settings, but in agricultural scenarios that are complex and variable in both time and space, several problems remain: disease features are "incomprehensive", the decision-making basis is "inexplicable", and domain knowledge is "difficult to integrate". In response to these issues, this dissertation investigates disease identification from three dimensions, namely multimodal joint representation learning, automatic captioning of disease images, and domain knowledge fusion, taking common tomato and cucumber leaf diseases as examples. The main research contents and conclusions are as follows:

1. To address the "incomprehensive" feature problem caused by insufficient use of multimodal information in existing models, diagnostic information beyond the disease image is fed into the model as text, so that vegetable diseases are identified on the basis of a multimodal joint representation. Integrating images and texts allows the two modalities to complement each other. To verify the effectiveness of this approach, two multimodal representation learning models were constructed: one based on probability fusion and one based on feature-space fusion. The former first uses an image classifier to extract image features and a text classifier to extract text features, derives a class-probability matrix for each modality, and then fuses the probability matrices at an appropriate ratio for disease identification; this dual-branch structure with late fusion offers a flexible architecture and low computational complexity. The latter uses a Transformer to map disease images and description texts into a unified feature space, where the image and text features are fused for disease prediction; its advantage is a high degree of integration. Experiments show that both multimodal fusion models identify diseases more accurately than single-modality models based on images or texts alone.
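The abstract gives no implementation details, so the following is only a minimal sketch of the probability-level (late) fusion branch described above, assuming PyTorch and a hypothetical fusion weight alpha for combining the two classifiers' outputs.

```python
import torch
import torch.nn.functional as F

def late_fusion_predict(image_logits, text_logits, alpha=0.6):
    """Probability-level (late) fusion of two single-modality classifiers.

    image_logits / text_logits: (batch, num_classes) outputs of independently
    trained image and text branches; alpha is the image-branch fusion weight
    (an assumption -- the dissertation only says the ratio is chosen suitably).
    """
    p_img = F.softmax(image_logits, dim=-1)  # image-branch probability matrix
    p_txt = F.softmax(text_logits, dim=-1)   # text-branch probability matrix
    return (alpha * p_img + (1.0 - alpha) * p_txt).argmax(dim=-1)

# Example: fuse dummy logits for a batch of 4 samples over 10 disease classes.
pred = late_fusion_predict(torch.randn(4, 10), torch.randn(4, 10))
```

Because each branch is trained independently, the fusion weight can be tuned on a validation set without retraining either classifier, which is what makes this late-fusion design computationally cheap.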
2. To address the "inexplicable" decision basis that arises because deep learning models do not clearly express the relationship between disease features and detection results, a dense captioning method for image features was proposed. The method generates natural-language descriptions of the observed image features to help users understand and judge whether the detection results are reasonable. Based on this idea, a Chinese dense captioning model for vegetable leaf disease images, Veg-DenseCap, was constructed. The model consists of two parts: in the first, a two-stage region-based object detector, Faster R-CNN, extracts features containing visual information about the disease spots; in the second, a language generator, an LSTM, takes the detector's region features as input and generates a descriptive sentence with detailed information about the spots. Experiments show that the sentences generated by Veg-DenseCap are syntactically correct and diverse, and accurately describe the feature information of the leaf diseases. Such semantic descriptions grounded in visual features help users understand the model's decision-making basis, which improves model transparency and builds a trust relationship between the model and its users.
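As a structural illustration of the two-stage pipeline described above, the sketch below pairs an off-the-shelf Faster R-CNN from torchvision (standing in for the dissertation's trained spot detector) with a greedy LSTM caption decoder. All layer sizes, the vocabulary size, and the token ids are illustrative assumptions, not the published Veg-DenseCap configuration.

```python
import torch
import torch.nn as nn
import torchvision

class RegionCaptioner(nn.Module):
    """LSTM decoder that captions one detected disease-spot region."""

    def __init__(self, feat_dim=1024, embed_dim=256, hidden_dim=512, vocab_size=5000):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden_dim)  # region feature -> h0
        self.init_c = nn.Linear(feat_dim, hidden_dim)  # region feature -> c0
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.cell = nn.LSTMCell(embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    @torch.no_grad()
    def forward(self, region_feat, max_len=20, bos_id=1):
        # region_feat: (batch, feat_dim) pooled RoI feature from the detector
        h, c = self.init_h(region_feat), self.init_c(region_feat)
        token = torch.full((region_feat.size(0),), bos_id, dtype=torch.long)
        steps = []
        for _ in range(max_len):
            h, c = self.cell(self.embed(token), (h, c))
            token = self.out(h).argmax(dim=-1)  # greedy decoding
            steps.append(token)
        return torch.stack(steps, dim=1)  # (batch, max_len) caption token ids

# Stage 1: a stock detector standing in for the trained disease-spot detector.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()
# Stage 2: caption each region from its pooled feature vector.
captioner = RegionCaptioner()
caption_ids = captioner(torch.randn(3, 1024))  # 3 regions -> 3 token sequences
```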
3. To address the difficulty of integrating data with knowledge, where existing identification models rely heavily on labeled data while making little use of domain knowledge, a method for diagnosing vegetable leaf diseases that integrates a knowledge graph with deep learning was proposed. First, feature words are extracted from the textual disease descriptions and converted into word vectors via word embedding. Second, entities and relations related to the disease features are extracted from the disease-domain knowledge graph through structured knowledge extraction and converted into low-dimensional continuous vectors via knowledge graph embedding. Finally, the disease feature-word vectors and the related knowledge-entity vectors are fed into a CNN as multi-channel inputs, allowing the model to learn richer disease features by fusing information across channels (a minimal sketch follows this paragraph). Experiments show that adding disease-domain knowledge enables the model to capture more standardized, comprehensive, and direct disease features from the description texts, improving diagnostic accuracy across different disease types. Moreover, the explicit links between disease feature words and knowledge triples form a visible reasoning path on the knowledge graph, which helps users understand the model's decision-making basis and enhances interpretability.
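As referenced above, here is a minimal sketch of the multi-channel idea: a two-channel text CNN whose channels carry the word-embedding vectors and the aligned knowledge-graph entity vectors (produced by, e.g., a TransE-style embedding; that choice, along with all dimensions and filter sizes, is an assumption for illustration).

```python
import torch
import torch.nn as nn

class MultiChannelTextKGCNN(nn.Module):
    """Diagnose from a description sentence using two input channels:
    channel 0 = word embeddings of the disease feature words,
    channel 1 = knowledge-graph embeddings of the entities aligned to them.
    """

    def __init__(self, embed_dim=128, num_classes=10, num_filters=100):
        super().__init__()
        # 2-channel text CNN over a (channel, seq_len, embed_dim) input
        self.convs = nn.ModuleList(
            nn.Conv2d(2, num_filters, kernel_size=(k, embed_dim)) for k in (2, 3, 4)
        )
        self.fc = nn.Linear(num_filters * 3, num_classes)

    def forward(self, word_vecs, entity_vecs):
        # word_vecs, entity_vecs: (batch, seq_len, embed_dim), pre-aligned
        x = torch.stack([word_vecs, entity_vecs], dim=1)  # (batch, 2, seq, dim)
        pooled = [torch.relu(c(x)).squeeze(3).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # disease class logits

# Example: 4 sentences, 30 tokens each, 128-dim word and entity vectors.
model = MultiChannelTextKGCNN()
logits = model(torch.randn(4, 30, 128), torch.randn(4, 30, 128))
```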
To sum up, this dissertation proposes a fused representation learning method for vegetable disease features that exploits the correlation and complementarity between multimodal data to obtain distinguishable disease features from multiple dimensions; the method is verified to improve the accuracy and generalization ability of the disease identification model in an open environment. By using dense captioning to map the relevant features in the image feature space into an understandable natural-language semantic space, users can better understand the model's decision-making basis, which improves the model's credibility. Furthermore, a knowledge fusion method is proposed that increases both the identification accuracy and the interpretability of the model. The research outcomes provide a novel method for disease identification based on multimodal data and domain knowledge, with theoretical significance and practical application value for raising the intelligence level of crop disease identification and promoting the development of intelligent agriculture.

Keywords/Search Tags: Disease identification, Multimodal representation, Image description, Knowledge graph, Knowledge fusion