Research and Application of DNA Sequence Classification Based on Improved Support Vector Machine

Posted on: 2022-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: J J Guo
GTID: 2480306347956039
Subject: Master of Engineering
Abstract/Summary:
Since the 1950s, people have been aware of the important role of proteins. DNA guides protein synthesis and determines how proteins are expressed, and the interaction between the two reflects the characteristics and differences of life activities. The purpose of DNA sequence classification is to predict the category to which a DNA sequence belongs. Because the type of a DNA sequence determines gene structure and function, classification can identify whether an unknown sample belongs to a new species, an alien species or a well-known species. This is of great significance for discovering new species and protecting endangered species, and it also has important research value in medicine, environmental science and other fields.

With the development of gene projects and the gradual enrichment of bioinformatics knowledge, the amount of biological data is growing rapidly. An important problem is how to extract effective information from massive DNA data and predict sequence composition and structural characteristics, so as to find the laws of gene composition and accurately reflect biological characteristics and functions.

This paper proposes an SVM classification model based on an improved particle swarm optimization (PSO) algorithm to address the insufficient accuracy and generalization ability of DNA sequence classification. The improved PSO algorithm is optimized in several respects, including the inertia weight factor and the acceleration factor, and it introduces an iterative strategy that lets a single particle jump out when it falls into a local optimum. The improved PSO algorithm has faster convergence speed and better global search ability than the existing PSO algorithm. Finally, this paper evaluates the SVM classification model based on the improved PSO algorithm. Experimental results show that, compared with the traditional SVM, JAB-SVM and LS-SVM classification models, the SVM classification model based on the improved PSO algorithm has higher accuracy and stronger generalization ability in DNA sequence classification.

For DNA sequence classification, this paper extracts sequence features from a statistical point of view: it counts the frequencies of single bases, double bases and triple bases, fuses them into a 35-dimensional initial feature vector, and then applies the PCA dimension reduction algorithm to obtain a 10-dimensional feature vector for the classification test.

This paper also develops a DNA sequence classification visualization system based on SmartBI technology, including DNA sequence data integration, classification model selection, display of classification results and other functions. The system encapsulates the DNA sequence classification process and decouples biology from informatics, so that biologists do not need to attend to the computational processing and can focus on the study of biological information.
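The statistical feature extraction mentioned in the abstract (frequencies of single, double and triple bases) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `kmer_frequencies` and `feature_vector` are hypothetical, and the sketch computes all 4 + 16 + 64 = 84 possible frequencies, whereas the abstract reports a 35-dimensional fused vector without saying which features are kept. The subsequent PCA reduction to 10 dimensions is not shown.

```python
from itertools import product

BASES = "ACGT"


def kmer_frequencies(seq, k):
    """Return relative frequencies of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of length k over the sequence.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows with ambiguous symbols (e.g. N) are skipped.
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def feature_vector(seq):
    """Concatenate 1-, 2- and 3-base frequencies into one list of 84 values."""
    features = []
    for k in (1, 2, 3):
        features.extend(kmer_frequencies(seq, k))
    return features
```

For example, `feature_vector("ACGTACGT")` returns a list of 84 values whose first four entries are the frequencies of A, C, G and T. A vector of this kind could then be passed to a dimension reduction step such as PCA before classification.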
Keywords/Search Tags: DNA sequence classification, improved particle swarm optimization, PCA, SVM, SmartBI