
Application of Machine Learning Algorithms in Gene Function Prediction and Rubber Mixing Control

Posted on: 2015-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: S Chen
Full Text: PDF
GTID: 2271330452469771
Subject: Chemical Process Equipment
Abstract/Summary:
Machine learning algorithms have a wide range of applications in biological identification and chemical control. This thesis presents new, improved algorithms based on partial least squares that better solve three specific problems in these fields: identification of short coding sequences in prokaryotes, recognition of human snoRNAs, and online measurement of rubber hardness.

Significant efforts have been made to address the problem of identifying short genes in prokaryotic genomes. However, most known methods are not effective in detecting short genes. Because short DNA sequences contain limited information, it is very difficult to accurately distinguish between protein-coding and non-coding sequences in prokaryotic genomes. We have developed a new Iteratively Adaptive Sparse Partial Least Squares (IASPLS) algorithm as the classifier to improve the accuracy of the identification process. In comparison with the GeneMarkS, Metagene, Orphelia, and Heuristic Approach methods, our model achieved the best prediction performance in identifying short prokaryotic genes. The experiments also showed that IASPLS improves identification accuracy in comparison with other widely used classifiers, namely Logistic, Random Forest (RF), and K-nearest neighbors (KNN). The accuracy of IASPLS was improved by 5.90% or more in comparison with the other methods.

SnoRNAs are small RNA molecules (60~300 nt). We propose a new method, ESDA, to distinguish snoRNAs from other RNAs in the human genome. ESDA is a user-friendly and practical method, and comparison with other algorithms confirms its validity. Compared with SnoReport, the proposed approach not only improves precision but also has the advantages of simplicity and computational speed. Furthermore, we compare ESDA with other widely used algorithms and classifiers: Random Forest (RF), Distance Weighted Discrimination (DWD), and support vector machine (SVM). The highest improvement in accuracy was 25.1%.

Monitoring the rubber production process is complicated by time-varying influences. This thesis proposes an online quality monitoring and forecasting method for rubber hardness, with model updating, based on kernel partial least squares. The method yields precise rubber hardness values and helps improve rubber quality. Combined with a monitoring alarm device, it can also reduce hidden safety hazards.
Keywords/Search Tags: prokaryotic genes, machine learning, coding sequence, classification algorithm, snoRNAs, rubber mixing