Life Language Processing: Exploration Of Gene Sequence Splicing Method Based On Deep Learning And Natural Language Processing

Posted on: 2022-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhou
Full Text: PDF
GTID: 2518306494486404
Subject: Computer technology
Abstract/Summary:
Current gene sequence splicing techniques are based on fragment overlap, sometimes supplemented with simple error-correction schemes. However, these simple interpretations of the information remain inadequate compared with the complex information contained in gene sequences. Since designers need to control the entire transcription and translation process for synthetic genes, the complexity cannot be underestimated.

Natural language processing has benefited from deep learning in recent years, with significant progress in semantic understanding, dialogue systems, and machine translation. Natural language processing and gene sequence processing share some similarities in feature extraction and mathematical models. It is therefore possible to adopt some algorithms from deep learning and natural language processing for gene sequences, as a step toward life language processing.

In bioinformatics, existing research based on deep learning mostly concerns image or protein sequence information; few studies are based on gene sequence information. The reason is that gene sequence information sits at a relatively low, infrastructure level, and gene sequences are much longer than protein sequences. It is therefore more difficult to extract features from gene sequence data in a concise way, as can be done with peptide structure.

This thesis takes gene splicing as its starting point and completes the whole procedure of dataset design, feature engineering, model analysis, and evaluation. The experiments use NCBI gene data, and the results are discussed, adapted, and optimized according to the characteristics of gene sequences. The proposed algorithm does not rely on gene fragment overlap; it relies only on semantic information. The splicing-point accuracy is over 90%, and the algorithm is applicable when the similar classification clusters are within the same genus.
Keywords/Search Tags:Deep learning, Natural language processing, Life language processing, Gene splicing