
Functional Prediction Of Partially Unannotated Sequences Of Rapana Venosa Salivary Gland Transcriptome

Posted on: 2019-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wang
Full Text: PDF
GTID: 2370330542996788
Subject: Biological engineering
Abstract/Summary:
In this study, salivary gland transcriptome data from four feeding periods of Rapana venosa were used as the analysis material. For the unannotated sequences in the transcriptome, the function and structure of the encoded proteins were analyzed and predicted with bioinformatics software, providing a reference for the subsequent identification of these proteins' functions.

The feeding process of Rapana venosa can be roughly divided into four phases: S1, the fasting phase; S2, the catching phase; S3, the feeding phase; and S4, the phase in which digestion and absorption have just begun. The S1 transcriptome was used as the control and compared with the transcriptomes of phases S2, S3, and S4. Differentially expressed sequences between each experimental phase and the control phase were screened out according to the differential expression criteria. The experimental sequences for this study are those differentially expressed sequences that have no reference sequences in Nr, Nt, Swiss-Prot, and other databases. Three sets totaling 2,925 analyzable sequences were obtained. Various bioinformatics tools were used to screen and analyze these sequences; 23 protein-coding sequences were identified and their protein functions were predicted.

The analysis proceeded as follows. The unannotated, differentially expressed sequences were analyzed with TransDecoder to obtain their protein-coding sequences (CDSs) and the encoded amino acid sequences. Among the 2,925 sequences, 406 valid CDSs were identified for the S1-S2 comparison, 257 for S1-S3, and 337 for S1-S4. Note that some CDSs are repeated between samples: some are differentially expressed in each sample, and some come from different contigs in the same cluster whose identified CDSs overlap. InterProScan 5 was used to determine domain and family information for the amino acid sequences. Protein sequences shorter than 150 amino acids were analyzed with InterProScan 5 only; for protein sequences longer than 150 amino acids, online prediction tools were additionally used for function prediction on the basis of the InterProScan 5 results. For S1-S2, Argot2, SIFTER, GoFDR, and FFPred were used. For S1-S3 and S1-S4, Argot2, SIFTER, FFPred, ESG, and PFP were used. By integrating the results of the individual prediction tools, functional predictions were made for 10, 8, and 5 sequences for the S1-S2, S1-S3, and S1-S4 samples, respectively. Protein secondary structure was predicted with PSIPRED, and protein three-dimensional (3D) structure was predicted with BioSerf.
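The length-based choice of tools described in the abstract can be illustrated with a small sketch. This is not the thesis's actual pipeline: the function name `tools_for` and the dictionary names are hypothetical, the code only returns tool names and does not call any of the tools, and treating a protein of exactly 150 amino acids as "short" is an assumption, since the abstract does not say how that boundary case was handled.

```python
# Hypothetical sketch of the tool-selection rule described in the abstract.
# It only names the tools; it does not run any of them.

LENGTH_THRESHOLD_AA = 150  # threshold stated in the abstract

# Online prediction tools per comparison, as listed in the abstract.
ONLINE_TOOLS = {
    "S1-S2": ["Argot2", "SIFTER", "GoFDR", "FFPred"],
    "S1-S3": ["Argot2", "SIFTER", "FFPred", "ESG", "PFP"],
    "S1-S4": ["Argot2", "SIFTER", "FFPred", "ESG", "PFP"],
}

# Counts reported in the abstract.
VALID_CDS = {"S1-S2": 406, "S1-S3": 257, "S1-S4": 337}
PREDICTED = {"S1-S2": 10, "S1-S3": 8, "S1-S4": 5}


def tools_for(comparison, protein):
    """Return the names of the tools applied to one protein sequence.

    Every protein goes through InterProScan 5; proteins longer than the
    threshold additionally go through the online tools for their comparison.
    Assumption: a protein of exactly 150 residues is treated as short.
    """
    tools = ["InterProScan 5"]
    if len(protein) > LENGTH_THRESHOLD_AA:
        tools += ONLINE_TOOLS[comparison]
    return tools


if __name__ == "__main__":
    short_protein = "M" * 120  # placeholder sequence, not real data
    long_protein = "M" * 200   # placeholder sequence, not real data
    print(tools_for("S1-S2", short_protein))
    print(tools_for("S1-S3", long_protein))
    # The per-comparison predictions add up to the 23 sequences reported.
    print(sum(PREDICTED.values()))
```

The per-comparison counts are kept in dictionaries so that the totals quoted in the abstract (23 predicted sequences across the three comparisons) can be checked directly against the individual figures.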
Keywords/Search Tags:Bioinformatics analysis, Transcriptome, Protein function prediction, Salivary gland, Rapana venosa