
A Tumor Somatic Mutation Detection Method Based On Next-generation Sequencing Data

Posted on: 2022-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Mao
GTID: 2504306605473044
Subject: Master of Engineering
Abstract/Summary:
Detecting variants related to disease is one of the main goals of genome sequence analysis. With the maturity and widespread use of next-generation sequencing technology, researchers can obtain large amounts of genome sequence data in a short time for mutation detection. In recent years, as the concepts of precision medicine and targeted cancer therapy have been proposed, more and more researchers have devoted themselves to the study of oncogenes. This paper proposes a new somatic mutation detection method, svm Somatic, which takes into account the impact of copy number variation on somatic mutation detection and uses a support vector machine as the classifier to detect somatic mutations from a single sample. The main work and innovations of this paper are as follows:

1. This paper considers the influence of copy number variation on somatic mutation detection. Traditional somatic mutation detection methods often consider a single type of mutation. In that setting, allele frequency is a very important indicator for detecting a somatic mutation, and under ideal circumstances it can distinguish germline variants from somatic variants. In actual tumor cells, however, the situation is more complex: tumor heterogeneity, copy number variation, and small insertions and deletions all cause the allele frequency to deviate from the ideal case. When svm Somatic distinguishes somatic from germline variants, it takes copy number variants into consideration. This reduces the impact of other variants on somatic mutation detection and makes the method more applicable to the actual situation in tumor cells.

2. This paper uses a single-sample approach to somatic mutation detection. In clinical practice, paired samples are very difficult to obtain. One reason is that the need for subsequent use may not have been considered when the sample was collected, so a paired sample was not collected at the outset. Acquiring paired samples also requires the patient's consent, and some patients refuse to provide them for privacy reasons. Another reason is cost. Although next-generation sequencing technology has greatly reduced the cost of sequencing, sequencing both the tumor genome and the normal genome is still a large expense. Data storage and computing requirements also increase, which indirectly raises the cost of working with paired samples. svm Somatic's single-sample somatic mutation detection therefore not only addresses the difficulty of obtaining matched samples but also reduces sequencing, data storage, and computation costs.

3. This paper uses an improved support vector machine to classify somatic and germline mutations. First, five features related to somatic mutation are extracted, and labels are added for training based on simulated data. The training process uses 10-fold cross-validation to find the optimal classification model. To speed up support vector machine training on massive data, edge detection and K-means clustering are used to reconstruct the training set, reducing the amount of training data.

Finally, experiments were performed on simulated data and real data. To evaluate the performance of svm Somatic, four classic methods were selected for comparison. The results show that svm Somatic performs better at somatic mutation detection.
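The way copy number variation shifts allele frequency, as described in point 1, can be illustrated with a short calculation. This is a minimal sketch using the standard purity and copy-number mixture model, not the feature computation used by svm Somatic, which the abstract does not specify. The function name `expected_vaf` and its parameters (`purity`, `tumor_cn`, `mutant_copies`) are hypothetical, and the normal cells are assumed to be diploid.

```python
def expected_vaf(purity, tumor_cn, mutant_copies, germline=False):
    """Expected variant allele frequency in a bulk tumor sample.

    Assumptions (illustrative, not taken from the thesis):
      - the sample mixes tumor cells (fraction `purity`) with diploid normal cells;
      - tumor cells carry `tumor_cn` copies of the locus, `mutant_copies` of them
        with the variant allele;
      - a germline variant is heterozygous in normal cells (1 of 2 copies),
        a somatic variant is absent from normal cells.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if tumor_cn < 0 or not 0 <= mutant_copies <= tumor_cn:
        raise ValueError("need 0 <= mutant_copies <= tumor_cn")

    normal_cn = 2
    normal_mutant = 1 if germline else 0

    # Average number of copies (all alleles, then variant alleles) per cell.
    total_copies = purity * tumor_cn + (1.0 - purity) * normal_cn
    variant_copies = purity * mutant_copies + (1.0 - purity) * normal_mutant
    if total_copies == 0:
        raise ValueError("locus is absent from every cell in the sample")
    return variant_copies / total_copies


# Copy-neutral locus at 60% purity: the two classes are well separated.
somatic_neutral = expected_vaf(0.6, tumor_cn=2, mutant_copies=1)
germline_neutral = expected_vaf(0.6, tumor_cn=2, mutant_copies=1, germline=True)

# Germline variant whose allele was deleted in the tumor (one copy left, none mutant):
# its frequency drops close to the somatic value above.
germline_deleted = expected_vaf(0.6, tumor_cn=1, mutant_copies=0, germline=True)

print(somatic_neutral, germline_neutral, germline_deleted)
```

In the copy-neutral case the somatic and germline frequencies differ clearly (0.3 versus 0.5 under these assumptions), but a deletion pulls the germline frequency to roughly 0.29, which is why a classifier that ignores copy number could confuse the two.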
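The training-set reduction mentioned in point 3 can be sketched as follows. The abstract does not say how edge detection and K-means are combined, so this example shows only one plausible K-means step: each class is clustered separately and its samples are replaced by the cluster centroids, which would then be passed to the support vector machine. The names `kmeans` and `reduce_training_set`, the two toy features, and the choice of five clusters per class are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                dim = len(old)
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def reduce_training_set(samples, labels, k_per_class):
    """Replace each class by at most k_per_class centroids, keeping the class label."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(tuple(x))
    reduced = []
    for label, pts in by_class.items():
        k = min(k_per_class, len(pts))
        reduced.extend((centroid, label) for centroid in kmeans(pts, k))
    return reduced


# Toy data: two hypothetical features per variant (the thesis uses five).
rng = random.Random(42)
samples = [(rng.gauss(0.3, 0.05), rng.gauss(30, 5)) for _ in range(200)]
samples += [(rng.gauss(0.5, 0.05), rng.gauss(30, 5)) for _ in range(200)]
labels = ["somatic"] * 200 + ["germline"] * 200

reduced = reduce_training_set(samples, labels, k_per_class=5)
print(len(samples), "->", len(reduced))
```

Training on centroids trades some boundary detail for speed, which is presumably why the thesis pairs clustering with an edge detection step that this sketch omits.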
Keywords/Search Tags:Somatic mutation, Next-generation sequencing, Copy number variations, Allele frequency, Support vector machine