The Research On Gene-Disease Associations Based On Text Mining

Posted on: 2017-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Fu
GTID: 2310330536953103
Subject: Computer Science and Technology
Abstract/Summary:
Gene-disease associations and their prediction are a currently active research topic. Gene sequences are among the most important objects of study in bioinformatics. Because genes are the basic units of heredity, genes and diseases are associated, and research on these associations is of great value for the prevention and treatment of disease. Bioinformatics methods can reveal possible but previously unnoticed gene-disease associations, providing guidance and targets for future research. Research on gene-disease associations and their prediction relies on existing knowledge of genes and diseases: unknown associations can be discovered by analyzing and exploiting data about both. Some current studies use graph theory or machine learning to discover possible gene-disease associations. However, most biological research data describe diseases in unstructured free text, which these common methods cannot process directly. To address this problem, this research mined the text of documents in the PubMed database, extracted information related to genes and diseases, organized that information, and attempted to infer possible associations among the genes and diseases of interest. The method has been optimized for PubMed texts, especially texts about diseases. In addition, the MeSH and OMIM databases are consulted to make the information extraction more formalized and to demonstrate the effectiveness and value of the results. This work presents an effective method for revealing the hidden value of PubMed texts in predicting gene-disease associations. The method can be applied to common genes and diseases without much complexity. The detailed processing and optimization steps for texts about genes and diseases could also serve as a text-processing method in related research. The results can be used to reveal hidden gene-disease associations and to predict the probabilities of such associations. Furthermore, this research uses text as its data source. Because such sources are rarely used in popular graph-theoretic methods, the results could be integrated into other models to improve overall performance, leading to deeper research on gene-disease associations.
Keywords/Search Tags: Bioinformatics, Information extraction, Text mining, Gene-disease associations
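
The abstract describes mining PubMed texts for gene and disease mentions and then estimating how likely a gene and a disease are to be associated. The thesis's actual extraction and scoring steps are not given here, so the following is only a minimal sketch of one simple stand-in technique: dictionary-based mention matching followed by a co-occurrence frequency score. The lexicons `GENES` and `DISEASES`, the function names, and the scoring formula are all hypothetical and chosen for illustration; a real pipeline would presumably draw its vocabularies from resources such as MeSH and OMIM.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy lexicons (assumption: not taken from the thesis).
GENES = {"BRCA1", "TP53", "CFTR"}
DISEASES = {"breast cancer", "cystic fibrosis"}


def find_mentions(text, genes=GENES, diseases=DISEASES):
    """Return the gene symbols and disease names mentioned in one text."""
    # Gene symbols are matched as whole, case-sensitive tokens.
    tokens = set(re.findall(r"[A-Za-z0-9]+", text))
    found_genes = {g for g in genes if g in tokens}
    # Disease names are matched as case-insensitive substrings.
    lowered = text.lower()
    found_diseases = {d for d in diseases if d in lowered}
    return found_genes, found_diseases


def cooccurrence_scores(abstracts):
    """Score each (gene, disease) pair by co-occurrence frequency.

    The score is the fraction of texts mentioning the gene that also
    mention the disease. This is an illustrative heuristic, not the
    probability model used in the thesis.
    """
    gene_counts = Counter()
    pair_counts = Counter()
    for text in abstracts:
        found_genes, found_diseases = find_mentions(text)
        gene_counts.update(found_genes)
        pair_counts.update(product(found_genes, found_diseases))
    return {
        (gene, disease): count / gene_counts[gene]
        for (gene, disease), count in pair_counts.items()
    }
```

Calling `cooccurrence_scores` on a list of abstract strings returns a dictionary keyed by `(gene, disease)` pairs. Pairs that never co-occur are simply absent, which keeps the output sparse for large vocabularies.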