
Research And Design Of Web Crawler For Gene Expression Data Based On Python

Posted on: 2018-02-14    Degree: Master    Type: Thesis
Country: China    Candidate: Q Feng    Full Text: PDF
GTID: 2310330536474249    Subject: Public health
Abstract/Summary:
Objective: Taking the GEO database, an open integrated gene expression database created by NCBI, as an example, this work develops a crawler program to address the problems posed by the growing volume of high-throughput gene expression experimental data. A web crawler mines and processes information so that users are not overwhelmed by its sheer volume. This improves database utilization, reduces the waste of biomedical information resources, provides medical workers with comprehensive gene expression data, and promotes the development of clinical bioinformatics.

Methods: 1. Literature analysis: literature on web crawler systems, web crawling technology, and the GEO database was reviewed to study in depth the development of web crawler systems, web crawling strategies, and the current state of the GEO database. This provides a theoretical reference and practical experience for designing and developing a crawler system suited to crawling RNA-related data in the GEO database. 2. Programming language: the crawler was written in Python. 3. Database: a MySQL database was used to manage the data.

Results: 1. A crawler program was successfully developed and put into operation. 2. The crawler captured all gene expression data in the GEO database, 71032 in total, and stored it in the MySQL database.

Conclusions: The crawler automates the capture of gene expression data from the GEO database, avoids tedious manual downloading, and makes large-scale data download practical. By efficiently mining useful information and biological knowledge from a massive database, web crawlers help clinical researchers browse biomedical literature, download data in bulk, and conveniently look up research references and biological information. The captured results not only play a significant role in promoting basic medical research, but also play an important role in human disease prevention and gene location.
Keywords/Search Tags:GEO database, web crawler, Python
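
The crawl-and-store workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation; the abstract does not say how the crawler reached GEO or what its table layout was. The sketch assumes the NCBI E-utilities ESearch endpoint with db=gds (GEO DataSets) as the data source, and it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a database server. The function names, the geo_record table, and the search term are hypothetical.

```python
import json
import sqlite3
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: NCBI E-utilities ESearch is used to list GEO records.
# The abstract does not state how the thesis crawler accessed GEO.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term, retstart=0, retmax=100):
    """Build an ESearch URL for one page of GEO DataSets (db=gds) hits."""
    params = {
        "db": "gds",
        "term": term,
        "retstart": retstart,
        "retmax": retmax,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_search_result(payload):
    """Return (total_count, id_list) from an ESearch JSON response string."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def fetch_page(term, retstart=0, retmax=100):
    """Download and parse one result page (requires network access)."""
    url = build_search_url(term, retstart, retmax)
    with urlopen(url, timeout=30) as resp:
        return parse_search_result(resp.read().decode("utf-8"))


def open_store(path=":memory:"):
    """Open the record store; sqlite3 stands in for MySQL here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geo_record ("
        "uid TEXT PRIMARY KEY, term TEXT NOT NULL)"
    )
    return conn


def count_rows(conn):
    """Return the number of stored records."""
    return conn.execute("SELECT COUNT(*) FROM geo_record").fetchone()[0]


def save_ids(conn, term, ids):
    """Insert UIDs, skipping ones already stored; return how many were added."""
    before = count_rows(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO geo_record (uid, term) VALUES (?, ?)",
        [(uid, term) for uid in ids],
    )
    conn.commit()
    return count_rows(conn) - before


def crawl(term, conn, page_size=100, fetch=fetch_page, delay=0.4):
    """Page through all hits for `term`, store their UIDs, return the row count."""
    retstart, total = 0, None
    while total is None or retstart < total:
        total, ids = fetch(term, retstart, page_size)
        if not ids:  # stop if the server returns an empty page
            break
        save_ids(conn, term, ids)
        retstart += len(ids)
        time.sleep(delay)  # pause between requests to avoid hammering the server
    return count_rows(conn)
```

Calling crawl("RNA", open_store("geo.db")) would page through the matching record identifiers and store them; the term and file name are placeholders. The fetch parameter lets the paging logic be exercised with a fake page source, without network access. A crawler that also stores the expression data itself would need a further download step that this sketch leaves out.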