
The Application Of Estimating The Degree Distribution Of Complex Network In Predicting The Occurrence Of Influenza Virus

Posted on: 2015-03-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y J Sheng
GTID: 1220330467965590
Subject: Bioinformatics
Abstract/Summary:
The theory and application of complex networks have developed over more than half a century. Although many important models describe real-world networks accurately, the basic theory still has many gaps, especially in estimating the degree distribution of a complex network. The influenza virus has posed a serious threat to humans for many years, and the number of potential subtypes may by now be 198, which hinders the research and development of drugs and vaccines and the design of prevention and control measures. To this end, this thesis first develops a new method, called NEPEDRE, that estimates the degree distribution based on relative entropy. Then, building on our classification method for the influenza virus, NEPEDRE helps us find that the time interval queue of influenza virus infection clearly exhibits dynamic power-law behavior. Finally, we design a growing model that predicts the next occurrence time of the influenza virus very well. In particular:

Chapter 1 reviews the development history and research status of complex networks, summarizes the difficulties currently encountered and possible countermeasures, and briefly outlines the main contents of this thesis.

The main work of this thesis builds on basic knowledge of graph theory and complex networks, so Chapter 2 introduces graph theory and the statistical properties of networks. In particular, we give the statistical properties and generation processes of some important models, such as regular networks, ER random graphs, small-world networks, scale-free networks and the fitness network model, which were an important inspiration for our growing model.

Chapter 3 is mainly about estimating the degree distribution of complex networks. After analyzing the shortcomings of related research, we put forward a new method, NEPEDRE, to judge the degree distribution of real-world networks based on relative entropy.
For simulated data sets, NEPEDRE accurately recovers the parameters of the power-law distribution, and it performs very well in determining whether a real-world network has power-law behavior. The method is easy to operate and simple to understand. At the same time, it gives each network considered its own bounds: an upper bound that reduces time complexity for some large-scale networks and a lower bound that excludes networks with insufficient data. In addition, in comparison with Newman 2009, we unexpectedly find three basic theories suitable for the regulation of monopoly markets.

Currently, the number of influenza virus subtypes may be 198, which hinders the research and development of drugs and vaccines and the design of prevention and control measures. Chapter 4 proposes a new classification strategy from the point of view of the strain family, using our sequence alignment software MCABMSA. By aligning only the HA fragment sequences of the influenza virus, it identifies comparatively few strain families, 28 in all, which will be of great benefit to future research.

Based on the strain-family view, Chapter 5 uses NEPEDRE to find that the time interval queue of influenza virus infection strictly follows dynamic power-law behavior, and that there is a strong linear relationship between the scale parameter of the power-law distribution and the logarithm of the corresponding infection number. Assuming that the same linear relationship holds for each family, we design a growing model for the time interval queue of influenza virus infection. This model simulates the occurrence time of the influenza virus very well, with an average prediction error of 3 days and about 80% of errors below 3 days. We make 10 consecutive predictions for the strain family to which H7N9 belongs. Compared with updated data in GISAID, the prediction errors are 2, 6 and 10 days respectively, and the remaining results need time to be verified because updated data are not yet available. Of course, we can also forecast only one occurrence at a time.
In that case, our prediction error is just 2 days, which is very accurate. In fact, our forecasts tend to fall later than the actual dates, so it is advisable to begin prevention and control measures about 3 days ahead of the date predicted by our method.
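
The abstract does not describe how NEPEDRE works internally. As a generic illustration of what estimating a power-law exponent by relative entropy can look like, the Python sketch below picks the exponent that minimizes the Kullback-Leibler divergence between an empirical degree distribution and a truncated discrete power law. The function names, the grid search, and the truncation at the largest observed degree are assumptions made for this illustration and are not taken from the thesis.

```python
import math
import random
from collections import Counter


def power_law_pmf(alpha, k_min, k_max):
    """Truncated discrete power law: p(k) proportional to k**(-alpha) on k_min..k_max."""
    weights = [k ** (-alpha) for k in range(k_min, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def relative_entropy(p, q):
    """Relative entropy D(p || q) = sum_k p_k * log(p_k / q_k); terms with p_k = 0 add 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def fit_exponent(degrees, k_min=1, alphas=None):
    """Grid-search the exponent whose power law is closest to the data in relative entropy.

    Assumption for this sketch: the model is truncated at the largest observed degree.
    """
    kept = [d for d in degrees if d >= k_min]
    k_max = max(kept)
    counts = Counter(kept)
    empirical = [counts.get(k, 0) / len(kept) for k in range(k_min, k_max + 1)]
    if alphas is None:
        alphas = [1.0 + 0.01 * i for i in range(1, 301)]  # candidate exponents 1.01 .. 4.00
    scored = [(relative_entropy(empirical, power_law_pmf(a, k_min, k_max)), a) for a in alphas]
    best_divergence, best_alpha = min(scored)
    return best_alpha, best_divergence


if __name__ == "__main__":
    # Synthetic degrees drawn from a known power law (exponent 2.5), purely for illustration.
    random.seed(0)
    support = list(range(1, 201))
    sample = random.choices(support, weights=power_law_pmf(2.5, 1, 200), k=5000)
    alpha_hat, divergence = fit_exponent(sample)
    print(f"estimated exponent: {alpha_hat:.2f}, relative entropy: {divergence:.4f}")
```

A small divergence at the best exponent suggests that a power law is a reasonable description of the data. How NEPEDRE turns such a value into a decision, and how it sets its upper and lower bounds, is not stated in this abstract.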
Keywords/Search Tags: Power-law, Relative Entropy, Degree Distribution, Complex Network, Influenza Virus, Time Interval Queue, Growing Model