Application Research On Astronomy Spectrum Classification Algorithms Under Distributed Environment

Posted on: 2009-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Xu
GTID: 2178360245496447
Subject: Computer application technology
Abstract/Summary:
The spectra of celestial bodies carry important physical information. By studying spectra, researchers can measure the chemical composition of celestial bodies qualitatively or quantitatively, determine their surface temperature, luminosity, diameter, and mass directly or indirectly, and study their radial motion and rotation. Spectral analysis therefore plays an important role in astrophysics. Once the LAMOST project is completed, a large number of spectra will be collected on each observation night. How to process this volume of spectra and extract useful scientific information is an important research topic.

Data mining technology has been widely applied in many fields. Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random data. It supports tasks such as correlation-based prediction, classification, clustering, outlier detection, and time-series analysis. Mining algorithms for high-dimensional data are currently an active research area, and the spectra of celestial bodies are themselves high-dimensional. Data mining can therefore provide good support for spectral classification and parameter measurement.

Astronomical spectral data is massive and stored in a distributed manner, so it needs to be mined in a parallel and distributed way. We divide the spectral data into segments and mine each segment in parallel on the nodes where the data is stored. During mining, we use a user agent to reduce redundant computation and a data mining agent to reduce traffic between computing nodes, which improves processing efficiency.

The work in this thesis targets the classification of celestial bodies. The main points are summarized as follows:
(1) Build a distributed operating environment: an MPI-based parallel computing environment in which the distributed classification mining algorithms are studied.
(2) Propose a distributed parallel mining system that takes load balancing into account. Classification tasks are allocated according to the network load and the load of the computing nodes, in order to maximize the efficiency of parallel mining.
(3) Following the general data mining process, apply PCA dimensionality reduction for feature extraction on two kinds of spectral data, late-type stars and quasars, to meet the needs of classification.
(4) Study the SPRINT algorithm and improve it with a parallel version, then use it to classify the dimension-reduced spectral data in the distributed environment.
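The abstract does not describe how SPRINT is applied to the reduced spectra. As a rough illustration only, the sketch below shows the split evaluation that SPRINT is generally known for: scanning a sorted attribute list for one continuous attribute and scoring each candidate threshold with the Gini index, using running class histograms. The function names, the sample values, and the use of a single PCA component as the attribute are assumptions made for this example; they are not taken from the thesis, and the parallel and load-balancing parts are not shown.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity 1 - sum(p_c^2), computed from a class histogram."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(values, labels):
    """Scan one continuous attribute and return (threshold, weighted_gini).

    Records with value <= threshold go to the left child.
    Returns (None, inf) when no split is possible (all values equal).
    """
    # Attribute list: (value, class label) pairs sorted by value.
    attr_list = sorted(zip(values, labels), key=lambda pair: pair[0])
    n = len(attr_list)

    below = Counter()                               # classes left of the candidate split
    above = Counter(label for _, label in attr_list)  # classes right of it

    best_threshold, best_gini = None, float("inf")
    for i in range(n - 1):
        value, label = attr_list[i]
        # Move record i from the right histogram to the left one.
        below[label] += 1
        above[label] -= 1

        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # no threshold can separate equal values

        n_below = i + 1
        n_above = n - n_below
        weighted = (n_below * gini(below, n_below)
                    + n_above * gini(above, n_above)) / n
        if weighted < best_gini:
            # Candidate threshold: midpoint between consecutive distinct values.
            best_threshold, best_gini = (value + next_value) / 2.0, weighted
    return best_threshold, best_gini


if __name__ == "__main__":
    # Hypothetical first-principal-component scores for six spectra.
    pc1 = [-2.1, -1.7, -1.5, 0.9, 1.4, 2.2]
    classes = ["late-type star"] * 3 + ["quasar"] * 3
    print(best_split(pc1, classes))
```

In a full decision tree builder this scan would run once per attribute at each node, and the attribute with the lowest weighted Gini value would define the split. A parallel variant could assign attribute lists or data segments to different nodes, but how the thesis does this is not stated in the abstract.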
Keywords/Search Tags: Distributed Data Mining, Parallel Decision Tree Classification, Load Balance, Spectrum