Efficient Time Series Data Generation Method For Large-Scale Astronomical Catalogs

Posted on: 2019-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: K Li
Full Text: PDF
GTID: 2370330626452396
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, research on time-domain astronomy has attracted wide attention, and developing efficient software systems for extracting time-domain information has become a research hotspot in astronomical informatics. The observation data of each celestial object are obtained by identifying the same object across homologous calibrated catalogues and sorting its observations by exposure time. This yields a time series for each celestial object, which lays the foundation for subsequent light-curve fitting and time-domain analysis. However, traditional database management systems and the Hadoop ecosystem incur a large space overhead when preloading data before queries, which is especially unacceptable for massive data volumes. It is therefore important to design a time series data generation system for homologous astronomical catalogues. To provide astronomers with time series data for all celestial objects in the catalogues, this thesis proposes an efficient method for generating time series data, which also improves the utilization of astronomical data and the output of results in time-domain astronomy. We optimize the index and layout strategy of the original catalogues so that subsets can be located and accessed in memory efficiently, minimizing the time and space cost of data transfer. For the cross-match operation, we adopt a phased design that fuses position and magnitude information so that each stage strikes its own balance between accuracy and efficiency. This reduces the dependence on the distance calculations used in traditional position-based cross-matching and thus greatly decreases the amount of computation. The thesis uses a hierarchical architecture to implement ETSGS (Efficient Time Series Generation System), which comprises user-level data retrieval, operation-level task allocation and identification computation, and data-level ETL (Extract-Transform-Load) preprocessing of catalogue data. To evaluate the performance of the time series generation method on massive catalogue data, the system was tested with real astronomical observation data. The results show that ETSGS is nearly 11 times faster than MySQL, with the advantage most pronounced when the data volume is large.
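
As an illustration only, a staged cross-match that combines position and magnitude before building time series might be sketched as follows. The abstract does not give the actual algorithm, so everything here is an assumption: the function names `angular_sep_deg` and `build_time_series`, the grid cell size, the tolerances, the use of the first observation as each object's reference, and the order of the stages are all hypothetical. The sketch also ignores right-ascension wrap-around and the shrinking of RA cells toward the poles.

```python
import math
from collections import defaultdict


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def build_time_series(observations, cell_deg=0.01, max_sep_deg=0.001, max_dmag=0.5):
    """Group (time, ra_deg, dec_deg, mag) records into per-object time series.

    Hypothetical three-stage match (not the thesis's actual design):
      1. coarse grid lookup by position, with no distance computation;
      2. cheap magnitude filter on the surviving candidates;
      3. exact angular distance only for candidates that pass stage 2.
    """
    objects = []               # one dict per distinct object found so far
    grid = defaultdict(list)   # (cell_x, cell_y) -> indices into objects
    for obs in observations:
        _, ra, dec, mag = obs
        cx, cy = math.floor(ra / cell_deg), math.floor(dec / cell_deg)
        best, best_sep = None, max_sep_deg
        # Stage 1: only objects in the 3x3 block of neighbouring cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for idx in grid.get((cx + dx, cy + dy), ()):
                    ref = objects[idx]
                    # Stage 2: discard candidates with a very different magnitude.
                    if abs(mag - ref["mag"]) > max_dmag:
                        continue
                    # Stage 3: exact separation; keep the nearest candidate.
                    sep = angular_sep_deg(ra, dec, ref["ra"], ref["dec"])
                    if sep <= best_sep:
                        best, best_sep = idx, sep
        if best is None:
            # No match: this observation starts a new object.
            objects.append({"ra": ra, "dec": dec, "mag": mag, "obs": [obs]})
            grid[(cx, cy)].append(len(objects) - 1)
        else:
            objects[best]["obs"].append(obs)
    # Sort each object's observations by time to obtain its time series.
    return [sorted(o["obs"], key=lambda r: r[0]) for o in objects]
```

In this sketch the magnitude filter runs before the distance calculation, so the trigonometric work is spent only on candidates that are already plausible matches, which is one way to read the reduced dependence on distance calculation described above.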
Keywords/Search Tags: Time domain astronomy, Cross-match, Astronomical catalogue data, Parallel computing