Research On The Key Technologies Of Massive Data Storage For Solar Telescope

Posted on: 2015-05-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Liu
GTID: 1220330422986086
Subject: Astronomical technology and methods
Abstract/Summary:
Astronomical data processing has entered the era of data-intensive astronomy informatics. Solar observation is a typical big data problem: data volumes are large, capture rates are high, and the data grow continuously. Traditional local host storage such as DAS, and network storage technologies such as NAS and SAN, show many limitations when storing, processing and managing astronomical big data, which slows down scientific research. Modern astronomical observation needs advanced big data technologies to accelerate data processing. The storage system behind these technologies has to provide high-performance, scalable parallel reading and writing and efficient data indexing and querying, and it has to adapt to the fast growth of observation data.

The New Vacuum Solar Telescope (NVST) has begun routine observation and has produced over 200 TB of solar observation data in a high-speed, multi-channel, multi-wavelength mode. When the photospheric and chromospheric channels are observed at the same time under proper observing conditions, the chromospheric channel can reach a rate of 60 GB per hour and the photospheric channel a rate of 190 GB per hour. About 2 TB of data can be produced in 8 hours of continuous observation. With the high temporal and spatial resolution required of NVST data, and with multiple channels working in parallel in the future, the one-way write speed can reach the level of TB per second. If real-time data processing is taken into account, the rate will be doubled.
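The figure of about 2 TB follows directly from the two channel rates quoted above. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 1000 GB) and that both channels record for the full 8 hours; the function and variable names are illustrative and not taken from the dissertation:

```python
# Per-channel rates quoted in the abstract (GB per hour).
CHROMOSPHERE_GB_PER_HOUR = 60
PHOTOSPHERE_GB_PER_HOUR = 190
OBSERVING_HOURS = 8  # one continuous observing session


def session_volume_tb(rates_gb_per_hour, hours, gb_per_tb=1000):
    """Total data volume in TB for channels recording simultaneously."""
    return sum(rates_gb_per_hour) * hours / gb_per_tb


def sustained_rate_mb_per_s(rates_gb_per_hour, mb_per_gb=1000):
    """Average aggregate write rate in MB/s implied by the hourly rates."""
    return sum(rates_gb_per_hour) * mb_per_gb / 3600


rates = [CHROMOSPHERE_GB_PER_HOUR, PHOTOSPHERE_GB_PER_HOUR]
total_tb = session_volume_tb(rates, OBSERVING_HOURS)
avg_mb_s = sustained_rate_mb_per_s(rates)
print(f"{total_tb:.1f} TB per session, about {avg_mb_s:.0f} MB/s sustained")
```

Under these assumptions the two channels together write 250 GB per hour, which gives 2.0 TB over 8 hours and an average of roughly 69 MB/s.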
Although some storage technologies achieve good performance and are extensible, the continuous-storing characteristic of the data ultimately limits the use of these mainstream technologies.

Traditional local file systems such as Ext3, Ext4 and ZFS can hardly satisfy the requirements of NVST, so we need a storage technology that can manage massive data, has high performance, is highly extensible, can adapt to the future data storage needs of NVST and can support massive high-speed data processing. As devices such as larger telescopes come into use, the storage system needs suitable technologies to support massive high-speed data storing, reading and processing. Distributed parallel storage satisfies these needs well, because a distributed architecture supplies high performance and parallel storing and can scale out, which suits multi-channel, multi-waveband, high-speed and continuously growing massive data such as those of NVST. This dissertation mainly studies the key techniques of distributed storage. A NoSQL-based bitmap index is also studied to meet the needs of massive data indexing and retrieval.

This dissertation covers the following aspects:
1) Applying distributed storage to solar observation. We use experiments to verify the feasibility, high performance and extensibility of distributed storage. We achieve a data acquisition rate of 3.4 GB/s by using bonding technology in a 1 GB network environment.
2) High-speed data storing may lead to inconsistency between metadata and data that are stored separately. How to keep metadata and data consistent through an effective mechanism is a neglected issue in data storage.
This dissertation analyzes the causes, states and models of the inconsistency, and adopts the 2PC algorithm to ensure consistency.
3) We design a distributed storage system called AstroFS, based on the RAID0 mechanism applied in a network environment, in order to achieve high performance. Key technologies such as data aggregation, splitting algorithms and data balance strategies have been implemented.
4) This dissertation uses a compressed word-aligned bitmap index to index massive solar data. We also design and implement an astronomical data archiving system (DAS) based on FastBit. Compared with techniques based on relational databases, DAS has many advantages, such as more efficient retrieval and faster index building.

The distributed storage and massive data retrieval technologies studied in this dissertation satisfy the requirements of NVST data storing and management. The research methods also provide a reference for the design of massive data storage and data retrieval applications for large solar telescopes at home and abroad.
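The RAID0 idea behind item 3 can be illustrated with a generic round-robin striping sketch. The abstract does not describe the actual splitting and aggregation algorithms of AstroFS, so everything below is an assumption made for illustration: the function names, the fixed stripe size, and the use of in-memory lists in place of real storage nodes.

```python
from typing import List


def split_stripes(data: bytes, n_nodes: int, stripe_size: int) -> List[List[bytes]]:
    """Distribute fixed-size stripes over nodes in round-robin order (RAID0 style)."""
    if n_nodes < 1 or stripe_size < 1:
        raise ValueError("n_nodes and stripe_size must be positive")
    # Each inner list stands in for the stripes held by one storage node.
    nodes: List[List[bytes]] = [[] for _ in range(n_nodes)]
    for i, offset in enumerate(range(0, len(data), stripe_size)):
        nodes[i % n_nodes].append(data[offset:offset + stripe_size])
    return nodes


def aggregate_stripes(nodes: List[List[bytes]]) -> bytes:
    """Rebuild the original byte stream by reading stripes back in round-robin order."""
    out = bytearray()
    depth = max((len(stripes) for stripes in nodes), default=0)
    for row in range(depth):
        for stripes in nodes:
            if row < len(stripes):
                out += stripes[row]
    return bytes(out)


frame = bytes(range(256)) * 10  # hypothetical stand-in for one observation frame
placed = split_stripes(frame, n_nodes=3, stripe_size=512)
assert aggregate_stripes(placed) == frame
print([sum(len(s) for s in node) for node in placed])  # bytes held per node
```

Because consecutive stripes go to different nodes, writes and reads can proceed in parallel across nodes. Plain striping of this kind adds no redundancy, so the loss of one node loses part of every file that was striped across it.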
Keywords/Search Tags: Massive solar data, high-speed distributed storage, data consistency, mass data retrieval