
Research On Processing Method For Seismic Waveform Data Based On Hadoop Platform

Posted on: 2016-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: F M Liu
Full Text: PDF
GTID: 2180330461471614
Subject: Software engineering
Abstract/Summary:
As China has paid increasing attention to seismology in recent years, seismic data analysis has become a hot topic in the field. Advances in science and technology have steadily improved the precision of seismic data acquisition instruments, and the number of seismic monitoring stations nationwide keeps growing, so the volume of collected data has increased vastly. The seismic waveform data collected by the National Center of Digital Seismic Network amounts to about 40 GB every day. Such massive data pose a great challenge for storage and analysis.

To facilitate transmission, seismic waveform data are usually stored in the SEED format, so SEED files must be decompressed to recover the original sample series before analysis. The original decompression method is serial and can only decompress one file at a time; it cannot handle files in batches. Moreover, interference signals picked up during data acquisition affect the accuracy of earthquake analysis, so filtering the original sample series is essential to ensure the quality of seismic data analysis. Serial filtering, however, is time-consuming on massive data.

Hadoop is an open-source framework for distributed computing and one of the most widely used cloud-computing technologies. Its core modules, the Hadoop Distributed File System (HDFS) and the MapReduce programming model, can store massive data and process them in parallel in far less time. The Hadoop platform can therefore be used to solve the problems above.
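The map/shuffle/reduce flow that makes this parallel speedup possible can be illustrated with a minimal, Hadoop-free Python sketch. This is a conceptual model only, not the thesis's implementation (a real job would use the Hadoop Java API); the toy records and the per-channel counting job are invented for illustration:

```python
from collections import defaultdict
from itertools import chain

def map_phase(records, mapper):
    # Apply the mapper to every input record independently;
    # in Hadoop these calls run in parallel across the cluster.
    return chain.from_iterable(mapper(rec) for rec in records)

def shuffle(pairs):
    # Group intermediate (key, value) pairs by key,
    # as Hadoop's shuffle-and-sort stage does.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    # One reducer call per key, again parallelizable.
    return {key: reducer(key, values) for key, values in groups.items()}

# Toy job: count samples per channel from (channel, sample) records.
records = [("BHZ", 10), ("BHN", -3), ("BHZ", 7), ("BHE", 2), ("BHZ", 1)]
mapper = lambda rec: [(rec[0], 1)]          # emit (channel, 1) per sample
reducer = lambda key, values: sum(values)   # sum the counts per channel

counts = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
print(counts)  # {'BHZ': 3, 'BHN': 1, 'BHE': 1}
```

The same three-stage structure carries over to the decompression and filtering jobs described below, with SEED records as input and sample series as values.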
The main contents of this paper are as follows:
(1) Based on the format features of seismic waveform data files, a feasibility analysis of processing those files in parallel on the Hadoop platform is carried out.
(2) Using the MapReduce parallel programming model, this paper designs a custom input format matched to seismic waveform data files, and proposes and implements a parallel decompression algorithm that processes the files in batches. To guarantee that the decompressed data are stitched together correctly, the order of the data records within each channel is established by the secondary-sort method.
(3) After analyzing the principle of the filtering operation, this paper puts forward a parallel filtering algorithm for the decompressed data that exploits the features of the MapReduce programming model, performing the filtering of multiple channels in parallel.
(4) A small Hadoop cluster is set up to test the two algorithms above, and the experimental results and algorithm performance are analyzed.
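The secondary-sort idea in item (2) amounts to ordering by a composite key: group records by channel, then sort within each channel by record sequence number before stitching the sample series together. A minimal Python sketch of that ordering follows; the record layout and field names here are hypothetical, not the actual SEED record structure or the thesis's code:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical decompressed records: (channel, sequence_number, samples).
# Records arrive out of order, as they would from parallel map tasks.
records = [
    ("BHZ", 2, [4, 5]),
    ("BHN", 1, [9]),
    ("BHZ", 1, [1, 2, 3]),
    ("BHN", 2, [8, 7]),
]

def stitch_channels(records):
    # Secondary sort: primary key = channel, secondary key = sequence number.
    # This mirrors a MapReduce composite key whose grouping comparator
    # partitions on the channel alone.
    ordered = sorted(records, key=itemgetter(0, 1))
    stitched = {}
    for channel, group in groupby(ordered, key=itemgetter(0)):
        samples = []
        for _, _, chunk in group:
            samples.extend(chunk)  # chunks are now in sequence order
        stitched[channel] = samples
    return stitched

print(stitch_channels(records))
# {'BHN': [9, 8, 7], 'BHZ': [1, 2, 3, 4, 5]}
```

In Hadoop itself, the framework's shuffle performs the sort; the job only supplies the composite key, a partitioner on the channel, and a grouping comparator, so the reducer receives each channel's records already in sequence order.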
Keywords/Search Tags: Hadoop, Parallel decompression, Parallel filtering, Seismic waveform data