
Research And Design Of The Distributed File System Focused On Seismic Big Data

Posted on: 2015-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: M H Lu
Full Text: PDF
GTID: 2250330431450064
Subject: Network Communication System and Control
Abstract/Summary:
With the rapid development of science and technology and the growing popularity of the Internet, the concept of the cloud has become widely understood. With the advent of the cloud era, big data has also attracted attention, and its applications have penetrated all aspects of society owing to the considerable progress of information technology. In seismic exploration, the amount of data produced has increased greatly in order to satisfy social needs. Although the growing volume of seismic data reflects strong social demand for oil, natural gas and other resources, handling such massive data poses a very serious challenge for seismic exploration.
Seismic big data raises problems in many areas, including storage, reading, redundancy and extraction. This paper focuses mainly on storage and reading.
In practice, reading seismic data should take the users' specific circumstances into account in order to satisfy their needs, which are generally reflected in speed and efficiency, while also accommodating the features of seismic data. To address these problems, this paper designs an architecture that adopts the strategies of distribution and hierarchy. Distribution means distributed storage of seismic data: the whole data set is spread across several nodes that store it separately, and these nodes are managed by one master node. Hierarchy means that data are queried hierarchically: to obtain data, users query level by level, from the master node down to the storage nodes.
As for SEG-Y, the actual storage format of seismic data, this paper makes some improvements to the format and then compares the new format with the old one. The results show that the improved format works, to some extent, better than the original.
In addition, the paper adds a two-level index structure to the architecture proposed above. With this index, users can quickly find the specific location of the data by querying the index and then carry out the read operation, which ensures the speed and efficiency that users require. These are implementation details based on the strategies of distribution and hierarchy, and they are also the innovations of this paper.
Building on the research discussed above, this paper uses two distributed file systems to carry out its work: FastDFS and Hadoop DFS. Based on the architecture with distribution and hierarchy strategies, and combining the characteristics of seismic data with actual needs, this paper incorporates all these elements into the two distributed file systems to make them more capable of dealing with problems in the field of seismic exploration. Furthermore, this paper carries out several experiments to test file operations on the new file systems and compares the new file systems with the original ones. The results show that the new distributed file systems are more suitable for dealing with seismic big data, with better reading speed and an efficient overall process. Because the new distributed file systems offer these advantages when operating on seismic big data, and are also easy to operate and user-friendly, they have broad application prospects in seismic exploration.
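The hierarchical lookup through a two-level index described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the class and function names (`MasterNode`, `StorageNode`, `lookup`) are hypothetical, traces are assumed to be keyed by a global trace number, and byte offsets assume the standard SEG-Y layout (a 3600-byte file header, then fixed-length traces with 240-byte trace headers and 4-byte samples, no extended textual headers) rather than the improved format proposed in the thesis.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

# Standard SEG-Y sizes: 3200-byte textual header + 400-byte binary header.
SEGY_FILE_HEADER_BYTES = 3600
SEGY_TRACE_HEADER_BYTES = 240


@dataclass
class StorageNode:
    """Hypothetical storage node holding the second-level index."""
    name: str
    # Second-level index: trace number -> (file name, byte offset in file).
    local_index: dict = field(default_factory=dict)

    def add_file(self, file_name, first_trace, n_traces,
                 samples_per_trace, bytes_per_sample=4):
        # Assumes every trace in the file has the same length.
        trace_bytes = (SEGY_TRACE_HEADER_BYTES
                       + samples_per_trace * bytes_per_sample)
        for i in range(n_traces):
            offset = SEGY_FILE_HEADER_BYTES + i * trace_bytes
            self.local_index[first_trace + i] = (file_name, offset)

    def locate(self, trace_no):
        return self.local_index[trace_no]


class MasterNode:
    """Hypothetical master node holding the first-level index."""

    def __init__(self):
        self._starts = []   # sorted first trace number of each range
        self._ranges = []   # parallel list of (last trace number, node)

    def register(self, first_trace, last_trace, node):
        # Keep both lists sorted by the first trace number of the range.
        pos = bisect_right(self._starts, first_trace)
        self._starts.insert(pos, first_trace)
        self._ranges.insert(pos, (last_trace, node))

    def find_node(self, trace_no):
        # Binary search for the range that could contain trace_no.
        pos = bisect_right(self._starts, trace_no) - 1
        if pos < 0:
            raise KeyError(trace_no)
        last_trace, node = self._ranges[pos]
        if trace_no > last_trace:
            raise KeyError(trace_no)
        return node


def lookup(master, trace_no):
    """Query level by level: master node first, then the storage node."""
    node = master.find_node(trace_no)
    file_name, offset = node.locate(trace_no)
    return node.name, file_name, offset
```

In this sketch the master node keeps only one entry per trace range, so its index stays small, while each storage node resolves the exact file and byte offset for the traces it holds.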
Keywords/Search Tags: massive data, distribution and hierarchy, small files, SEG-Y format, two-level index, FastDFS, Hadoop DFS