| Geological borehole data are an important resource for the national geological industry, characterized by large volume, diverse types, and high value. In response to problems in the management and use of geological drilling data in China, such as decentralized management, inefficient methods, and difficulty of sharing and use, this paper studies a distributed-cluster-based method for managing geological drilling data through literature research, experimental comparison, and algorithm improvement. It proposes combining data management and query optimization to build a geological drilling data platform that makes data management and use efficient and convenient. The method is divided into three parts: platform architecture, drilling data organization, and drilling data query optimization. First, drawing on the business process of borehole data management and on literature research, we build a Citus DB distributed cluster architecture that uses node division and data sharding to form a system of separately deployed, centrally managed, and comprehensively shared data, and we synchronize data among multiple nodes using the Raft algorithm to manage the replication logs. Second, for the spatial data of boreholes, we propose an improved GeoHash index algorithm and a uniform grid algorithm based on grid edges; by computing the envelope rectangle to filter candidate points and by saving the set of intersection coordinates, these improve the efficiency of point-neighborhood judgment and point-in-polygon determination. To address the storage and loading efficiency of high-resolution drilling histograms, a multi-threaded slicing algorithm is designed. For drilling attribute data, the paper designs a storage scheme that combines files and a database, describes the data in a lightweight form using XML structured text, and improves the efficiency of network transmission and publishing of the data. Third, to address the low efficiency of drilling data queries, this paper proposes a query optimization algorithm based on an improved genetic algorithm: the Fisher-Yates shuffle algorithm improves the quality of the initial population, a selection operator with a skewed probability density retains high-quality parents, and an optimal crossover operator is chosen to search for the optimal join path. We also propose an improved semi-join-based SDD-1 algorithm for distributed queries. Using the equivalence principles of semi-joins and relational algebra, it improves query efficiency by adding local projection and parallel execution to raise the semi-join benefit and by minimizing the network transmission cost. Finally, the research results are applied to the design and development of a geological drilling service platform, achieving efficient management and sharing of millions of geological drilling records. |
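
The envelope-rectangle filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's improved GeoHash or uniform grid algorithm. It only shows the general idea, assuming 2D planar coordinates and a simple polygon: a cheap bounding-box test discards most points before a standard ray-casting point-in-polygon test runs on the remaining candidates. All function names (`envelope`, `in_envelope`, `point_in_polygon`, `boreholes_in_polygon`) are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def envelope(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned envelope (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_envelope(p: Point, env: Tuple[float, float, float, float]) -> bool:
    """Cheap pre-filter: is the point inside the envelope rectangle?"""
    x, y = p
    min_x, min_y, max_x, max_y = env
    return min_x <= x <= max_x and min_y <= y <= max_y


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Standard ray casting: count crossings of a ray going in +x direction."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge straddles the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def boreholes_in_polygon(points: List[Point], polygon: List[Point]) -> List[Point]:
    """Filter by envelope first, then run the exact test on candidates only."""
    env = envelope(polygon)
    candidates = [p for p in points if in_envelope(p, env)]
    return [p for p in candidates if point_in_polygon(p, polygon)]


if __name__ == "__main__":
    # Hypothetical L-shaped query region and borehole locations.
    region = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
    holes = [(1, 1), (3, 3), (5, 5), (1, 3)]
    print(boreholes_in_polygon(holes, region))
```

In this sketch the point `(5, 5)` is rejected by the envelope test alone, while `(3, 3)` passes the envelope test and is rejected only by the exact test. The saving comes from the first kind of point, which is why the pre-filter pays off when most boreholes lie far from the query region.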