| Geology is a natural science that studies how the Earth evolves. With the lithosphere as its main object of study, geology explores the material composition, internal structure, external features, interactions among the spheres, and evolutionary history of the Earth’s layers, all of which are of great importance to human socio-economic life. Geological research has so far accumulated large amounts of unstructured data, such as text, graphics, images, audio, and video. Among these, textual data is an important part of geological big data and contains rich expert experience and domain knowledge. Effectively managing and fully utilizing this knowledge, so as to expand the cognitive space of geology and improve its intelligence and cognitive capability, is an inevitable trend in current geological research. This dissertation therefore takes geological knowledge expressed in natural language as its research object, aims at the spatio-temporal analysis and visualization of the Earth's evolution, takes into account the variability of geological phenomena across spatio-temporal semantic scales, and establishes and evaluates methods for modeling and objectified representation of geological knowledge in unstructured data. Current spatio-temporal data models, such as the real-time GIS spatio-temporal data model, the spatio-temporal cube model, and the object-oriented model, are all organized and stored from a data perspective for particular geological applications or practical geological problems. The representation of geological processes and spatio-temporal structure has improved as research has progressed; however, existing spatio-temporal data models can hardly describe the “complexity” and “non-linear” characteristics of geological processes. Effectively extracting the factual geological knowledge that exists in natural language and expressing it as discrete data would benefit deep mining and knowledge discovery in unstructured geological text. Against this background, the dissertation starts from the multi-temporal-scale cognition of geological phenomena, takes the objectified expression of geological phenomena as its core, and constructs an objectified expression model of geological objects based on multi-scale spatio-temporal constraints. It also studies the effective extraction of geological knowledge from unstructured text, which provides data support for the constructed geological object knowledge model. For information extraction from unstructured text, the dissertation first addresses basic Chinese geological disambiguation, establishing a Chinese geological disambiguation model based on domain knowledge and a cyclic self-learning strategy to cope with the shortage of annotated corpora in the geological domain. Building on the disambiguation results, it then establishes a trigger-word-based geological entity recognition model that exploits the trigger features of geological entities. Next, using a distant supervision strategy, it builds two relation extraction models: a sentence-level model based on a pre-trained language model (i.e., BERT) for sentence-level relation triples, and a cross-sentence model based on document features and a graph convolutional network (i.e., GCN) for document-level relation triples. Finally, based on the constructed geological knowledge base, the dissertation designs and implements a prototype application. Through visual analysis of geological application results, such as stratigraphic cognition and the distribution of ophiolite, it discusses the real-world value of the application and the potential of the constructed objectified knowledge model. |