| The unique pore structure of zeolites gives them excellent catalytic performance, high thermal stability, and distinctive tunability, which make them industrially important in crude oil cracking, gas adsorption, ion exchange, and other areas. However, because the raw materials for zeolite synthesis are diverse and the reaction process is complex, no consensus has been reached on the synthesis mechanism, and traditional synthesis relies heavily on the chemical intuition of scientists, at a high cost in trial and error. With the arrival of the information age, the field of materials databases is developing rapidly. The scientific and technical literature contains a large amount of unstructured information, and traditional manual extraction can guarantee neither accuracy nor efficiency. Natural language processing bridges the gap between the unstructured information in scientific and technical papers and regularized, structured database records. Many scientists are working to collate existing unstructured experimental data, construct regular databases of materials synthesis, and use the resulting structured data to uncover patterns that humans do not easily perceive. However, because the texts are complex and writing styles vary widely, it is extremely difficult to extract zeolite synthesis information efficiently and accurately from large volumes of literature, and no text-mined synthesis database yet exists for zeolites. A workflow for the automated construction of zeolite synthesis databases would facilitate the exploration of novel zeolite materials. Classifying complex zeolite synthesis passages and extracting information from them in a targeted way compresses the high information entropy of the synthesis content concealed in the scientific and technical literature on zeolites. Refining and integrating 
highly structured zeolite synthesis data is an optimal process for materials database construction. Exploratory meta-analysis, data mining, and machine learning that take the highly structured database as input are expected to help experimentalists explore synthesis patterns and structure-property relationships, providing experience and guidance for related experiments. This thesis focuses on developing a method for automatically extracting zeolite synthesis information and on constructing a zeolite synthesis database. We deconstruct the textual information about zeolite synthesis in scientific and technical papers using rule-based natural language processing techniques, and build an automatic construction process for the zeolite synthesis database. The data are then analyzed in depth with machine learning-based data mining to explore the regularities of zeolite synthesis and provide theoretical guidance for experimental synthesis. The specific work comprises the following three aspects: 1. An automatic construction pipeline for the zeolite synthesis database was built, taking scientific and technical literature through to database entries automatically. The literature was first collected and named uniformly; after pre-processing, the zeolite synthesis text was split and classified. A lexicon was then built to assist in labelling information, with dictionaries categorizing the compounds that may be used in zeolite synthesis, and an update module was constructed to identify unclassified raw materials. The passages were labelled for easy extraction and identification. The information in the synthesis passages was deconstructed and a targeted strategy was formulated for each type of expression, so that simple expressions, expressions containing proportions, and expressions containing unknowns are each processed effectively. Further routines collect synthesis 
conditions, and formulate extraction strategies according to the nature of the product name. The resulting database was evaluated and self-calibrated to find possibly missing data, and each entry was automatically supplemented with notes. The automatic construction process established for the zeolite synthesis database can continue to absorb newly published literature and extract information to supplement the database, automating data processing. 2. The largest existing zeolite synthesis database was constructed. It contains 5031 synthesis routes, covering 96 topologies, extracted from the zeolite synthesis literature collected from January 2015 to December 2023. The database is large and follows a strict, uniform format, so experimentalists can access and search it for practical use. The data distribution of the database was analyzed step by step. The proportions of framework elements and of the corresponding zeolite species were first analyzed for the database as a whole; the database was then divided into aluminosilicate, SAPO, and pure-silica zeolites, and each group was analyzed by zeolite species, topology type, heteroatom type, and commonly used organic structure-directing agents (templates). Specific zeolite species were explored for the five most common types of zeolites, which helps provide theoretical guidance for synthesizing different types of zeolites experimentally. 3. The relationship between synthesis parameters and product properties was investigated by applying machine learning techniques. The distribution of different kinds of zeolites synthesized without templates in the ternary phase diagram was explored. Differences among aluminosilicate zeolites synthesized without templates were compared in terms of the proportions of reactant elements and the crystallization conditions, providing theoretical guidance for experiments and indicating directions for exploration. The correlation 
between reaction parameters and product properties was explored for ZSM-5 and SAPO-34 zeolites. The data were also analyzed with machine learning models to provide theoretical guidance for the experimental synthesis of ZSM-5 and SAPO-34 zeolites with different product properties. By combining the ternary phase diagram with a decision tree model, the synthesis regions corresponding to different silica-aluminum ratios of template-free ZSM-5 zeolite were successfully predicted. The feed compositions at other points within a region can serve as alternative starting points for synthesizing ZSM-5 within a specific range of silica-aluminum ratios, offering experimentalists a new synthesis approach. |
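
The rule-based handling of expressions containing proportions and unknowns, mentioned in the first aspect above, can be illustrated with a minimal sketch. This is not the thesis's actual implementation: the regular expression, the function name `parse_molar_ratio`, and the example gel composition are all assumptions made purely for illustration.

```python
import re

# Hypothetical rule (an assumption, not the thesis's pattern): each term is an
# optional coefficient (a number, or a single lowercase letter standing for an
# unknown) followed by a chemical formula such as SiO2 or Al2O3.
TERM = re.compile(
    r"^\s*(?P<coef>\d+(?:\.\d+)?|[a-z])?\s*(?P<species>[A-Z][A-Za-z0-9()]*)\s*$"
)


def parse_molar_ratio(expression):
    """Split a composition such as '1 SiO2 : 0.05 Al2O3 : x H2O'
    into a list of (coefficient, species) pairs."""
    terms = []
    for chunk in expression.split(":"):
        match = TERM.match(chunk)
        if match is None:
            raise ValueError(f"unrecognized term: {chunk!r}")
        coef = match.group("coef")
        if coef is None:
            value = 1.0          # a bare formula is read as coefficient 1
        elif coef.isalpha():
            value = coef         # an unknown stays symbolic
        else:
            value = float(coef)  # an explicit numeric proportion
        terms.append((value, match.group("species")))
    return terms


if __name__ == "__main__":
    print(parse_molar_ratio("1 SiO2 : 0.05 Al2O3 : x H2O"))
```

Keeping unknowns symbolic rather than discarding them lets a later step resolve them from elsewhere in the passage, which is one plausible way to treat simple, proportional, and unknown-containing expressions with separate rules.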