| With the rapid development of information technology,data is showing an explosive growth trend,and the application of databases is becoming more and more widespread,which drives the vigorous development of the entire database industry and continuously optimizes database performance.Database query optimization is crucial for database performance,and cardinality is the core parameter of query optimization,and its estimation accuracy directly affects the quality of query optimization.Traditional cardinality estimation methods need to be based on various assumptions and are not suitable for complex query optimization scenarios,which can cause significant estimation errors and seriously affect query performance.With the rapid development of artificial intelligence,learning based database technology has rapidly emerged,including learning based cardinality estimation,which can effectively handle cardinality estimation problems in complex query environments.However,the interpretability of learning based models used to evaluate cardinality is poor,leading to inaccurate cardinality estimation.This thesis proposes a causal inference based cardinality estimation method called Causalcard to address the issues of poor interpretability and significant errors in learning based cardinality estimation in scenarios with high data correlation.The main research content of this article is as follows:1.Build a tree model algorithm for query plans.By conducting research on mainstream articles related to cardinality estimation,analyzing and comparing them,the research plan is determined to start with a query plan tree,propose rules for constructing a tree model,carry out tree model construction and bottom-up optimization processing,and conduct in-depth analysis of model details.2.Propose a tuning algorithm based on causal inference.Through in-depth analysis of theoretical research on causal inference,the DAG with no trees algorithm was selected as the causal graph construction algorithm,and a cardinality estimation algorithm based on causal inference was proposed.The algorithm intervenes in the selection of the optimal operator algorithm in the tree model from the perspective of causal inference,evaluates the causal effects after intervention,obtains the current optimal operator node algorithm,updates the operator nodes from bottom to top,and obtains the cardinality estimation result of the entire tree structure model.3.The method has been extensively validated through experiments on both real and synthetic datasets,and compared with existing classical learned cardinality estimation algorithms and the cardinality estimation in mainstream databases.The experimental results show that this algorithm can effectively improve the accuracy of cardinality estimation.4.Based on the algorithm proposed in this article called Causalcard,a cardinality estimation system based on causal inference was designed and implemented.This system visualizes the cardinality estimation model and achieves online cardinality estimation.It can accurately estimate a given SQL statement in real-time and present the results. |