| Cardinality estimation is a key part of query optimization and directly affects the choice of execution plan. As data volumes have exploded, cardinality estimation errors in traditional databases have reached several orders of magnitude. The rapid development of machine learning in recent years has made it possible to apply learned models to cardinality estimation, and recent research shows that learning-based cardinality estimation methods are far more accurate than traditional ones. However, existing learned models place very high demands on computing resources and data, and they struggle to produce accurate estimates for multi-table joins. In the era of cloud databases, multi-table queries will only become more frequent, so making accurate learning-based cardinality estimates for multi-table queries has become an urgent problem for industry. To address this problem, this paper proposes a Cardinality Estimator Based on Join Encoding Sum-Product Network and Multi-head Attention (JMA-CSB), which handles multi-table queries effectively while maintaining accuracy and efficiency, and which has good stability. First, the model uses a Join-encoding Sum-Product Network (JSPN) to learn the joint distribution of the data, constructs a join graph between tables by computing the Randomized Dependence Coefficient (RDC), and applies graph embedding to tables with strong connections. Second, the model uses a multi-head attention mechanism to learn from the query sequence and the graph vectors, and finally outputs the estimated cardinality. Compared with other learned networks, the attention mechanism takes less time to train and is more accurate. In addition, the model adds positional embeddings during word embedding, so that the multi-head attention mechanism can capture the positions of the word vectors in the sequence, which addresses the information loss that occurs in multi-table queries. The accuracy and efficiency of the model are verified on the real-world data sets IMDB and JOB-light, respectively. The experimental results show that the accuracy of the model is more than 10 times higher than that of traditional cardinality estimation methods. Compared with learning-based cardinality estimation methods, the latency of estimating the cardinality of a single query is reduced to less than 5 milliseconds, while the error rate matches that of the most advanced learning-based methods. The model also performs particularly well on multi-table queries: on the JOB dataset, which contains multi-join relationships, its estimation accuracy is 1.29 times that of the current best model. In summary, JMA-CSB is efficient and accurate, and it effectively solves the cardinality estimation problem for multi-table queries. |
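
The role of positional embedding described in the abstract can be illustrated with a minimal sketch. It is not the JMA-CSB implementation: the sinusoidal encoding, the random projection weights, the dimensions, and all function names (`positional_encoding`, `multi_head_attention`, `encode`, `pooled`) are assumptions made for illustration, and the random `tokens` merely stand in for embedded query tokens. The sketch shows that pooled self-attention output does not change when the token order is reversed, unless positional information is added first.

```python
import math
import random


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding (an assumed choice; the paper's scheme may differ)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x, wq, wk, wv):
    """Scaled dot-product self-attention for a single head."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    return matmul([softmax(r) for r in scores], v)


def multi_head_attention(x, heads):
    """Run every head and concatenate the per-token outputs."""
    outputs = [attention_head(x, *h) for h in heads]
    return [[v for out in outputs for v in out[t]] for t in range(len(x))]


def encode(tokens, heads, use_positions):
    """Optionally add positional encodings, then apply self-attention."""
    x = tokens
    if use_positions:
        pe = positional_encoding(len(tokens), len(tokens[0]))
        x = [[a + b for a, b in zip(t, p)] for t, p in zip(tokens, pe)]
    return multi_head_attention(x, heads)


def pooled(rows):
    """Mean-pool the token representations into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_head, n_heads = 8, 4, 2  # hypothetical sizes
heads = [tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
         for _ in range(n_heads)]
tokens = random_matrix(5, d_model, rng)  # stand-in for embedded query tokens
reversed_tokens = tokens[::-1]

gap_without = max_abs_diff(pooled(encode(tokens, heads, False)),
                           pooled(encode(reversed_tokens, heads, False)))
gap_with = max_abs_diff(pooled(encode(tokens, heads, True)),
                        pooled(encode(reversed_tokens, heads, True)))
print("order sensitivity without positions:", gap_without)
print("order sensitivity with positions:", gap_with)
```

Without positional encodings the two pooled vectors agree up to floating-point rounding, so the order of the query sequence is invisible to the attention layers. With them the outputs differ, which is the property the abstract relies on for multi-table queries.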