
Research And Application Of NL2SQL Model Based On Semantic Path Attention Network

Posted on: 2024-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: H R Zhou
GTID: 2568307076485334
Subject: Computer Science and Technology
Abstract/Summary:
NL2SQL (Natural Language to SQL) technology automatically converts natural language questions into SQL queries over a database, and it has a wide range of applications in daily life. Single-table query models for the NL2SQL task have achieved 93.0% accuracy on the WikiSQL dataset, surpassing human-level performance. However, existing multi-table query models perform worse than single-table models because of the complex relationships between tables. At present, most multi-table query methods encode the question and the database information into a heterogeneous graph and then decode SQL statements based on a tree structure. These methods have the following problems: 1) During heterogeneous graph encoding, important semantic path information is ignored, which is likely to cause semantic loss. 2) Connections cannot be established between nodes with similar semantics, so important edges are missing from the heterogeneous graph; in addition, the pruning strategy in the decoding process is imperfect and easily leads to invalid searches.

This paper conducts the following research on these problems:

1) This paper introduces the concept of a semantic path to better represent the multi-level relationships between questions and tables, and between tables and fields. On this basis, it proposes an NL2SQL model based on a semantic path attention network to address the problem of missing semantics. The model adopts a two-layer heterogeneous graph neural network. The lower layer computes node-level attention: under each semantic path, it assigns different weights to neighbor nodes in order to distinguish the important ones. The upper layer computes semantic-path-level attention: it distinguishes the important semantic paths and fuses their semantic information, enriching the encoded representation of the heterogeneous graph.

2) When constructing the heterogeneous graph, this paper introduces weighted edges to represent the similarity between words, and considers the relationship between synonyms and nodes in order to connect important nodes in the graph. In the decoding stage, it proposes a new pruning strategy that improves the correctness of the generated SQL in two respects: the validity of SQL keywords and the legality of SQL syntax. Finally, the complete model is evaluated on a public English dataset. Compared against the benchmark model, it reaches an execution accuracy of 76.1% and a keyword matching accuracy of 75.8%.

3) This paper further explores the application of NL2SQL in a specific domain. The model is applied to a dataset from the electric power field, and the method is adjusted and optimized for the problems unique to that field. Comparative experiments with benchmark models in the field show that the proposed model not only performs well in general domains but also has good domain adaptability, and can be easily extended to solve problems in other fields.
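The two-layer attention described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the dot-product scoring stands in for whatever learned scoring functions the model uses, and the function names, the `query` vector, and the path names (`question-table`, `table-column`) are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Combine equal-length vectors using the given weights."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)]

def node_level_attention(target, neighbors):
    """Lower layer: weight the neighbors reached via ONE semantic path.

    Assumption: scores are plain dot products; a trained model would
    use learned parameters here.
    """
    weights = softmax([dot(target, n) for n in neighbors])
    return weighted_sum(weights, neighbors), weights

def path_level_attention(path_summaries, query):
    """Upper layer: weight the per-path summaries and fuse them."""
    weights = softmax([dot(query, p) for p in path_summaries])
    return weighted_sum(weights, path_summaries), weights

def encode_node(target, neighbors_by_path, query):
    """Run both layers for one node; returns (fused vector, path weights)."""
    names = sorted(neighbors_by_path)
    summaries = [node_level_attention(target, neighbors_by_path[name])[0]
                 for name in names]
    fused, weights = path_level_attention(summaries, query)
    return fused, dict(zip(names, weights))

# Toy example with 2-dimensional embeddings (all values are made up).
target = [1.0, 0.0]
neighbors_by_path = {
    "question-table": [[1.0, 0.0], [0.0, 1.0]],
    "table-column": [[0.5, 0.5]],
}
query = [1.0, 1.0]
fused, path_weights = encode_node(target, neighbors_by_path, query)
print(fused, path_weights)
```

The lower function is applied once per semantic path, so each path yields its own summary of the node's neighborhood; the upper function then decides how much each path contributes to the final node encoding.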
Keywords/Search Tags: two-layer attention network, heterogeneous graph encoding, semantic path, semantic analysis