| Since its introduction, the Knowledge Graph (KG) has made great progress in fields such as finance, security, medicine, and agriculture, owing to its advantages in knowledge density, expressive power, and reasoning efficiency. However, applications of KGs in the military field are mainly at the construction stage, and in-depth application still needs further exploration. Meanwhile, Question Answering (QA) can present target information in a more intuitive and friendly way, which makes it well suited to meeting military information demands. To this end, this thesis focuses on Question Semantic Parsing (QSP) technologies for QA over military knowledge. QSP aims to transform a natural language question into a logical form that can be executed on a knowledge base. The task has two crucial prerequisites: 1) a knowledge base and a QA corpus for the Chinese military field; 2) a QSP model with superior performance. The main work of this thesis therefore revolves around preparing these two prerequisites, which raises the following challenges: (1) Preparing the knowledge base and QA corpus. Existing candidates usually target the English general domain, so a new knowledge base and QA corpus must be constructed for the Chinese military field. However, facts in the military field often involve elements such as time, space, quantity, and state. Existing methods that express knowledge as triples cannot fully express these facts, and they also create obstacles to storing and updating knowledge. In addition, QA over such facts introduces a new complexity dimension that existing corpora cannot fully support. (2) Preparing the QSP model. Mainstream models are mostly based on the sequence-to-action framework. These models have achieved significant results but still have certain shortcomings: 1) action sequence generation and instance sequence generation are pipelined and therefore suffer from error accumulation; 2) they require a complex action grammar to be defined and an action sequence to be annotated for
each question; 3) they require an excessive number of templates to convert the action sequence and instance sequence into a query; and, during decoding, 4) the decoder cannot fully incorporate sequence type information, and 5) the target vocabulary is too large. To meet these challenges, this thesis carries out the following work: (1) To overcome the shortcomings of triple-based knowledge representation, this thesis proposes expanding triples into multiples by adding attributes to relations, and using multiples to represent knowledge. It defines knowledge represented by traditional triples as entity-centric knowledge, and knowledge represented by multiples as event-centric knowledge in a broad sense. On this basis, this thesis constructs Mil KB, a Chinese military knowledge base that covers both knowledge types. Based on Mil KB, it constructs Mil KBQA, a Chinese military QA corpus. Mil KBQA contains various types of questions, involving reasoning over logic, quantity, calculation, and probability. Importantly, Mil KBQA introduces a new complexity dimension, namely QA over event-centric knowledge. Three recent strong baseline models in natural language understanding (NLU) are used as benchmark models, and two other NLU datasets are used for comparison in the QSP experiments. The experiments show that the questions in Mil KBQA, especially those on event-centric knowledge, have higher semantic parsing complexity. (2) To overcome the shortcomings of sequence-to-action models, this thesis proposes Form Cypher, an Encoder-Decoder based QSP model oriented to the characteristics of logic structure. To address error accumulation, the high cost of annotation, and the inability to obtain a query directly, Form Cypher uses a question format instead of a semantic graph (tree) to represent the logic structure of a question, and decomposes QSP into two parallel tasks: logic structure matching and slot sequence generation. To address the insufficient integration of sequence type information
in the decoder, this thesis proposes a Type-guided GRU based decoder that exploits sequence types to guide the decoding process. To deal with the excessively large target vocabulary in decoding, this thesis proposes a uniform encoding space mechanism, which encodes the schema elements and the mentions in the question in a unified space. With a sequence-to-action model as the benchmark, this thesis evaluates Form Cypher on Mil KBQA. Experimental results show that Form Cypher outperforms the benchmark on Mil KBQA. This thesis also demonstrates the validity and necessity of the technical details of Form Cypher through verification experiments and ablation experiments, respectively. By following the QSP route and carrying out the above work, this thesis constructs a QA model that can answer common questions in the military field. |
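
As a rough illustration of two ideas summarized above, the following minimal Python sketch shows (a) how a triple might be extended into a "multiple" by attaching attributes to the relation, and (b) how a question format with slots might be filled to yield a Cypher-style query directly. All names here (`Triple`, `Multiple`, `fill_format`, `QUERY_FORMAT`, and the example entities and relation) are hypothetical and chosen only for illustration; they are not taken from Mil KB, Mil KBQA, or Form Cypher, and the actual schema and model are not specified in this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """Entity-centric fact: (head, relation, tail)."""
    head: str
    relation: str
    tail: str


@dataclass(frozen=True)
class Multiple:
    """Event-centric fact: a triple whose relation carries attributes.

    `attributes` holds (name, value) pairs such as time or quantity.
    This layout is an assumption made for the sketch.
    """
    head: str
    relation: str
    tail: str
    attributes: tuple = ()

    def as_triple(self) -> Triple:
        # Dropping the relation attributes recovers a plain triple.
        return Triple(self.head, self.relation, self.tail)


# Hypothetical question format: the logic structure is fixed,
# and only the slots in single braces remain to be generated.
QUERY_FORMAT = (
    "MATCH (a {{name: '{head}'}})-[r:{relation}]->(b) "
    "WHERE r.{attr} = '{value}' RETURN b.name"
)


def fill_format(question_format: str, slots: dict) -> str:
    """Fill a matched question format with a generated slot sequence."""
    return question_format.format(**slots)


if __name__ == "__main__":
    fact = Multiple("Ship A", "DOCKED_AT", "Port B", (("time", "2020"),))
    print(fact.as_triple())
    slots = {"head": fact.head, "relation": fact.relation,
             "attr": "time", "value": "2020"}
    print(fill_format(QUERY_FORMAT, slots))
```

In this sketch, choosing the format corresponds loosely to logic structure matching and producing the `slots` dictionary corresponds loosely to slot sequence generation; how the thesis actually performs either step is not described here.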