
Research On Sentence Level Text Processing Technology For Military Field

Posted on: 2019-09-04  Degree: Master  Type: Thesis
Country: China  Candidate: N W Si  Full Text: PDF
GTID: 2382330566471015  Subject: Military Equipment
Abstract/Summary:
Unstructured text is the main carrier of our military's policies, orders, instructions and other information, and plays an important role in military informatization. For a long time, military text has been processed mainly by hand, which is limited by human expertise and efficiency and is increasingly incompatible with the requirements of military informatization and intelligentization. How to process military text automatically and intelligently has become an urgent research problem. Natural language processing (NLP), an interdisciplinary field combining linguistics, mathematics and computer science, has developed rapidly in recent years. It aims to process and understand human language text accurately with computers, and as an efficient text-processing method it has broad application prospects for military text. Based on an analysis of the characteristics of military texts, this thesis applies statistical learning and deep neural network models from NLP to military text segmentation, part-of-speech tagging and dependency parsing, aiming to convert unstructured text into an intermediate form that is easy for computers to understand and to lay the foundation for subsequent work.

Because military text contains many domain terms, and many of them are long, general-purpose segmentation models perform poorly when applied directly. To address this, a word segmentation scheme that combines a statistical model with a domain dictionary is designed, based on an analysis of the features of military texts and terms. Building on an existing Conditional Random Field (CRF) word segmentation model, the scheme uses a long-word position tagging method for long domain terms and uses a dedicated domain dictionary to re-correct the initial results, improving the recognition rate of domain terms. Experiments on a small-scale domain corpus show that the scheme outperforms direct CRF segmentation and has good scalability.

To remove the dependence on hand-crafted features in traditional statistical part-of-speech tagging models, this thesis proposes an efficient attention-based Long Short-Term Memory (LSTM) model for part-of-speech tagging. An attention mechanism is introduced in the hidden layer to assign different weights to the hidden units at different time steps, so that the hidden layer attends more to important features. A state transition probability matrix is introduced in the output layer, which uses the transition features between tags to improve global decoding. Experimental results show that the tagging accuracy of the model is close to that of the best existing model, while the model has a simpler structure and requires no hand-designed features.

To address the insufficient attention to global structural features in existing LSTM dependency parsing models, this thesis proposes a parsing model that incorporates global vector features. In this model, a Convolutional Neural Network (CNN) with segment-based pooling is designed to extract a global vector feature, which is added to the LSTM dependency parsing model to improve its ability to capture global structure. Experimental results show that, compared with existing dependency parsing models that use only an LSTM or a CNN, the proposed model effectively improves parsing accuracy while maintaining parsing efficiency.
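
The abstract does not say how the domain dictionary re-corrects the initial CRF segmentation, so the following is only a minimal sketch of one plausible approach: merging adjacent tokens whose concatenation is a dictionary term, preferring the longest match. The function name `recorrect_with_dictionary`, the `max_span` parameter, and the sample dictionary entry are hypothetical and are not taken from the thesis. The sketch only repairs over-split terms and does not handle boundaries that fall inside a token.

```python
def recorrect_with_dictionary(tokens, domain_terms, max_span=6):
    """Merge adjacent tokens whose concatenation is a domain-dictionary term.

    tokens       -- initial segmentation (e.g. the output of a CRF segmenter)
    domain_terms -- set of known domain terms (hypothetical dictionary)
    max_span     -- maximum number of adjacent tokens considered for one merge
    """
    merged = []
    i = 0
    while i < len(tokens):
        match_end = None
        # Try the longest candidate span first so longer terms take priority.
        # Spans cover at least two tokens; a single token needs no merging.
        for j in range(min(len(tokens), i + max_span), i + 1, -1):
            if "".join(tokens[i:j]) in domain_terms:
                match_end = j
                break
        if match_end is None:
            merged.append(tokens[i])
            i += 1
        else:
            merged.append("".join(tokens[i:match_end]))
            i = match_end
    return merged


# Illustrative input: a long term that the initial segmenter split into pieces.
initial = ["地", "空", "导弹", "发射"]
dictionary = {"地空导弹"}  # hypothetical dictionary entry
corrected = recorrect_with_dictionary(initial, dictionary)
```

Here `corrected` holds the merged term followed by the untouched final token. A longest-match-first policy is a common choice for dictionary post-processing because it avoids stopping at a shorter term that is a prefix of a longer one.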
Keywords/Search Tags:natural language processing, word segmentation, dependency parsing, conditional random field, neural network