Research Of Kazakh Parsing Based On Span

Posted on: 2020-06-29

Degree: Master

Type: Thesis

Country: China

Candidate: W Chai

Full Text: PDF

GTID: 2415330590954692

Subject: Computer application technology

Abstract/Summary:

With the development of neural network technology, parsing of Kazakh has made great progress. As rule-based and statistical parsing methods have gradually been combined with neural network techniques, the accuracy of Kazakh parsing has improved considerably. Parsing methods fall into two main families: transition-based parsing and chart-based parsing. The methods used in this thesis build on both families and improve the accuracy of Kazakh parsing.

In the transition-based method, the span is the basic unit. The shift-reduce system has two main kinds of operation: structure actions and phrase label actions. A structure action records the split points of a span, and a phrase label action assigns a phrase tag to the span. On top of this transition system, a BiLSTM neural network is used to extract span features, and the parameters are trained with a multi-layer perceptron. Dynamic programming, the greedy algorithm, and beam search are used for decoding, and their efficiency is compared. The experimental results lead to the following conclusions: 1) When a BiLSTM is used to extract span features, a two-layer BiLSTM captures more feature information than a single-layer BiLSTM. 2) Greedy decoding is faster but less accurate, while beam search gives better parsing accuracy. 3) Choosing an appropriate beam size is important, because the beam size affects parsing accuracy. Based on the experiments in this thesis, a beam size of 20 is chosen.

In the chart-based method, a BiLSTM neural network is again used to extract features and a multi-layer perceptron is used to train the parameters. The structure score and the label score are trained separately, each with its own penalty function, and the CKY algorithm is used for decoding. In chart-based parsing, both the hidden-layer size of the BiLSTM and the sentence length influence parsing accuracy. The experiments lead to the following conclusions: 1) As the hidden-layer size of the BiLSTM increases, the parsing results improve, but beyond a size of 200 the improvement is no longer significant, so a hidden-layer size of 200 is chosen. 2) Sentence length also affects parsing accuracy. Longer sentences usually give worse results, mainly because long sentences have more complex phrase structure and are therefore harder to parse.
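
As an illustration of the chart-based decoding step mentioned in the abstract, here is a minimal Python sketch of CKY-style decoding over span scores. It is not the thesis's implementation. The function name `cky_decode`, the `span_score` callback, the use of a single combined score per span, and the toy scores are all assumptions introduced for illustration. In the thesis, the scores would be produced from BiLSTM span features by the multi-layer perceptron.

```python
from functools import lru_cache


def cky_decode(n, span_score):
    """Find the highest-scoring binary tree over a sentence of n words.

    span_score(i, j) is a hypothetical callback returning (label, score)
    for the span covering words i .. j-1. Returns (total_score, tree),
    where tree is a nested tuple (label, i, j, children).
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        label, score = span_score(i, j)
        if j - i == 1:
            # A single word is a leaf span with no split point.
            return score, (label, i, j, ())
        best_total, best_children = None, None
        # Try every split point k and keep the best pair of sub-trees.
        for k in range(i + 1, j):
            left_score, left_tree = best(i, k)
            right_score, right_tree = best(k, j)
            total = left_score + right_score
            if best_total is None or total > best_total:
                best_total, best_children = total, (left_tree, right_tree)
        return score + best_total, (label, i, j, best_children)

    return best(0, n)


# Toy scores for a three-word sentence (made up for illustration only).
TOY_SCORES = {
    (0, 1): ("N", 1.0),
    (1, 2): ("V", 1.0),
    (2, 3): ("N", 1.0),
    (0, 2): ("X", 0.0),
    (1, 3): ("VP", 2.0),
    (0, 3): ("S", 1.0),
}


def toy_span_score(i, j):
    return TOY_SCORES[(i, j)]


score, tree = cky_decode(3, toy_span_score)
print(score)
print(tree)
```

With these toy scores, the decoder prefers splitting the sentence after the first word, because the span (1, 3) scores higher than the span (0, 2). Because every span is scored once and every split point is tried, the running time grows cubically with sentence length, which is the usual cost of exact chart decoding compared with greedy or beam-search decoding in a transition system.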

Keywords/Search Tags:

span, BiLSTM, multi-layer perceptron

Related items

1	Design And Implementation Of Virtual Restoration System For Damaged Cultural Relics
2	Research And Application Of Individual Multi-layer Movie Recommendation Algorithm Based On The Graph
3	Research On Multi-layer Complements In Mandarin And Its Classroom Teaching Of Chinese As A Foreign Language
4	A Study On The Multi-layer In Chopin's Preludes (Op.28)
5	The Multi Material Application And Performance In Tempera Substrate Layer
6	Cross-Layer Interaction Network For Chinese Calligraphy Style Classification
7	The Relationship Between Perception, Memory And Reading
8	Study Of Chinese Classcial Literature Text Theory
9	A Study On Word Order Of Multiple One-layer Prepositional Phrases In Modern Chinese
10	Research On The Emotion Of Infographic Design