| With the continuous development of natural language processing technology, semantic analysis has become a hot and difficult problem in the field, and the frame disambiguation task, as a key part of frame semantic analysis, has also received extensive attention from researchers. At present, most studies in frame disambiguation treat it as a classification problem. By using classification models common in machine learning (such as the support vector machine (SVM) and the maximum entropy model) to classify the target words to be disambiguated, good results have been achieved. However, the existing classification models also have some problems. For example, the model classifies the target words as independent individuals, so it cannot make effective use of the implicit relationships between them; the classification results depend on the model's performance and parameter settings; and the feature labels are independent of each other and have little correlation. All of these problems affect the stability of frame disambiguation results. Therefore, this paper constructs a two-stage frame disambiguation model based on SVM and conditional random fields (CRF), and introduces latent Dirichlet allocation (LDA) topic features as implicit features of the text. Experimental results show that the proposed method can effectively address the above problems. Aiming at the English frame disambiguation task, the main research contents and results of this paper are as follows: (1) This paper proposes an English frame disambiguation model based on a two-stage SVM and CRF model. Unlike traditional classification models, the SVM and CRF models are combined into a two-stage model. In the first stage, the SVM model performs a rough classification of the input corpus and produces a classification label sequence. In the second stage, the CRF model takes the text sequence and the SVM classification label sequence as input, and adds the classification labels to the feature template for further
sequence labeling. In this paper, 35 representative ambiguous words in FrameNet were selected, and a total of 3492 sentences were extracted from FrameNet and ACL conference papers for frame disambiguation research. In the comparison experiment, the frame disambiguation accuracy of the proposed two-stage model was compared with that of four other two-stage models. The comparison results show that the proposed SVM and CRF two-stage model, which uses the SVM classification labels as features, achieves the best frame disambiguation accuracy, reaching 82.71%, which is 2.68% higher than that of the CRF double model. This shows that the proposed two-stage model establishes links between the features and thereby improves the accuracy of English frame disambiguation. (2) In this paper, LDA topic features are introduced on the basis of the SVM and CRF two-stage model. The method uses a background corpus to extract LDA topic features and discretizes them. The processed LDA topic features are then taken as one of the semantic features of the text, and supervised frame disambiguation is carried out in combination with other features such as part-of-speech features and syntactic features. In this paper, 70 full-text annotated corpora from FrameNet and 124 papers from the ACL conference were selected as the experimental corpus, and some of them were selected as the background corpus for extracting LDA topic features. The effect of the number of topics on the performance of the frame disambiguation model was verified through experiments. Meanwhile, in the comparison experiment, the frame disambiguation model with LDA topic features was more accurate than the model with basic features only, reaching an accuracy of 86.54%. The experimental results show that introducing LDA topic features can capture the implicit features of the text and improve the performance of the frame disambiguation model. In this paper, frame disambiguation based on the SVM and CRF two-stage model is studied. The
proposed method provides a new solution for the English frame disambiguation task. |
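
The second-stage feature template and the discretized LDA topic features described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`discretize_topics`, `token_features`, `sentence_features`), the equal-width binning with four bins, the `"O"` placeholder label for non-target words, and the example sentence with the frame `Commerce_buy` are all assumptions chosen for illustration. The sketch only builds per-token feature dictionaries of the kind a CRF toolkit might consume; it does not train an SVM or a CRF.

```python
from typing import Dict, List, Sequence


def discretize_topics(topic_probs: Sequence[float], n_bins: int = 4) -> List[str]:
    """Turn document-level topic probabilities in [0, 1] into discrete features.

    Equal-width binning is an assumption; the source only states that the
    LDA topic features are discretized.
    """
    features = []
    for topic_id, prob in enumerate(topic_probs):
        # Clamp so that a probability of exactly 1.0 lands in the last bin.
        bin_id = min(int(prob * n_bins), n_bins - 1)
        features.append(f"topic{topic_id}_bin{bin_id}")
    return features


def token_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    stage1_labels: Sequence[str],
    topic_feats: Sequence[str],
    i: int,
) -> Dict[str, object]:
    """Build the feature dictionary for token i.

    stage1_labels holds the first-stage (SVM) output, hypothetically "O" for
    non-target words and a frame name for the target word.
    """
    feats: Dict[str, object] = {
        "word": tokens[i].lower(),
        "pos": pos_tags[i],
        "stage1": stage1_labels[i],  # first-stage label used as a feature
    }
    if i > 0:
        feats["prev_word"] = tokens[i - 1].lower()
        feats["prev_stage1"] = stage1_labels[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1].lower()
        feats["next_stage1"] = stage1_labels[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    # Document-level topic features are shared by every token.
    for name in topic_feats:
        feats[name] = True
    return feats


def sentence_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    stage1_labels: Sequence[str],
    topic_probs: Sequence[float],
) -> List[Dict[str, object]]:
    """Feature sequence for one sentence, ready for a sequence labeler."""
    topic_feats = discretize_topics(topic_probs)
    return [
        token_features(tokens, pos_tags, stage1_labels, topic_feats, i)
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    # Illustrative input only; the labels and topic probabilities are made up.
    feats = sentence_features(
        ["She", "bought", "a", "car"],
        ["PRP", "VBD", "DT", "NN"],
        ["O", "Commerce_buy", "O", "O"],
        [0.7, 0.3],
    )
    print(feats[1])
```

Because neighbouring tokens see each other's first-stage labels (`prev_stage1`, `next_stage1`), a sequence model trained on these features can relate the target word's rough classification to its context, which is the kind of link between features the two-stage design aims for.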