Font Size: a A A

Research On N-gram Feature Selection For Structured Prediction

Posted on:2019-03-10Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y L RenFull Text:PDF
GTID:1368330548955289Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
Structured prediction,also called structured learning,aims to learn a complex structure from data.It is a research hotspot in the fields of natural language processing,data mining and machine learning.N-gram features largely reflect the “structure” information in structured prediction,and comprise a family of special and important features used in structured prediction.However,most of studies on n-gram feature selection treated them as ordinary features,without considering the nature of the n-gram features,and their essence is still ordinary feature selection.A few studies considered the nature of the n-gram features,but not enough;moreover,they focused on one or several types of features in certain tasks,and thus,their methods were not general.Therefore,there is very little research on n-gram feature selection for structured prediction and it can be said that it is almost a blank space.This thesis treats the n-gram feature selection for structured prediction as a new research topic,and aims to propose a general framework for automatically selecting n-gram features in structured prediction.The thesis presents a prototype framework in the introduction,which determines the feature selection type(wrapper-based),the basic unit(a feature template rather than a feature function),the search strategy(heuristic method),and the search order(bottom-up),and then discusses the possible problems caused by the prototype framework,including efficiency,robustness,and overfitting.Then,the thesis gives the solutions to each problem.The main contributions of the thesis are as follows.1)We define the n-gram feature templates used in structured prediction,studies their property systematically,and then discusses and empirically verifies the rough significance distribution for the n-gram features.2)We propose a single n-gram feature selection method(SNFS).It consists of three sub-methods,i.e.,n-gram ranking,horizontal search,and feature template pair combination.Among them,the most critical is the feature template pair combination algorithm,and its core idea involves using the rough significance distribution for the n-gram features to estimate the two most likely candidates in the subsequent step.We compare these two candidates and their union,and then we can estimate the detailed significance distribution in a local space according to the result of the comparison.Finally,we prune the search space efficiently according to the detailed significance distribution in the local space.3)We propose a multiple n-gram feature selection method(MNFS).The SNFS method processes one type of n-gram feature at a time.When k types of n-gram features need to be selected in a task,it must be run k times,and the union of the selected feature templates is the goal feature template set.However,this independent selection method does not take into account the interdependencies among multiple types of n-gram features.Thus,it tends to select extra(redundant)feature templates.The MNFS method solved this problem.We compared the MNFS method with other classic wrapper-based methods.The experimental results showed that the MNFS method achieved feature selection performance that was close to those of other classic wrapper-based methods.Moreover,the MNFS method was extremely efficient,the most resistant to overfitting,and the most robust.4)We propose a general method to speed up the wrapper-based feature selection process.The core idea involves “relaxing” the parameters closely associated with training time,and using a similarity metric TopMatches to balance the accuracy of the model and feature selection performance.A coordinate descent algorithm is used to search for a set of balanced parameters.5)We propose a path-constrained Viterbi algorithm to substitute the use of the timeconsuming state-transition features used in structured prediction,thereby raising efficiency of the feature selection.
Keywords/Search Tags:Structured prediction, Feature template, N-gram feature, Feature selection, Natural language processing
PDF Full Text Request
Related items