Research And Implementation Of Chinese Resume Parsing System Based On Deep Neural Network

Posted on: 2019-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: W Li
GTID: 2428330590465546
Subject: Information and Communication Engineering
Abstract/Summary:
Resumes are a common and important kind of text. Parsing a resume means extracting personal information, education, work experience, and other entity information and converting it into structured form, which is essential for applications such as job search, recruitment, and association analysis. Although existing resume parsing systems introduce statistical models, they still rely mainly on hand-crafted rule templates, which makes them hard to develop and maintain, weak at generalization, and difficult to port. They struggle to cope with the current demand for processing resumes in large volumes. To address these problems, this thesis studies resume annotation based on deep learning, exploits the strong learning ability of deep neural networks, and implements a corresponding Chinese resume parsing system. The main work is as follows:

1. After a systematic analysis of current resume parsing techniques, this thesis proposes a resume annotation and parsing method based on a neural network and a probabilistic graphical model. First, the unstructured resume text is preprocessed into a word sequence. A word vector table, obtained by Word2vec training on a large-scale corpus, maps the words to low-dimensional real-valued vectors. A bidirectional LSTM layer then incorporates the context of each word to be labeled. A CRF layer introduces tag constraints to generate the optimal tag sequence, and finally tag matching is used to parse out the corresponding resume entities. The model is trained with the SGD algorithm, supplemented with Dropout to prevent overfitting. This method does not rely on hand-crafted feature templates, and the experimental results show that, compared with previous methods, its labeling F1 value increases by nearly 8%.

2. Based on an analysis of Chinese word segmentation and the characteristics of Chinese, and on the neural-network word sequence annotation experiments, a neural network annotation method that uses an attention mechanism to fuse word features is proposed. This method takes the previously proposed neural network word sequence model as its main framework, introduces an attention mechanism to dynamically regulate the fusion of word sequence features, and integrates word vector features so that all of the available feature information is mined and used to improve labeling. The experimental results show that this method further improves the effectiveness of resume annotation.

3. Based on the core model design and experimental analysis in the previous two parts, a Chinese resume parsing system is designed and implemented according to actual application requirements. The system mainly includes modules for data preprocessing, model training, inference-time labeling, and label correction, among others.
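
The tag-matching step mentioned in part 1, which turns a predicted tag sequence into resume entities, might look roughly like the sketch below. It assumes a BIO tagging scheme ("B-X" opens an entity, "I-X" continues it, "O" is outside any entity); the abstract does not state which scheme the thesis uses. The function name `extract_entities` and the tag names `NAME` and `SCHOOL` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) pairs.

    Assumption: tags follow a BIO scheme. Tokens are joined without
    spaces, which suits Chinese text.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the entity currently being built, if any, and reset state.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, "".join(current_tokens)))
        current_type = None
        current_tokens = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and discard this token.
            flush()
    flush()
    return entities


# Hypothetical example: a segmented sentence with predicted tags.
tokens = ["张三", "毕业", "于", "清华", "大学"]
tags = ["B-NAME", "O", "O", "B-SCHOOL", "I-SCHOOL"]
print(extract_entities(tokens, tags))
# [('NAME', '张三'), ('SCHOOL', '清华大学')]
```

In this sketch an `I-` tag that does not continue an open entity is simply dropped; a production system could instead treat it as the start of a new entity or pass it to a correction module.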
Keywords/Search Tags: resume parsing, sequence labeling, long short-term memory, conditional random fields, attention mechanism