Chinese Resume Information Extraction And Requirement Matching Algorithm

Posted on:2022-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:C Z Wang

Full Text:PDF

GTID:2518306476990759

Subject:Communication and Information System

Abstract/Summary:

With the rapid development of big data, enterprise recruitment has in recent years shifted from the traditional offline mode to an online mode. Online recruitment has become the mainstream because of its low cost, ease of operation, and the ability to submit resumes without leaving home. In addition, under the impact of this year's epidemic, many domestic enterprises have seen their earnings decline and have reduced hiring accordingly, while the number of college graduates rises every year, making this year's employment situation more severe. Enterprises are also receiving far more resumes than in previous years, which poses a greater challenge for matching and screening online resumes. To address the inability of recruitment websites to parse resume information automatically and match it intelligently, this thesis proposes an algorithm for automatically parsing Chinese resumes, together with an algorithm for matching and screening resumes against recruitment requirements. Personalized matching is combined with enterprise recruitment scenarios: based on automatic information extraction, a personalized matching algorithm, and an evaluation algorithm, resumes are matched automatically against an enterprise's recruitment needs. A secondary screening can then be carried out according to the employer's individual needs to achieve a precise match between enterprise and job seeker. The main research contents and innovations of this thesis are as follows:

(1) Text information is extracted according to the hierarchical structure of Chinese resumes. First, resumes in different file formats are converted to a uniform TXT format. Second, because resumes and recruitment requirements have relatively uniform formats, attribute keywords identified in a preliminary study are used to divide the text into blocks. Text blocks fall into two types: those that contain attribute keywords and those that do not. Blocks containing attribute keywords are extracted in turn according to those keywords. Blocks without attribute keywords are extracted with custom, experience-based rules derived from a study of resume data.

(2) Information in the recruitment text and the resume is matched according to the semi-structured nature of resumes. Resume information is divided into structured and unstructured information, and each is matched separately following a "divide and conquer" approach. Within structured information, different algorithms are used for different types of text: discrete numerical matching for numerical text, an ontology-based domain knowledge algorithm for domain knowledge text, and a character-based text similarity algorithm for job titles. The matching degree of the structured text is the weighted sum of the attribute scores, with weights reflecting the enterprise's preference for different attributes of applicants' resumes.

(3) Keywords are extracted from the resume text using the pre-trained model ELMo and the sentence vector SIF, enabling keyword retrieval and matching. Keywords are extracted from work experience, self-evaluation, and other content. After semantically rich industry keywords have been extracted, a keyword set is generated for the resume, which is then matched against the recruitment keywords provided by the enterprise.

(4) To address the simplistic, template-based information matching in current online recruitment, the TOPSIS algorithm is used for a secondary screening of resumes. This enriches the dimensions on which resumes are selected: more suitable candidates are selected according to a company's personalized recruitment needs, such as educational background, work experience, school honors, and other personalized content.

Compared with traditional 0-1 rule matching, the structured information matching algorithm proposed in this thesis improves the F1 value by 3%. For unstructured text, a Doc2vec vector model is adopted; compared with the traditional text similarity algorithm, the F1 value increases by more than 10%. Compared with mainstream algorithms, the keyword extraction algorithm extracts a more complete set of keywords. The secondary screening of resumes makes the information matching algorithm more practical.
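The keyword-based block division in contribution (1) could look roughly like the sketch below. The abstract does not give the thesis's keyword list or rules, so the list `ATTRIBUTE_KEYWORDS`, the function `split_blocks`, and the `_unlabelled` key are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical attribute keywords; the thesis's actual list is not given.
ATTRIBUTE_KEYWORDS = ["姓名", "学历", "工作经历", "自我评价"]


def split_blocks(text, keywords=ATTRIBUTE_KEYWORDS):
    """Divide resume text into blocks keyed by attribute keyword.

    Text that appears before the first keyword has no attribute keyword
    and is stored under "_unlabelled" so that custom rules can handle it.
    """
    # Match any keyword, optionally followed by an ASCII or full-width colon.
    pattern = re.compile("(" + "|".join(map(re.escape, keywords)) + ")[:：]?")
    matches = list(pattern.finditer(text))

    blocks = {}
    head = text[: matches[0].start()] if matches else text
    if head.strip():
        blocks["_unlabelled"] = head.strip()

    # Each block runs from the end of its keyword to the start of the next one.
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        blocks[m.group(1)] = text[m.end():end].strip()
    return blocks


resume = "张三\n学历：硕士\n工作经历：某公司 软件工程师"
blocks = split_blocks(resume)
```

This sketch splits on every keyword occurrence, so a keyword that also appears inside body text would start a new block; a real system would need the position and layout rules the abstract alludes to.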
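The weighted summation of structured attributes in contribution (2) can be illustrated with a minimal sketch. The scoring rules here are assumptions, not the thesis's method: `numeric_score` is a hypothetical discrete rule, `title_score` substitutes Python's `difflib.SequenceMatcher` for the unspecified character-based similarity, and the ontology-based score is assumed to be computed elsewhere and passed in as a number.

```python
from difflib import SequenceMatcher


def numeric_score(required, actual):
    """Hypothetical rule: full credit if the requirement is met,
    proportional partial credit otherwise."""
    if required <= 0 or actual >= required:
        return 1.0
    return max(actual, 0) / required


def title_score(wanted, offered):
    """Character-level similarity in [0, 1]; works on Chinese strings."""
    return SequenceMatcher(None, wanted, offered).ratio()


def structured_match(scores, weights):
    """Weighted sum of per-attribute scores, normalised by total weight."""
    total = sum(weights[k] for k in scores)
    return sum(scores[k] * weights[k] for k in scores) / total


scores = {
    "years_experience": numeric_score(required=3, actual=2),
    "job_title": title_score("软件工程师", "高级软件工程师"),
    "domain_knowledge": 0.8,  # assumed output of an ontology-based matcher
}
# Assumed enterprise preferences for each attribute.
weights = {"years_experience": 0.3, "job_title": 0.3, "domain_knowledge": 0.4}
match_degree = structured_match(scores, weights)
```

Normalising by the total weight keeps the result in [0, 1] even when the enterprise's weights do not sum to one.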
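The secondary screening in contribution (4) relies on TOPSIS, which ranks candidates by their closeness to an ideal solution. The following is a minimal sketch of the standard TOPSIS procedure (vector normalisation, weighting, distances to the ideal and anti-ideal points); the criteria, weights, and candidate values are invented for illustration and do not come from the thesis.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per row (higher is better).

    matrix  : one row per candidate, one column per criterion
    weights : one weight per criterion
    benefit : True where larger values are better, False where smaller are
    """
    n = len(weights)
    # Vector-normalise each column (guarding against an all-zero column),
    # then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]

    # Ideal and anti-ideal points, criterion by criterion.
    ideal = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v) for j in range(n)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v) for j in range(n)]

    scores = []
    for r in v:
        d_pos = math.dist(r, ideal)   # distance to the ideal point
        d_neg = math.dist(r, worst)   # distance to the anti-ideal point
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


# Hypothetical candidates: [education level, years of experience, school honors]
candidates = [[3, 5, 2], [2, 2, 0], [4, 1, 1]]
scores = topsis(candidates, weights=[0.4, 0.4, 0.2], benefit=[True, True, True])
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

Changing the weights is how an employer's individual preferences would enter the ranking in this sketch.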

Keywords/Search Tags:

Information extraction, Information matching, Keyword extraction, Evaluation algorithm

Related items

1	User Web Information Collection And Analysis System Based On The Smart Router
2	Design And Implementation Of Web Information Extraction Rules
3	Research On Related Technologies Of Domain Information Extraction
4	The Design And Implementation Of Web Information Extraction System
5	Study On Information Extraction And Analysis Of Logistics Bills
6	Research On The Method Of Depth Information Extraction Based On Binocular Vision
7	Research On Intelligent Information Extraction Approaches For Business Value Evaluation
8	Research On Chinese Keyword Extraction Algorithm Based On News Report
9	Adaptive Web Information Extraction Method Research Based On Ontology
10	Design And Implementation Of Information Extraction System Based On Improved TF-IDF Algorithm