| In the era of big data, data has become an important factor of production and has been integrated into all walks of life. Industries of every kind have begun their data-driven transformation, and enterprises and society need data talent more urgently than ever. Countless college graduates and other job seekers are flocking to the data analysis industry to catch the wave. However, most of them lack the skills required of data analysts and know little about the position. As a result, although the number of applicants for data analysis positions grows day by day, companies still struggle to recruit suitable candidates; talent and job requirements do not match, and a large talent gap remains on the demand side. The purpose of this paper is to analyze the specific requirements of data analysis positions and the factors that influence their salaries, so as to help job seekers interested in such positions understand the requirements, resolve their doubts, and choose a suitable career path. This paper uses web crawler technology to collect recruitment information for "data analysis" positions from the major online recruitment platform "Worry-free", and uses text mining and an econometric model to analyze the current demand for data analysis talent and the factors that affect the salaries of data analysis positions. In the job demand analysis, this paper first explores the state of demand for "data analysis" positions through descriptive analysis and word frequency analysis; it then uses the LDA topic model to extract the topics and corresponding keywords of the job postings and explores the characteristics of job requirements and the benefits offered. In the study of salary influencing factors, this paper selects keywords from the topic model results as seed words, uses word2vec to expand the seed words, extracts indicators from unstructured
text, and constructs an index system of salary influencing factors together with structured indicators. Stepwise regression and Lasso regression are used for feature screening, and the screened indicators are used to build an interval regression model that explores the factors influencing salary, the salary premium associated with each factor, and the mechanism through which each variable affects salary. Unlike previous studies, which did not fully consider the factors influencing salary, this paper uses word2vec to compute semantic similarity, expand the seed words selected from the topic model keywords, and extract software skills, academic majors, and other related indicators from unstructured text. The factors considered are therefore more comprehensive, which also reflects the advantages of data fusion. Studies that analyze recruitment demand data obtained from the Internet often use methods with weak interpretability, such as clustering and association rules, so their results are hard to interpret and cannot support quantitative analysis. This paper applies an econometric model to the data crawled from the web, which yields results that are more interpretable and can be interpreted in greater depth. When the explained variable takes the form of an interval, most studies take the midpoint of the interval to represent it, but this discards useful information in the original data and makes the model results less accurate. In contrast, this paper retains the upper and lower limits of the salary range and adopts an interval regression model, which better preserves the original information in the data and yields more reliable results. This paper draws the following conclusions: the more developed
the city, the greater the demand for data analysis positions; demand is mainly concentrated in two industries, Internet/e-commerce and computer; a bachelor’s degree is sufficient for most data analysis positions; SQL, Python, and BI tools are the three most popular software tools for data analysis; statistics, computer science, and mathematics are the most sought-after majors in the data analysis industry; in terms of soft skills, companies expect data analysis talent to have business understanding, market insight, communication skills, logical thinking skills, and learning ability; in terms of hard skills, enterprises hope that data analysis talent can master machine learning and deep learning algorithms and implement them in programming languages. Education, major, work experience, hard skills, and the city and industry in which the company is located all affect the salary of a position. Education and work experience bring the largest salary premium for data analysis positions, followed by the city of work and the candidate's hard skills, and finally major and industry. |
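
The interval regression idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's specification: the single covariate (years of experience), the toy salary ranges, and the coarse grid search (used here in place of a proper numerical optimizer) are all assumptions made for illustration. The point it shows is that each posting contributes the probability that the latent salary falls between its lower and upper limits, so both limits are used rather than only the midpoint.

```python
import math
from itertools import product


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_loglik(b0, b1, sigma, xs, lowers, uppers):
    """Log-likelihood of the latent model y* = b0 + b1*x + e, e ~ N(0, sigma^2),
    when only the interval [lower, upper] containing y* is observed."""
    total = 0.0
    for x, lo, hi in zip(xs, lowers, uppers):
        mu = b0 + b1 * x
        p = norm_cdf((hi - mu) / sigma) - norm_cdf((lo - mu) / sigma)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


# Hypothetical toy data (not from the paper): years of experience and
# advertised monthly salary ranges in thousands.
xs = [0, 1, 2, 3, 4, 5, 8]
lowers = [4, 6, 8, 10, 8, 15, 20]
uppers = [6, 8, 12, 15, 10, 20, 30]

# Coarse grid search, an assumption chosen to keep the sketch dependency-free.
b0_grid = [0.5 * i for i in range(41)]        # intercept: 0.0 .. 20.0
b1_grid = [0.25 * i for i in range(21)]       # slope:     0.0 .. 5.0
sigma_grid = [0.5 * i for i in range(1, 11)]  # sigma:     0.5 .. 5.0

best_b0, best_b1, best_sigma = max(
    product(b0_grid, b1_grid, sigma_grid),
    key=lambda p: interval_loglik(p[0], p[1], p[2], xs, lowers, uppers),
)
best_ll = interval_loglik(best_b0, best_b1, best_sigma, xs, lowers, uppers)

print("intercept:", best_b0, "slope:", best_b1, "sigma:", best_sigma)
print("log-likelihood:", round(best_ll, 3))
```

In practice the grid search would be replaced by a maximum-likelihood routine, and the covariates would be the screened indicators (education, city, skills, and so on) rather than a single variable.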