
Research On The Influencing Factors Of Movie Box Office Based On Classification Model

Posted on: 2022-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: M N Cai
GTID: 2515306722481894
Subject: Applied Statistics
Abstract/Summary:
As living standards rise and people pursue a higher quality of life, the film industry, one of the representative entertainment industries of the new era, has become increasingly popular. However, many films still fail badly at the box office for reasons such as poorly chosen release dates, inadequate production, and weak publicity. Such failures waste resources and cut into the profits of production companies. Identifying the factors that influence box office revenue is therefore a practical concern for major film producers and investors, and also an interesting problem for statisticians. This thesis uses statistical and machine learning methods to study the factors that influence movie box office.

The main work is as follows. Chapter 1 introduces the research background, purpose, and significance. Chapter 2 presents the statistical and machine learning methods and theory used in the analysis. Chapter 3 uses a Python web crawler to collect data from Douban, Maoyan Professional Edition, and other websites. The data set consists of 1000 films released in China and abroad from 2011 to 2020, and an index system of box office influencing factors is built from three perspectives: the film itself, production and distribution, and reviews and ratings. Chapter 4 divides box office revenue into five categories by value and then models it with logistic regression, decision tree, random forest, and XGBoost. The results show that first-week box office, the number of users who rated the film, first-day box office, average attendance per screening, and 3D/IMAX format have a significant impact on box office. Chapter 5 gives the conclusions, the shortcomings of the study, and feasible suggestions for film producers and distributors.

The innovations of this thesis are as follows. In terms of data volume and dimensions, the data set is larger and more comprehensive than those in existing research and is not limited to domestic films. New variables such as average attendance per screening, film reviews, and views of promotional material are added. The index construction is also improved: the index system is built from the perspectives of the film itself, production and distribution, and reviews and ratings, and SnowNLP is introduced to construct sentiment analysis factors. In terms of method, this thesis uses classification models to explore the influencing factors of box office, whereas earlier work has traditionally relied on regression-based factor analysis. In terms of conclusions, a comprehensive comparison of the influencing factors under the classification models shows that first-week box office, the number of users who rated the film, first-day box office, average attendance per screening, and 3D/IMAX format have a significant impact on box office.
Keywords/Search Tags: Box office, Sentiment analysis, Logistic regression, Decision tree, Random forest, XGBoost
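
The abstract says that box office revenue is divided into five categories by value before the classifiers are fitted, but it does not give the thresholds. The following is a minimal Python sketch of one plausible way to build such a target, assuming the cut points are the sample quintiles. The function name `five_class_labels` and the example figures are hypothetical and are not taken from the thesis.

```python
from bisect import bisect_right
from statistics import quantiles


def five_class_labels(box_office):
    """Assign each gross to one of five classes (0 = lowest, 4 = highest).

    Assumption: cut points are the sample quintiles. The abstract does not
    state the thresholds actually used in the thesis.
    """
    cuts = quantiles(box_office, n=5)  # four cut points give five bins
    # bisect_right counts how many cut points are <= value, i.e. the bin index.
    return [bisect_right(cuts, value) for value in box_office]


# Hypothetical grosses in arbitrary units, not data from the thesis.
grosses = [3, 12, 45, 80, 150, 320, 610, 900, 1500, 5600]
labels = five_class_labels(grosses)
print(labels)
```

Quintile cut points keep the five classes roughly balanced, which tends to suit classifiers such as logistic regression or random forest. Fixed revenue thresholds would be an equally reasonable choice if class balance is not a concern.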