With the development of digital informatization, medical data has grown exponentially, making the medical industry one of the most data-intensive industries. According to a forecast by IDC Digital, the medical industry’s data volume will reach 40 trillion GB in 2020, 30 times the amount of medical data in 2010. These data have enormous potential value. In China, with its huge population, “seeing a doctor is difficult and expensive” has become a major problem, mainly because medical resources are scarce, their distribution is seriously uneven, and the efficiency of medical treatment is low, so the growing needs of the people cannot be met. On the supply side, the shortage of high-quality medical resources is even more pronounced. In this context, a recommendation and prediction system becomes one of the solutions to this problem. Recommendation systems are already used extensively in various industries and have achieved good results. However, as data volumes grow and data relationships become richer, recommendation systems also face problems such as solvability, scalability, and cold start. In view of these problems and the characteristics of medical and health data, this paper proposes a hybrid recommendation algorithm based on big data and builds a disease prediction system based on this algorithm. The main research contents of this paper are as follows:

(1) Acquisition of health data sets. Because of the particular nature of medical data, no publicly available standard data set exists, and the lack of clear evaluations of diagnostic results or symptoms by patients makes the data sparse, which leads to the cold start problem. Generally speaking, the larger the data size, the sparser the data. Therefore, high-quality data integration is an indispensable prerequisite. In this paper, the required medical and health data are obtained from the business system through data collection, and these data are cleaned according to rules and
standards. Finally, a data quality control algorithm is used to evaluate the quality of the data, so as to obtain high-quality medical and health data sets.

(2) Research on a hybrid collaborative filtering recommendation algorithm based on big data. At present, most research on cold start in medical recommendation approaches the problem from the perspective of users and solves it through classification, user feedback, or expert labeling. Although these measures help to capture the interests of users, they all need to be built in advance, which not only requires a lot of effort but also limits the scalability of the algorithm. This paper starts from content instead: it establishes an association between users and diseases through content information (keywords), and uses large-scale data to obtain the preferences of the general public in place of the preferences of individual users. On this basis, this paper combines collaborative filtering and content-based recommendation in the context of big data, and proposes a hybrid collaborative filtering recommendation algorithm based on big data. The analysis demonstrates that the algorithm has good solvability, and the experimental results show that the algorithm can not only make recommendations for new users but also performs better than traditional algorithms.

(3) Construction of the disease prediction system. Based on the above research, this paper builds a disease prediction system, applies the hybrid recommendation algorithm based on big data in practice, and tests the prediction system to confirm that it meets the needs of daily use.
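The idea described in (2) can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in this paper: the toy records, the function names, the set-based cosine similarity, and the weighting parameter `alpha` are all assumptions made for illustration. Keywords link a user to diseases through population-wide co-occurrence counts (the content-based part), a simple user-based collaborative filtering score is computed from disease histories, and a new user with no history falls back to the content-based score alone.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy records: (user id, reported keywords, diagnosed disease).
RECORDS = [
    ("u1", {"cough", "fever"}, "influenza"),
    ("u2", {"cough", "fever", "fatigue"}, "influenza"),
    ("u3", {"thirst", "fatigue"}, "diabetes"),
    ("u4", {"cough", "wheezing"}, "asthma"),
]


def keyword_disease_counts(records):
    """Count how often each keyword co-occurs with each disease (public preference)."""
    counts = defaultdict(Counter)
    for _user, keywords, disease in records:
        for kw in keywords:
            counts[kw][disease] += 1
    return counts


def content_scores(keywords, counts):
    """Content-based score: share of keyword-disease co-occurrences per disease."""
    totals = Counter()
    for kw in keywords:
        totals.update(counts.get(kw, {}))
    n = sum(totals.values())
    return {d: c / n for d, c in totals.items()} if n else {}


def cosine(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def collaborative_scores(user_id, history):
    """User-based CF: weight other users' diseases by similarity of disease histories."""
    own = history.get(user_id, set())
    scores = Counter()
    for other, diseases in history.items():
        if other == user_id:
            continue
        sim = cosine(own, diseases)
        for d in diseases:
            scores[d] += sim
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()} if total else {}


def hybrid_scores(user_id, keywords, records, alpha=0.5):
    """Blend CF and content scores; fall back to content only for new users."""
    counts = keyword_disease_counts(records)
    history = defaultdict(set)
    for uid, _kws, disease in records:
        history[uid].add(disease)
    content = content_scores(keywords, counts)
    collab = collaborative_scores(user_id, history)
    if not collab:  # cold start: no usable history for this user
        return content
    diseases = set(content) | set(collab)
    return {
        d: alpha * collab.get(d, 0.0) + (1 - alpha) * content.get(d, 0.0)
        for d in diseases
    }


if __name__ == "__main__":
    # A user absent from RECORDS is scored from keywords alone.
    print(hybrid_scores("new_user", {"cough", "fever"}, RECORDS))
```

In this sketch the fallback is what lets a new user receive a recommendation without any pre-built user profile; how the two scores are actually combined in the proposed algorithm is not specified in this abstract, so the linear blend is only one possible choice.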