Prediction Of Depression In The Elderly Based On Machine Learning

Posted on: 2024-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2544306923473094
Subject: Applied statistics
Abstract/Summary:
Population aging is currently a serious problem, and with its huge elderly population, China's aging industry faces unprecedented development opportunities and challenges. With policy and economic support at the national level, the next ten years will be an important opportunity and strategic window for vigorous development of the aging industry, especially in mental health. Most previous studies analyze the factors influencing depression in the elderly in terms of social structure, while the prediction of depression in the elderly based on Machine Learning can still be further refined. To address this, this thesis conducts experiments on geriatric data and studies the prediction of depression in the elderly based on Machine Learning. On the one hand, the work can provide ideas for the industry around psychological protection of the elderly, which has economic value; on the other hand, it can allow time for timely identification of and intervention in depression in the elderly, thus reducing the cost of social care and providing a reference for efficient early warning of diseases in public health management. The main research work of this thesis is as follows.

1. Random Forest is used to impute the missing data and make the data more regular. Feature engineering is then carried out to construct effective population characteristics by referring to related information, and feature screening is carried out by combining Lasso regression and Stepwise regression.

2. Since disease data are often imbalanced, which can degrade a model's predictive performance, two different approaches are used to deal with the imbalance, one at the data level and one at the algorithm level. At the data level, the Synthetic Minority Over-sampling Technique (SMOTE) is combined with Random Under-sampling (RUS) into a hybrid sampling approach called SMOTERUS, and cross-validation is used to find the optimal combination ratio. Two single Machine Learning models and two Ensemble Learning models are then built on the resampled data. At the algorithm level, the Mean-uncertain LR model is used, which introduces the sublinear expectation into logistic regression without changing the distribution of the original data. The empirical results show that the Mean-uncertain LR model predicts much better than the logistic regression model based on SMOTERUS, and that the Ensemble Learning models also show better overall predictive performance.

3. Based on the work above, an improved two-layer model is constructed. Two Ensemble Learning models are used separately, and the results of their leaf nodes are fed to the Mean-uncertain LR model as new features for classification, in order to improve predictive ability. The empirical results show that the two-layer model combining Gradient Boosting Decision Tree (GBDT) with Mean-uncertain LR gives the best predictions on the dataset used in this thesis. Its weighted average Precision and weighted average F1-score improve on those of the original models, and its weighted average Recall is also relatively high. The two-layer model thus effectively improves performance and generalization ability, and gives better results in the prediction of depression in the elderly.
Keywords/Search Tags:Depression in the elderly, Disease prediction, Ensemble learning, Sublinear expectation
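
The hybrid sampling step described in the abstract (SMOTE over-sampling followed by random under-sampling) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `smote` and `smote_rus`, the parameters `over_ratio` and `under_ratio`, the 0/1 label coding and the toy data are all assumptions made for illustration, and the ratios are fixed by hand here, whereas the thesis selects the combination ratio by cross-validation.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def smote_rus(X, y, over_ratio=0.5, under_ratio=1.0, k=5, seed=0):
    """Hypothetical SMOTE + RUS hybrid for binary labels (1 = minority).

    over_ratio:  minority/majority ratio to reach through SMOTE.
    under_ratio: minority/majority ratio to reach through under-sampling.
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]

    # Step 1: over-sample the minority class with synthetic points.
    n_new = max(0, round(over_ratio * len(majority)) - len(minority))
    minority_aug = minority + smote(minority, n_new, k, rng)

    # Step 2: randomly drop majority samples (without replacement).
    n_keep = min(len(majority), round(len(minority_aug) / under_ratio))
    majority_kept = rng.sample(majority, n_keep)

    X_res = minority_aug + majority_kept
    y_res = [1] * len(minority_aug) + [0] * len(majority_kept)
    return X_res, y_res


# Toy data (an assumption, not the thesis dataset): 90 majority, 10 minority.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(90)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(10)]
y = [0] * 90 + [1] * 10

X_res, y_res = smote_rus(X, y, over_ratio=0.5, under_ratio=1.0)
print("minority:", sum(y_res), "majority:", len(y_res) - sum(y_res))
```

In a setting like the one the abstract describes, resampling would be applied to the training folds only, so that the validation data keep the original class distribution when candidate ratios are compared.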