Construction Of Prediction Models Of Dental Caries In The Elderly Based On Artificial Neural Network Technology And Validation Of The Generalization Ability Of Methodology Comparisons

Posted on: 2022-07-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Liu
GTID: 1484306560498824
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: Dental caries is one of the most common chronic diseases in elderly patients. According to the latest national oral health epidemiological sampling survey in China, the prevalence of dental caries in the elderly is 98.0%. Developing preventive strategies for dental caries in elderly individuals is therefore vital, which requires identifying the related risk factors, building an effective prediction model, and verifying its generalization ability in an external population. The objective of the present study was to screen for the risk factors affecting the occurrence and development of caries in the elderly using the urban and rural elderly caries database of Liaoning Province, to construct a generalized regression neural network (GRNN) prediction model and a back propagation neural network (BPNN) prediction model for assessing the risk of dental caries among geriatric residents, and to validate the predictive performance of the new models.

Methods: 1) A stratified equal-size random sampling method was used to select 1144 elderly (65-74 years) residents (gender ratio 1:1) of Liaoning, China. Oral assessment data, including caries characteristics, and questionnaire survey data were collected from each participant. Multivariate logistic regression analysis was then performed to identify the independent predictors. In the analysis of factors affecting the occurrence of dental caries, univariate analysis (chi-square test) was used to select the independent variables with P < 0.05, and all statistically significant independent variables were then included in the multivariate logistic regression model. Tolerance and the variance inflation factor were used to diagnose multicollinearity among the statistically significant variables screened by the multivariate logistic model. 2) The variables that were statistically significant in the chi-square test on the training set
were taken as the input, and the outcome variable was taken as the output. The GRNN early warning model was built with the Neural Network Toolbox in MATLAB 2012. SPSS 22.0 was used to draw the ROC curves of the model predictions. The statistical significance level was set at 0.05. The BP neural network was built with the RSNNS package of R. It had a single hidden layer. To find an appropriate number of hidden-layer neurons, we started with three neurons and added one at a time up to 20. The sigmoid function was used as the activation function of the hidden and output layers. The learning rate was set to 0.01 and the maximum number of iterations to 1000. Training stopped when the MSE on the validation set reached its minimum. The standard error back propagation algorithm was used to train the model. The accuracy, sensitivity, and specificity of the GRNN early warning model, the BPNN model, and the unconditional logistic regression prediction model were compared, and the area under the ROC curve was analyzed. 3) Using an independent data set different from that of the previous part, the established prediction models were used to predict the risk of caries in the elderly in the oral health sample survey databases of Liaoning, Jilin, and Heilongjiang Provinces in Northeast China (data from the Stomatological Hospital Affiliated to China Medical University, the Stomatological Hospital of Jilin University, and Heilongjiang Stomatological Hospital), which include the oral health examination data and oral questionnaire of each elderly subject. According to the survey data, 1236 cases were included in the final results. For risk
prediction, the classification consistency rate, sensitivity, and specificity of the GRNN model and the BP neural network model were compared with the predictions of the unconditional logistic regression model to verify the predictive ability of the models. The area under the ROC curve was analyzed.

Results: 1) A total of 1144 participants fulfilled the requirements and completed the questionnaires. The caries rate was 68.5%, and the main associated factors were toothache history, residence area, smoking, and drinking. In elderly individuals, a history of toothache in previous years (yes vs. no, OR = 1.550, 95% CI: 1.164-2.063), use of an upper jaw dental prosthesis (yes vs. no, OR = 4.320, 95% CI: 2.647-7.051), use of a lower jaw dental prosthesis (yes vs. no, OR = 4.420, 95% CI: 2.477-7.885), smoking (yes vs. no, OR = 1.469, 95% CI: 1.084-1.992), and drinking alcohol (yes vs. no, OR = 1.591, 95% CI: 1.130-2.240) were predictors of dental caries. In contrast, living in a rural area (rural vs. urban, OR = 0.676, 95% CI: 0.503-0.908) and a good self-evaluation of oral hygiene (good vs. not good, OR = 0.606, 95% CI: 0.423-0.868) were protective factors against dental caries. 2) We randomly divided the data for the 1144 participants into a training set (915 cases) and a test set (229 cases). The optimal smoothing factor of the GRNN model was 0.7. The optimal cut-off value for the predicted probability of the logistic regression model was 0.606, with a corresponding Youden's index of 0.370. The optimal cut-off value for the GRNN model's predicted probability was 0.680, with a corresponding Youden's index of 0.638. The areas under the ROC curves for the logistic regression and GRNN models were 0.578 and 0.777, respectively, with corresponding P values of 0.056 and < 0.001 compared with the baseline. The P value for the comparison of the areas under the ROC curves of the two models was < 0.001. The establishment of the BP neural network
was completed with the RSNNS package of R. Fifteen variables that were statistically significant in the univariate chi-square test were selected as the input of the BP neural network, giving fifteen input neurons. The outcome variable was used as the output, giving one output neuron. The MSE on the validation set reached its minimum when the hidden layer had 14 neurons, so the number of hidden-layer neurons in this study was set to 14. The optimal cut-off value of the BP neural network model was 0.703, with a corresponding Youden's index of 0.591. At the optimal cut-off values, the areas under the ROC curves of the logistic regression model and the BP neural network model were 0.578 and 0.721, respectively. Compared with the baseline, the corresponding P values were 0.056 and < 0.001. The difference between the areas under the ROC curves of the two models was statistically significant (P = 0.012). The GRNN model and the BP neural network model were more accurate than the unconditional logistic regression model. 3) When the established GRNN and BP neural network models were applied to external validation of risk prediction in the three provinces of Northeast China, the classification consistency rate, sensitivity, and specificity of both artificial neural network models were higher than those of the unconditional logistic regression model. The area under the ROC curve of the unconditional multivariate logistic regression model in Jilin Province was 0.608 (95% CI: 0.544-0.673; P = 0.001). The area under the ROC curve of the BP neural network model was 0.734 (95% CI: 0.675-0.793; P < 0.001). The area under the ROC curve of the GRNN model was 0.776 (95% CI: 0.719-0.832; P < 0.001). The area under the ROC curve of the unconditional
multivariate logistic regression model in Liaoning Province was 0.672 (95% CI: 0.612-0.731; P < 0.001). The area under the ROC curve of the BP neural network model was 0.816 (95% CI: 0.767-0.864; P < 0.001). The area under the ROC curve of the GRNN model was 0.855 (95% CI: 0.809-0.900; P < 0.001). The area under the ROC curve of the unconditional multivariate logistic regression model in Heilongjiang Province was 0.665 (95% CI: 0.607-0.722; P < 0.001). The area under the ROC curve of the BP neural network model was 0.782 (95% CI: 0.731-0.832; P < 0.001). The area under the ROC curve of the GRNN model was 0.817 (95% CI: 0.769-0.864; P < 0.001). In terms of consistency, sensitivity, and specificity, the GRNN model and the BPNN model outperformed the traditional unconditional multivariate logistic regression model.

Conclusions: Geriatric (65-74 years) residents of Liaoning have a high rate of dental caries. Residents who have a history of toothache, use an upper or lower jaw dental prosthesis, drink alcohol, rate their own oral hygiene as poor, live in the city, or smoke are more susceptible to the disease. The GRNN and BPNN early warning models are accurate and meaningful tools for screening, early diagnosis, and treatment planning for geriatric individuals at high risk of caries.
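
Illustrative sketch (not the dissertation's code): the abstract reports a GRNN smoothing factor of 0.7 and cut-off values chosen by Youden's index, but does not show how either is computed. The Python below is a minimal sketch of both ideas under stated assumptions. The study itself used MATLAB, R, and SPSS; here the smoothing factor is treated as the width of a Gaussian kernel, which may not match the toolbox's exact scaling. The function names `grnn_predict` and `youden_cutoff` and the toy data are hypothetical and do not come from the study.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.7):
    """Textbook GRNN output: Gaussian-kernel-weighted mean of training outcomes.

    sigma plays the role of the smoothing factor (assumed to be the kernel width).
    """
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed: fall back to the overall outcome rate.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its score is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical toy data: two binary predictors per subject, binary caries outcome.
TRAIN_X = [(0, 0), (0, 1), (1, 0), (1, 1)]
TRAIN_Y = [0, 0, 1, 1]

if __name__ == "__main__":
    risk = grnn_predict(TRAIN_X, TRAIN_Y, (1, 1))
    print(f"predicted risk for (1, 1): {risk:.3f}")
    print(youden_cutoff([0.2, 0.4, 0.6, 0.8], [0, 0, 1, 1]))
```

A smaller smoothing factor makes each prediction depend more on the nearest training cases, while a larger one pulls predictions toward the overall caries rate. In this sketch the cut-off search simply tries every distinct predicted score.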
Keywords/Search Tags: cross-sectional studies, back propagation neural network, dental caries, generalized regression neural network, generalization ability, geriatrics, logistic models, oral health