Analysis of the Discrimination Effect and Influencing Factors of Bayes Two-Category Linear Discriminant Analysis

Posted on: 2015-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: S J Xiao
GTID: 2180330434456109
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Bayes linear discriminant analysis is one of the most classic discriminant models. It is suited to multivariate normal data, and its performance can be influenced by many factors. The present study focused on how to choose the best discriminant function for a given research objective and content, taking the data distribution into consideration.

Based on check-up data of a middle-aged population from a hospital, a Monte Carlo simulation was performed. The cross-validation misclassification probability was systematically simulated for every combination of 2 methods of determining priors (equal and proportional), 6 population prevalence levels (0.04, 0.1, 0.2, 0.3, 0.4, 0.5), 5 learning sample sizes (50, 100, 200, 500, 1000) and different variable correlations (independent, medium correlated, highly correlated and full model). Univariate analysis, factorial-design analysis of variance and linear regression were performed. The simulation results showed that the method of determining priors and the population prevalence had an obvious influence on the misclassification rate: using proportional priors and a lower population prevalence brought a lower misclassification rate. The effects of learning sample size and variable correlation were likewise distinct.

Based on the Monte Carlo simulation and its conclusions, real data were used for verification. The real-data verification comprised two parts. In the first part, the procedure of the Monte Carlo simulation was followed exactly: samples were drawn from the real data, and the relevant variables were selected to compare the misclassification rates of 4 models under different settings of sample size and priors. The second part focused on 3 practical diseases. Human indicators related to each disease were used as independent variables to build the discriminant model. The number of independent variables was 4, and the sample sizes were 50, 100, 200, 500 and 1000. Both the equal and the proportional method were used to determine priors.

The verification results were as follows. Part one agreed exactly with expectation: the minimum misclassification rate was observed when the sample size was 200 and proportional priors were used. The results of part two were basically consistent with expectation. The trend of the misclassification rate was satisfactory when proportional priors were used, but not when equal priors were used. Models 1 and 2 showed a good discriminant effect at a sample size of 200, while model 3 required a sample size larger than 200, although the further decrease in misclassification rate was small.
Keywords/Search Tags: Bayes linear discriminant analysis, Monte Carlo, Data simulation, Influencing factors, Misclassification rate
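
The simulation design summarized in the abstract (two ways of setting priors crossed with prevalence and learning sample size) can be illustrated with a minimal sketch. This is not the thesis code. It assumes two independent unit-variance normal predictors, a hypothetical mean shift of 1.0 between classes, and fixed class counts in each sample. It also estimates the misclassification rate on an independent test sample rather than by the cross-validation the abstract mentions. All function names (`draw`, `fit_lda`, `error_rate`, `run`) are illustrative.

```python
import math
import random


def draw(n, prevalence, rng, shift=1.0):
    """Draw n points (x, y); class 1 has both feature means moved by `shift` (assumed)."""
    n1 = max(2, round(n * prevalence))  # keep at least 2 cases so covariance is estimable
    labels = [1] * n1 + [0] * (n - n1)
    return [([rng.gauss(shift * y, 1.0), rng.gauss(shift * y, 1.0)], y) for y in labels]


def fit_lda(train, prior1):
    """Two-class Bayes LDA with pooled covariance; w.x + c > 0 predicts class 1."""
    groups = {k: [x for x, y in train if y == k] for k in (0, 1)}
    means = {k: [sum(col) / len(xs) for col in zip(*xs)] for k, xs in groups.items()}
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class scatter (2 x 2)
    for k, xs in groups.items():
        for x in xs:
            d = [x[0] - means[k][0], x[1] - means[k][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(train) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][i] - means[0][i] for i in range(2)]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    mid = [(means[1][i] + means[0][i]) / 2 for i in range(2)]
    # The log prior odds move the decision threshold toward the rarer class.
    c = -(w[0] * mid[0] + w[1] * mid[1]) + math.log(prior1 / (1 - prior1))
    return w, c


def error_rate(model, test):
    """Fraction of test points whose predicted class differs from the true class."""
    w, c = model
    wrong = sum((w[0] * x[0] + w[1] * x[1] + c > 0) != (y == 1) for x, y in test)
    return wrong / len(test)


def run(prevalences=(0.04, 0.1, 0.2, 0.3, 0.4, 0.5),
        sizes=(50, 100, 200, 500, 1000), reps=10, n_test=1000, seed=0):
    """Average test error for equal and proportional priors in every design cell."""
    rng = random.Random(seed)
    results = {}
    for prev in prevalences:
        for n in sizes:
            errs = {"equal": 0.0, "proportional": 0.0}
            for _ in range(reps):
                train = draw(n, prev, rng)
                test = draw(n_test, prev, rng)
                p1 = sum(y for _, y in train) / n  # sample proportion of class 1
                errs["equal"] += error_rate(fit_lda(train, 0.5), test)
                errs["proportional"] += error_rate(fit_lda(train, p1), test)
            results[(prev, n)] = {k: v / reps for k, v in errs.items()}
    return results


if __name__ == "__main__":
    for (prev, n), errs in sorted(run().items()):
        print(prev, n, round(errs["equal"], 3), round(errs["proportional"], 3))
```

Under these assumptions the two priors give identical classifiers at a prevalence of 0.5, since the log prior odds vanish, and they diverge as the prevalence falls. Correlated predictors or a cross-validated error estimate would need a different data generator and evaluation loop.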