A Novel Rank Based Non-parametric Method For Longitudinal Ordered Data

Posted on: 2015-03-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhuang
GTID: 1224330431970096
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Research background and objective

Longitudinal ordinal data, also known as ordered categorical longitudinal data, are a class of longitudinal data obtained by repeatedly measuring an ordered categorical outcome on each individual at different times. In other words, they are repeated-measures data that combine cross-sectional and time-series information. Such data reflect not only the differences between treatments at each time point (cross-sectional effects) but also how those differences change over time (longitudinal effects). Longitudinal ordinal data are very common in medical research, for example overall efficacy ratings (cured, markedly effective, effective, ineffective) in follow-up data.

Methods for analyzing quantitative longitudinal data are increasingly mature, but they impose fairly strict distributional requirements, and effective methods for longitudinal ordinal data are lacking.

This study therefore proposes a rank-based method for two-factor longitudinal ordinal data and constructs a nonparametric rank statistic for each effect from the graphical pattern that the profiles show under the null hypothesis. The method is intended for repeated measurements from complex experimental designs, analyzing not only the main effect of each factor but also the interaction between them. Its advantage is that it imposes no distributional constraints and so applies more widely. We hope it offers new ideas, approaches, and strategies for analyzing longitudinal ordinal data and for overcoming the limitations of existing methods.

Methods

The study has two parts. The first part is the theoretical derivation of the nonparametric method. The profile pattern of the data under $H_0$ is expressed as a mathematical formula.
We construct a statistic for the interaction effect, a main-effect statistic for the case where the interaction is not significant, and a main-effect statistic for the case where the interaction is significant. We derive the distributions of these statistics, calculate the corresponding degrees of freedom, and derive a "correction coefficient" for the tied ranks that arise with categorical data. The second part validates the nonparametric method and compares it with an alternative. We randomly generate simulated data under set parameters and estimate the type I error rate α and the power 1-β under different parameter settings. We then introduce the commonly used rank-transformed repeated-measures ANOVA, compare it with the proposed nonparametric method, discuss the impact of sample size on the two methods, and illustrate the advantages and disadvantages of each.

The first part: theoretical derivation and construction of the statistics.

Statistic for the interaction effect. Under the null hypothesis $H_0$, the longitudinal data have no interaction effect: any treatment effect that exists follows the same trend over time in every group. Accordingly, the group profiles in a trend plot coincide or are parallel. Because the differences between the levels of categorical data have no quantitative magnitude, the data are first converted to ranks, and the parallelism is then expressed as a formula. The way the ranks are assigned determines how the statistic is constructed. The statistic describing the interaction therefore expresses parallel trend lines, that is, $R_{1\cdot 2} - R_{2\cdot 2} = R_{1\cdot 3} - R_{2\cdot 3} = \dots = R_{1\cdot T} - R_{2\cdot T} = 0$, where $R_{g1t}, R_{g2t}, \dots, R_{gNt}$ are the ranks of the samples $\Delta_{g1t}, \Delta_{g2t}, \dots, \Delta_{gNt}$ and $N$ is the sample size.
Here $\Delta_{git} = Y_{git} - Y_{gi,t-1}$, where $g$ denotes the group, $i$ the individual, and $t$ the time. If the probability of observing this parallel relationship is small, $H_0$ is rejected. On this basis, and by the central limit theorem, we calculate the expectation and variance of these expressions, then standardize and square them to obtain a statistic and the distribution it follows.

Rank tests on longitudinal ordinal data usually need a correction. Its main purpose is to offset the effect on the statistic of the reduced variance of the random variable caused by many tied ranks. In other words, when many tied ranks occur, the overall mean of the test statistic does not change but its variance becomes smaller, and so the statistic changes. In medical statistics the adjustment for this is called a "correction". We therefore correct the denominator of the statistic and derive the "correction coefficient", where $\omega_{tk}$ is the number of the $N$ ranks at time $t$ that share the tied rank $k$. This gives the interaction test statistic. The test statistics for the treatment effect and the time effect are obtained similarly. Because the graphical pattern of a main effect differs according to whether the interaction is statistically significant, two versions of each main-effect test statistic are given, one for each case, and each is also corrected for tied ranks.

Treatment effect test statistic. When the interaction effect is not statistically significant, one statistic is used, and when there are many tied ranks it is adjusted by the derived "correction coefficient". When the interaction effect is statistically significant, a different statistic is used, again adjusted by the derived "correction coefficient" when there are many tied ranks.

Time effect test statistic.
When the interaction effect is not statistically significant, one statistic is used, and when there are many tied ranks it is adjusted by the derived "correction coefficient". When the interaction effect is statistically significant, a different statistic is used, again adjusted by the derived "correction coefficient" when there are many tied ranks.

The second part: simulation-based validation and evaluation of the method.

This part compares the proposed nonparametric rank-based approach with the method now most commonly used for longitudinal categorical data, rank-transformed repeated-measures analysis of variance. On the one hand, by comparing normally distributed and uniformly distributed data with different standard deviations and correlation coefficients, we investigate how the distribution type, the size of individual variability, and the strength of correlation between repeated observations affect the two methods. On the other hand, for uniformly distributed data, we vary the sample size to study the stability of the two methods. The simulations yield performance measures for the proposed method and a comparison between the two methods.

Results

Type I error rate α:

1) Test for the interaction: when the sample size n<30, the α of the nonparametric method is on the small side, and the bias disappears as the sample size increases. The standard deviation and the correlation have very little, almost negligible, influence on the interaction test.

2) Test for the treatment effect: when the interaction is not significant, the nonparametric method is more stable the smaller the standard deviation, and its results fluctuate more as the standard deviation grows. This is the opposite of what is seen for analysis of variance. When the interaction is significant, the α of the nonparametric method is on the small side for n<30, and the bias disappears as the sample size increases.
As the standard deviation increases, the α of the analysis of variance fluctuates more, suggesting that rank-transformed analysis of variance is less stable when individual variation is large.

3) Test for the time effect: when the interaction is not significant and both the correlation coefficient and the sample size are small, the α of the nonparametric method is also on the small side, biased relative to the significance level of 0.05. The bias disappears as the sample size increases. As the correlation coefficient increases, analysis of variance is biased even with larger samples, which suggests that rank-transformed analysis of variance should be used with caution for repeated measurements with small inter-individual variation and strong correlation. When the interaction is significant and the correlation coefficient is small, the α of the nonparametric method is occasionally on the small side, and this weakens as the sample size increases. When the correlation coefficient is larger, the bias appears only for n<30 and disappears as the sample size increases.

Power 1-β:

1) For the interaction effect, with small samples the power of the two methods is very similar, with the nonparametric method slightly better than analysis of variance. As the sample size increases, especially for n>60, the advantage of the nonparametric method becomes more obvious. Its power increases with sample size and essentially stabilizes at a sample size of 100, approaching 90%. The effect of the correlation coefficient on the method is approximately negligible.

2) For the treatment effect, whether or not the interaction is significant, the nonparametric method is superior to analysis of variance at every sample size, and the larger the variance, the larger the difference in power between the two methods.
3) For the time effect, when the interaction is not significant and the sample size is small, analysis of variance has higher power than the nonparametric method. As the sample size increases, particularly for n>60, the power of the two methods is essentially identical. When the interaction is significant, analysis of variance always has higher power than the nonparametric method at every sample size, but the gap narrows as the sample size increases, and the power of both methods stabilizes for n>60.

Conclusion

1) The nonparametric rank-based method can analyze longitudinal data without restrictions on the data distribution, and is especially suited to non-normally distributed data such as longitudinal ordinal data.

2) The nonparametric rank-based method establishes a separate test statistic for the interaction. It can therefore analyze the interaction effectively and draw statistical inference about it, improving on earlier methods that mix the effects together and cannot give a result for each effect separately.

3) The nonparametric rank-based approach is built on the central limit theorem, so it should be applied with n>30. For n>60 the results are more robust.

4) The power of the nonparametric rank-based method for testing the treatment effect is not ideal, and this can be improved in later studies.

5) Rank-transformed repeated-measures analysis of variance is currently the more common way to handle longitudinal categorical data. However, when the data are approximately uniformly distributed across the categories, the method is affected by the size of individual variation and the strength of correlation between repeated observations, so it should be used cautiously.
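The ranking step described in the Methods can be illustrated with a small Python sketch. It is not the thesis's implementation and does not compute the test statistics, whose formulas are not reproduced in this abstract. It only shows the ingredients the abstract names: increments between consecutive time points, ranks of those increments at each time point, group mean ranks whose difference should be near zero under parallel profiles, and counts of tied ranks. Several choices here are assumptions: the function names (`simulate_ordinal`, `midranks`, `rank_increments`) are hypothetical, ranks are pooled across both groups at each time point, ties receive average ranks, and the simulated data come from a latent normal variable with a shared subject effect cut into ordered categories.

```python
import random
from collections import Counter


def simulate_ordinal(n_per_group, n_times, rho, sd, cuts, rng):
    """Simulate two groups of ordinal series under no group difference.

    Assumption: a latent normal value with a shared subject effect gives
    correlation rho between repeated observations; cut points turn the
    latent value into ordered categories 1..len(cuts)+1.
    """
    data = []
    for group in (1, 2):
        for _ in range(n_per_group):
            subject = rng.gauss(0, 1)
            series = []
            for _ in range(n_times):
                noise = rng.gauss(0, 1)
                latent = sd * (rho ** 0.5 * subject + (1 - rho) ** 0.5 * noise)
                series.append(1 + sum(latent > c for c in cuts))
            data.append((group, series))
    return data


def midranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average
        start = end + 1
    return ranks


def rank_increments(data):
    """Rank the increments Y_t - Y_(t-1) at each time step.

    data is a list of (group, series) pairs. For each step the result holds
    the pooled midranks, the mean rank per group, and the tie counts
    (how many observations share each rank value).
    """
    groups = [g for g, _ in data]
    deltas = [[y[t] - y[t - 1] for t in range(1, len(y))] for _, y in data]
    results = []
    for step in range(len(deltas[0])):
        ranks = midranks([d[step] for d in deltas])
        mean_rank = {}
        for g in sorted(set(groups)):
            own = [r for r, gg in zip(ranks, groups) if gg == g]
            mean_rank[g] = sum(own) / len(own)
        results.append({"ranks": ranks, "mean_rank": mean_rank, "ties": Counter(ranks)})
    return results


if __name__ == "__main__":
    rng = random.Random(2015)
    sample = simulate_ordinal(30, 4, rho=0.5, sd=1.0, cuts=[-0.5, 0.0, 0.5], rng=rng)
    for t, res in enumerate(rank_increments(sample), start=2):
        diff = res["mean_rank"][1] - res["mean_rank"][2]
        print(f"t={t}: mean-rank difference (group 1 - group 2) = {diff:.2f}")
```

With ordinal outcomes the increments take only a few distinct values, so the tie counts are typically large, which is the situation the "correction coefficient" in the abstract is meant to address.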
Keywords/Search Tags: ordered longitudinal data, ranks, non-parametric method, Central Limit Theorem