Some topics in multivariate data analysis with special reference to rank order data
Posted on: 2005-10-20 | Degree: Ph.D. | Type: Thesis | University: Rensselaer Polytechnic Institute | Candidate: Sol, Sinan | GTID: 2458390008980846 | Subject: Statistics

Abstract/Summary:

Rank order data are common in social science research. There is a need for straightforward multivariate methodologies that can be applied to such data sets, yet the research to date has produced a disparate mix of methodologies and results. The research presented in this study avoids many of the classical arguments against the analysis of ordinal scale data by reconsidering what the term "data" means for many types of multivariate analysis. The main idea is that in many multivariate analyses the correlation matrix is the raw data for the analysis (Richie & Raghavachari, 2000). Richie and Raghavachari used transformed rank correlation matrices in place of the Pearson correlation matrix. In some cases, however, the transformation procedure produced matrices that were not positive semi-definite. This research resolves the problem by finding the "closest" correlation matrix, formulating the search as a semi-definite programming (SDP) problem. The SDP-based approach is evaluated for principal component analysis and canonical correlation.

For principal component analysis, the results of this study agree with those of Richie (2000). The transformed Kendall and Spearman matrices perform better than the non-transformed matrices, and the results they give are very similar to those obtained from Pearson correlation matrices.

Non-positive semi-definiteness (NPSD) is more widely observed and therefore a more significant problem for canonical correlation applications.
The results obtained from the SDP-based method show several significant differences from what Richie (2000) found.

The concept of concordance is the main topic of the second part of the research. Q was introduced by Raghavachari (2002). In this thesis, the case of m independent normal populations with the same variance but different means is investigated.

The Concordance Correlation Coefficient (CCC) was proposed by Lin (1989) to measure the agreement between two sets of observations. Q is proposed here as an alternative to the CCC, and the two are compared in this study. Asymptotic power considerations suggest that Q can be used as an alternative to the CCC.

Raghavachari (2002) proposed Wtau, a statistic for rank order data associated with Kendall's tau. To gather more information on Wtau, its exact distribution is given here for up to 6 judges and 3 subjects. For larger numbers of subjects, a function of Wtau called the S statistic is proposed. It is shown that the asymptotic distribution of S can be approximated by a chi-square distribution with (n-1) degrees of freedom when the rankings are independent. The correlation coefficient between Wtau and W is derived. It is shown that this correlation is independent of the number of judges and tends to unity as the number of subjects goes to infinity.

Keywords/Search Tags: Data, Order, Multivariate, Correlation, Rank
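
The NPSD problem described in the abstract can be illustrated with a small sketch. The abstract does not state which transformation was applied to the rank correlations, so the code below assumes the familiar relation sin(pi * tau / 2), which links Kendall's tau to the Pearson correlation under bivariate normality. The function names (kendall_tau, transformed_tau_matrix, is_positive_definite) are hypothetical and are not taken from the thesis. The check uses a Cholesky attempt, which tests strict positive definiteness only; it detects the problem but does not perform the SDP repair, which would need a semi-definite programming solver.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (assumes no ties)."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def transformed_tau_matrix(columns):
    """Matrix of sin(pi * tau / 2) over all pairs of variables.

    Assumption: this sine transform stands in for the unspecified
    transformation mentioned in the abstract.
    """
    p = len(columns)
    r = [[1.0] * p for _ in range(p)]
    for i, j in combinations(range(p), 2):
        tau = kendall_tau(columns[i], columns[j])
        r[i][j] = r[j][i] = math.sin(math.pi * tau / 2)
    return r


def is_positive_definite(m, tol=1e-12):
    """Return True if a Cholesky factorisation of symmetric m succeeds."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= tol:  # non-positive pivot: not positive definite
                    return False
                low[i][j] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return True


# Each inner list holds one variable's observations (illustrative data only).
columns = [[1, 2, 3, 4], [1, 3, 2, 4], [4, 3, 2, 1]]
matrix = transformed_tau_matrix(columns)
print(is_positive_definite(matrix))
```

A matrix that fails this check is a candidate for the "closest" correlation matrix step; one common way to state that step is to minimise the distance to the given matrix subject to a unit diagonal and positive semi-definiteness, though the exact formulation used in the thesis is not given in the abstract.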