Some topics in multivariate data analysis with special reference to rank order data

Posted on: 2005-10-20

Degree: Ph.D.

Type: Thesis

University: Rensselaer Polytechnic Institute

Candidate: Sol, Sinan

Full Text: PDF

GTID: 2458390008980846

Subject: Statistics

Abstract/Summary:

Rank order data is a common occurrence in social science research. There is a need for straightforward multivariate methodologies that can be applied to such data sets, yet the research to date has produced a disparate mix of methodologies and results. The research presented in this study avoids many of the classical arguments against the analysis of ordinal-scale data by reconsidering what the term "data" means for many types of multivariate analysis. The main idea is that in many multivariate analyses the correlation matrix is the raw data for the analysis (Richie & Raghavachari, 2000). Richie and Raghavachari used transformed rank correlation matrices in place of the Pearson correlation matrix. In some cases, however, the transformation procedure produced matrices that were not positive semi-definite. This research resolves the problem by obtaining the "closest" correlation matrix, formulating the search as a mathematical program known as a semi-definite program (SDP). The SDP-based approach is evaluated for principal component analysis and canonical correlation.

For principal component analysis, the results of this study agree with those of Richie (2000). The transformed Kendall and Spearman matrices perform better than the non-transformed matrices, and the results obtained from them are very similar to those obtained from Pearson correlation matrices.

Non-positive semi-definiteness (NPSD) is more widely observed, and therefore a more significant problem, in canonical correlation applications. The results obtained from the SDP-based method show several significant differences from what Richie (2000) found.

The concept of concordance is the main topic of the second part of the research. Q was introduced by Raghavachari (2002). In this thesis, the case of m independent normal populations with the same variance but different means is investigated.

The Concordance Correlation Coefficient (CCC) was proposed by Lin (1989) to measure the agreement between two sets of observations. Q is proposed here as an alternative to the CCC, and the two are compared in this study. Asymptotic power considerations suggest that Q can be used as an alternative to the CCC. Raghavachari (2002) proposed Wtau for rank order data, associated with Kendall's tau. To gather more information on Wtau, its exact distribution is given here for up to 6 judges and 3 subjects. For larger numbers of subjects, a function of Wtau called the S statistic is proposed. It is shown that the asymptotic distribution of S can be approximated by a chi-square distribution with (n-1) degrees of freedom when the rankings are independent. The correlation coefficient between Wtau and W is derived. It is shown that this correlation is independent of the number of judges and tends to unity as the number of subjects goes to infinity.
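
The transformation step and the non-positive semi-definiteness problem mentioned in the first part of the abstract can be illustrated with a small sketch. The abstract does not say which transformation was used, so the code below assumes the classical bivariate-normal relations r = sin(pi * tau / 2) for Kendall's tau and r = 2 * sin(pi * rho / 6) for Spearman's rho. All function names are hypothetical, the tau values in the usage note are made up for illustration, and the SDP repair itself (which would need a solver) is not shown.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences with no ties."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        # +1 for a concordant pair, -1 for a discordant pair
        score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def tau_to_pearson_scale(tau):
    """Assumed transformation: r = sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2)


def rho_to_pearson_scale(rho):
    """Assumed transformation: r = 2 * sin(pi * rho / 6)."""
    return 2 * math.sin(math.pi * rho / 6)


def is_psd_3x3(r12, r13, r23, tol=1e-12):
    """Check positive semi-definiteness of a 3x3 unit-diagonal matrix.

    With off-diagonal entries in [-1, 1], the 1x1 and 2x2 principal
    minors are already non-negative, so only the determinant matters.
    """
    det = 1 + 2 * r12 * r13 * r23 - r12**2 - r13**2 - r23**2
    return det >= -tol


# Hypothetical pairwise tau values for three variables (illustration only).
taus = (0.7, 0.7, 0.0)
transformed = tuple(tau_to_pearson_scale(t) for t in taus)
untransformed_ok = is_psd_3x3(*taus)
transformed_ok = is_psd_3x3(*transformed)
```

With these illustrative values the untransformed matrix passes the check while the elementwise-transformed matrix does not, which is the kind of situation in which a "closest" correlation matrix would have to be computed.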

Keywords/Search Tags:

Data, Order, Multivariate, Correlation, Rank

Related items

1	Multi-user Multivariate Multi-order Markov Multi-modal Prediction And Its Efficient Computing
2	Efficient Visualization Of Multivariate Simulation Data
3	Simulation of an equivalent reduced order system from large, imprecise, and uncertain data system using multistage multivariate analysis and neuro fuzzy approach
4	Implementation And Analysis Of Rank Attack In Multivariate Public Key Cryptosystem
5	Method And Application Research On Association Analysis Of Multivariate Data Based On Maximal Joint Information Coefficient
6	Research And Application Of Multi-set Canonical Correlation Analysis Based On Low Rank Theory
7	Developments in rank correlation procedures for trend detection in the analysis of water quality parameters
8	Research On Time Series Data Analysis And Network Compression Based On Tensor Calculation
9	Application of multi-technique correlation and multivariate analysis to heterogeneous polymer systems
10	Research On Algorithm And Application For Low-Rank High-Order Tensor Recovery Based On Singular Value Decomposition