
Ridgelized Hotelling Test On The Mean Vectors Of A Large Dimension

Posted on: 2021-01-08  Degree: Doctor  Type: Dissertation
Country: China  Candidate: G F Ha  Full Text: PDF
GTID: 1487306452998739  Subject: High-dimensional statistics
Abstract/Summary:
The accelerating development of science and technology has driven the evolution of big-data technology, and practitioners now frequently encounter massive data sets of many kinds. These include, but are not limited to, multimedia image and video data, securities market transaction data, and biometric data. Such data are often high-dimensional: the dimension is large while the sample size is comparatively small. Adapting classical testing procedures to high-dimensional data has therefore become increasingly important.

One of the most commonly used tests, Hotelling's T^2 test, has been widely applied to high-dimensional data, but it has several well-recognized limitations. It cannot handle data whose dimension p is equal to or greater than the sample size n, because the sample covariance matrix is singular. It also performs poorly when p<n with p/n close to unity (Bai and Saranadasa [1]): the higher the dimension, the less accurate classical statistical methods become. The surge of high-dimensional data, together with the inaccuracy or inapplicability of classical tests, has encouraged the development of new testing methods. Two notable examples are Dempster's non-exact test for the two-sample problem and Bai and Saranadasa's simplified test, which has asymptotic power equivalent to that of Dempster's test. New methods bring new limitations, however. For instance, both of these tests require that the eigenvalues not differ too widely, and the null distribution of the test statistics may be far from its asymptotic approximation if the sample size is small or the rate of missingness in the data is high.

The regularized Hotelling's test (RHT), proposed by Chen and Qin [2] in 2012 and based on the cross products (ridge methods), inspired us the most. It can be applied in both the p>n and p<n scenarios. Its limitation is that the underlying population distribution is assumed to be normal. Our aim is to remove this limitation and to find the asymptotic distribution in the non-normal case. To show the universality of the asymptotic law for the RHT, we propose a four moment theorem with fewer and simpler conditions, which is a simplified version of the work of Tao and Vu (2011) [3]. As an application, we establish the central limit theorem for the RHT, which allows the hypothesis test to be carried out without the normality assumption. Simulation studies show that our test is robust compared with the traditional Hotelling's test, as well as other tests.
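To make the contrast between the classical and the ridge-regularized statistic concrete, the following is a minimal one-sample sketch in Python with NumPy. It is not the dissertation's implementation. It assumes the commonly used ridge form n (x̄ − μ0)' (S + λI)^{-1} (x̄ − μ0), where S is the sample covariance matrix and λ > 0 is a tuning parameter; the function names and the parameter `lam` are hypothetical and chosen only for illustration. The sketch computes the statistic only and does not implement the null distribution or the central limit theorem discussed in the abstract.

```python
import numpy as np


def sample_covariance(X):
    """Unbiased sample covariance of an (n, p) data matrix, always 2-D."""
    return np.atleast_2d(np.cov(X, rowvar=False))


def hotelling_t2(X, mu0):
    """Classical one-sample Hotelling T^2 statistic.

    Requires an invertible sample covariance matrix, so it is only
    usable when p < n.
    """
    n, _ = X.shape
    diff = X.mean(axis=0) - mu0
    S = sample_covariance(X)
    # Solve S z = diff instead of forming the inverse explicitly.
    return float(n * diff @ np.linalg.solve(S, diff))


def ridge_hotelling(X, mu0, lam):
    """Ridge-regularized statistic (assumed form, for illustration only).

    Adding lam * I to S keeps the matrix invertible for lam > 0,
    even when p >= n and S itself is singular.
    """
    n, p = X.shape
    diff = X.mean(axis=0) - mu0
    S = sample_covariance(X)
    return float(n * diff @ np.linalg.solve(S + lam * np.eye(p), diff))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 10, 20                      # p > n: S is singular here
    X = rng.standard_normal((n, p))
    print(ridge_hotelling(X, np.zeros(p), lam=1.0))
```

With `lam=0.0` and p<n, `ridge_hotelling` reduces to `hotelling_t2`; for p≥n only the regularized version is well defined, which is the situation the abstract describes.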
Keywords/Search Tags: Random matrices, Hotelling's T^2 test, four moment theorem, central limit theorem