On The Spectral Properties Of Large-dimensional Spiked Population Model

Posted on: 2016-12-28  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Q W Wang
GTID: 1220330464972379  Subject: Probability theory and mathematical statistics
Abstract/Summary:
In this thesis, we establish asymptotic properties of large-dimensional sample covariance matrices, with emphasis on the spiked population model: the central limit theorem for linear spectral statistics, and the almost sure limits and asymptotic distributions of the extreme eigenvalues and eigenvectors that correspond to spikes.

The first chapter reviews background and existing results in random matrix theory, especially for large-dimensional sample covariance matrices, the matrix ensemble studied throughout this thesis. We are particularly interested in one structure, the spiked population model, which has attracted considerable attention in recent years.

The second chapter addresses a hypothesis testing problem on the population structure. Suppose we have a sample $Y_1, \dots, Y_n$ of $p$-dimensional observations. We want to test the following hypothesis (the sphericity test): $H_0: \Sigma_p = \delta^2 I_p$ vs $H_1: \Sigma_p \neq \delta^2 I_p$, where $\delta^2$ is unspecified. Under the null hypothesis, the components of each observation $Y_i$ are uncorrelated and have the same variance.

Let $S_n = n^{-1}\sum_i Y_i Y_i^*$ be the sample covariance matrix of the $Y_i$, with eigenvalues $\{l_i\}_{1\le i\le p}$. In the classical large-sample asymptotics and under an additional Gaussian assumption, there are two methods for the sphericity test: the likelihood ratio test (LRT), whose test statistic is denoted $L_n$, and the test proposed by John in 1971 (John's test), whose test statistic is
\[ T_2 = \frac{p^2 n}{2}\,\mathrm{tr}\bigl\{S_n(\mathrm{tr}\,S_n)^{-1} - p^{-1} I_p\bigr\}^2 . \]
Under the null hypothesis, $-2\log L_n \xrightarrow{D} \chi^2_f$ and $T_2 \xrightarrow{D} \chi^2_f$, where $\chi^2_f$ is the chi-square distribution with $f = \frac{1}{2}p(p+1) - 1$ degrees of freedom.

It is well known that classical multivariate procedures are in general challenged by large-dimensional data. The goal of this chapter is therefore to propose corrections to both the LRT and John's test that cope with the large-dimensional context.
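As an illustration of the two classical statistics, the following is a minimal sketch for real-valued data. The abstract does not write out the LRT statistic, so the sketch assumes the standard Gaussian form $L_n = \bigl((l_1\cdots l_p)^{1/p} / (p^{-1}\sum_i l_i)\bigr)^{pn/2}$; the function name `sphericity_statistics` and the requirement $n > p$ (so that all sample eigenvalues are positive) are assumptions of this sketch, not part of the thesis.

```python
import numpy as np

def sphericity_statistics(Y):
    """Classical sphericity statistics for a real (n, p) data matrix Y.

    Rows of Y are the observations Y_1, ..., Y_n.
    Returns (-2 log L_n, T_2, f), with f the chi-square degrees of freedom.
    """
    n, p = Y.shape
    if n <= p:
        raise ValueError("this sketch assumes n > p so that log(l_i) is finite")
    S = Y.T @ Y / n                       # S_n = n^{-1} sum_i Y_i Y_i^T
    l = np.linalg.eigvalsh(S)             # sample eigenvalues l_1, ..., l_p
    # Assumed standard LRT form: log L_n = (pn/2) * (mean(log l) - log(mean l))
    log_Ln = 0.5 * p * n * (np.mean(np.log(l)) - np.log(np.mean(l)))
    # John's statistic: T_2 = (p^2 n / 2) * tr{ S/tr(S) - I/p }^2
    T2 = 0.5 * p**2 * n * np.sum((l / l.sum() - 1.0 / p) ** 2)
    f = p * (p + 1) // 2 - 1              # degrees of freedom of the chi-square limit
    return -2.0 * log_Ln, T2, f

rng = np.random.default_rng(0)
stat_lrt, stat_john, dof = sphericity_statistics(rng.standard_normal((200, 5)))
print(stat_lrt, stat_john, dof)
```

Both statistics depend on the data only through the eigenvalues of $S_n$ and are invariant to rescaling of the observations, which matches the fact that $\delta^2$ is left unspecified under $H_0$.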
We derive the asymptotic normality of both test statistics under the null hypothesis in the "large $p$, large $n$" framework:
\[ -\frac{2}{n}\log L_n + (p-n)\log\Bigl(1-\frac{p}{n}\Bigr) - p \xrightarrow{D} N\Bigl\{-\frac{\kappa-1}{2}\log(1-y)+\frac{1}{2}\beta y,\; -\kappa\log(1-y)-\kappa y\Bigr\} \]
and
\[ \frac{2}{p}T_2 - p \xrightarrow{D} N(\kappa+\beta-1,\; 2\kappa), \]
where $y$ is the limit of $p/n$, $\kappa$ is an index indicating whether the observations are real or complex, and $\beta = E|x_{ij}|^4 - 1 - \kappa$. Notably, these two asymptotic distributions are universal in the sense that they depend on the distribution of the observations only through its first four moments.

The third chapter is devoted to an asymptotic expansion of the centering term that appears in the central limit theorem for linear spectral statistics of large-dimensional sample covariance matrices under the spiked population model. In this model the eigenvalues of the population covariance matrix $\Sigma_p$ consist of finitely many spikes $a_i$ together with the non-spiked eigenvalues; here $M$ and the multiplicities $(n_k)$ of the spikes are fixed and satisfy $n_1+\dots+n_k=M$. Since $M$ is fixed, the limiting spectral distribution (LSD) of the corresponding sample covariance matrix $S_n$ is the same as in the non-spiked case, namely the one associated with the population limit $\delta_1$, the Dirac mass at 1. Now consider its linear spectral statistics $X_n(f) = p\{F^{S_n}(f) - F^{y_n,H_n}(f)\}$. Existing theory says that $(X_n(f_1),\dots,X_n(f_k))$ converges weakly to a Gaussian random vector $(X_{f_1},\dots,X_{f_k})$ whose mean and covariance functions are fully identified and depend only on the LSD of $S_n$, not on the spikes $a_i$. The centering term $pF^{y_n,H_n}(f)$ depends on a particular distribution $F^{y_n,H_n}$, a finite-horizon proxy for the LSD of $S_n$. The difficulty is that $F^{y_n,H_n}$ has no explicit form. The purpose of this chapter is therefore to give an asymptotic expansion of this term.

As an application, we recall the sphericity test of the second chapter, for which the two asymptotic distributions were derived under the null hypothesis. However, their power functions remain unknown because the distributions of the two statistics under the general alternative $H_1$ are ill-defined.
Therefore, we restrict the sphericity test to the following: $H_0: \Sigma_p \propto I_p$ vs $H_1^*: \Sigma_p \propto$ the structure in (0.0.1). Using the asymptotic expansion established in this chapter, we obtain explicit expressions for the power functions of both tests.

In chapter four, we derive a joint central limit theorem for several random sesquilinear forms. More specifically, for $l = 1, \dots, K$, we define sesquilinear forms $U(l)$ and $V(l)$, where $X(l)$ and $Y(l)$ are $n$-dimensional vectors and $A_n$ and $B_n$ are two different Hermitian (or symmetric) matrices; both the vectors and the matrices are subject to some additional restrictions. We establish the joint normality of the random vector $(U(1),\dots,U(K),V(1),\dots,V(K))$. The proof establishes the central limit theorem for linear combinations of these random Hermitian sesquilinear forms by the moment method.

As applications, we consider two problems related to the spiked population model:
(1) the asymptotic joint distribution of two groups of extreme sample eigenvalues that correspond to two different population spikes;
(2) the asymptotic joint distribution of the largest sample eigenvalue and its corresponding sample eigenvector projection.
As a special case, the marginal distributions of these two results reduce to the known results on the almost sure limit or the central limit theorem for one group of extreme sample eigenvalues or eigenvectors (corresponding to the same population spike); see [7], [8], [10], [11] and [50].

Chapter five summarizes the thesis.
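The large-dimensional corrections of chapter two, and the restricted spiked alternative used for the power study, can be explored numerically with a sketch like the one below. It is an illustration under stated assumptions rather than the thesis's own procedure: the data are real-valued, $\kappa = 2$ and $\beta = 0$ are assumed defaults for real Gaussian entries, the limit $y$ is replaced by $p/n$, the LRT statistic is taken in its standard Gaussian form, and the names `corrected_sphericity_zscores` and `sample_spiked` are hypothetical.

```python
import numpy as np

def corrected_sphericity_zscores(Y, kappa=2.0, beta=0.0):
    """Standardized corrected LRT and John statistics for a real (n, p) matrix Y.

    Assumptions of this sketch: kappa=2, beta=0 (real Gaussian entries),
    p < n, and y approximated by p/n.
    """
    n, p = Y.shape
    if p >= n:
        raise ValueError("this sketch assumes p < n")
    y = p / n
    l = np.linalg.eigvalsh(Y.T @ Y / n)          # eigenvalues of S_n
    # -(2/n) log L_n for the assumed standard LRT form
    lrt = -p * (np.mean(np.log(l)) - np.log(np.mean(l)))
    # John's statistic T_2 = (p^2 n / 2) tr{ S/tr(S) - I/p }^2
    T2 = 0.5 * p**2 * n * np.sum((l / l.sum() - 1.0 / p) ** 2)
    # Limiting mean and variance of the centered LRT statistic
    lrt_mean = -(kappa - 1.0) / 2.0 * np.log(1.0 - y) + 0.5 * beta * y
    lrt_var = -kappa * np.log(1.0 - y) - kappa * y
    z_lrt = (lrt + (p - n) * np.log(1.0 - y) - p - lrt_mean) / np.sqrt(lrt_var)
    # Limiting law of (2/p) T_2 - p is N(kappa + beta - 1, 2 kappa)
    z_john = (2.0 * T2 / p - p - (kappa + beta - 1.0)) / np.sqrt(2.0 * kappa)
    return z_lrt, z_john

def sample_spiked(n, p, spikes, rng):
    """Gaussian sample with population covariance diag(spikes..., 1, ..., 1)."""
    scale = np.ones(p)
    scale[: len(spikes)] = np.sqrt(spikes)
    return rng.standard_normal((n, p)) * scale

rng = np.random.default_rng(0)
print(corrected_sphericity_zscores(rng.standard_normal((200, 50))))       # null
print(corrected_sphericity_zscores(sample_spiked(200, 50, [6.0], rng)))   # spiked
```

Comparing each z-score with a standard normal quantile gives a one-sided test of approximately the nominal level under $H_0$; repeating the second call over many samples gives a Monte Carlo estimate of the power against a chosen spike configuration.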
Keywords/Search Tags: random matrix, sample covariance matrix, sphericity test, spiked population model, linear spectral statistics, extreme eigenvalue, extreme eigenvector, Stieltjes transform, moment method, central limit theorem, joint distribution