Constrained Variable Selection And Copula Feature Screening In High Dimensional Data

Posted on: 2024-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Xie
Full Text: PDF
GTID: 2530307145954439
Subject: Statistics
Abstract/Summary:
With the development of information technology, massive, high-dimensional data can now be obtained quickly and conveniently, and how to extract valuable information from such data has attracted a great deal of attention. At the same time, large datasets are easily contaminated by outliers or contain heavy-tailed variables, so classical statistical methods may no longer be applicable. It is therefore important, in both theory and application, to study statistical models for high-dimensional data. To handle structural information and outliers in high-dimensional data, we build on the smoothness-inducing fused lasso penalty and propose an adaptive Huber regression model with a fused lasso penalty, and we establish its nonasymptotic statistical properties. Exploiting the special structure of the model, we propose an alternating direction method of multipliers (ADMM) to solve the resulting nonsmooth convex optimization problem and analyze the convergence of the algorithm. We verify the effectiveness of the proposed method through numerical simulations in both low- and high-dimensional settings. Furthermore, we apply the model to three real genetic datasets and analyze its interpretability. To handle ultrahigh dimensionality, we design a model-free feature screening method based on semiparametric copula estimation and prove its sure screening property under some assumptions. Moreover, we verify the effectiveness of the proposed method on nonlinear and heavy-tailed data. Combining the screening method with penalized estimation methods and the coordinate descent algorithm, we analyze a real dataset and illustrate the interpretability of the screening method.
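For readers unfamiliar with the first model, the following is a minimal sketch of the kind of objective that a Huber regression with a fused lasso penalty minimizes. The abstract does not state the exact formulation, so everything here is an assumption: the function names `huber_loss` and `penalized_objective`, the robustification parameter `tau`, the tuning parameters `lam1` and `lam2`, and the use of the classical fused lasso penalty (an L1 term on the coefficients plus an L1 term on successive differences). In the adaptive Huber literature `tau` is usually allowed to grow with the sample size; here it is simply an input.

```python
import numpy as np

def huber_loss(r, tau):
    """Huber loss: quadratic for |r| <= tau, linear beyond (robust to outliers)."""
    a = np.abs(r)
    return np.where(a <= tau, 0.5 * r**2, tau * a - 0.5 * tau**2)

def penalized_objective(beta, X, y, tau, lam1, lam2):
    """Average Huber loss plus a fused lasso penalty (assumed classical form)."""
    residuals = y - X @ beta
    loss = huber_loss(residuals, tau).mean()
    sparsity = lam1 * np.abs(beta).sum()          # encourages zero coefficients
    fusion = lam2 * np.abs(np.diff(beta)).sum()   # encourages equal neighbors
    return loss + sparsity + fusion
```

Both penalty terms are nonsmooth, which is why a splitting method such as ADMM is a natural solver for an objective of this shape.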
Keywords/Search Tags: Huber function, Fused lasso, Alternating direction method of multipliers, Copula estimation, Sure screening property
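The screening step described in the abstract can be illustrated with a simple stand-in. The sketch below is not the thesis's semiparametric copula estimator, whose details the abstract does not give. It ranks features by the absolute Spearman rank correlation with the response, a dependence measure that, for continuous margins, depends on the data only through the copula and is therefore unaffected by monotone transformations and less sensitive to heavy tails. The names `ranks`, `rank_screen`, and the cutoff `d` are hypothetical.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n of a 1-D array (ties broken by order; fine for continuous data)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def rank_screen(X, y, d):
    """Return indices of the d features with the largest |Spearman rho| with y."""
    ry = ranks(y)
    ry = ry - ry.mean()
    utilities = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rx = ranks(X[:, j])
        rx = rx - rx.mean()
        # Pearson correlation of the centered ranks equals Spearman's rho
        utilities[j] = abs(rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry))
    return np.argsort(-utilities)[:d]
```

In a typical workflow, the retained columns `X[:, rank_screen(X, y, d)]` would then be passed to a penalized estimator, mirroring the screening-then-penalization pipeline the abstract mentions.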