Multivariate outlier mining using cluster analysis: Case study - National Health Interview Survey

Posted on: 2011-11-01 | Degree: M.S. | Type: Thesis
University: Duquesne University | Candidate: Sharker, Md Monir Hossain
GTID: 2448390002969178 | Subject: Applied Mathematics
Abstract/Summary:
Outlier mining is a fundamental problem in many statistical analyses, especially multivariate ones, because outliers may exert undue influence on the results. In most cases it is a challenge to reveal the pattern of the outliers and their “outlyingness”. Several approaches exist for detecting anomalous data points, but no single method suits every data set, especially when the dimension and volume of the data are high. In this thesis, I review distance-based clustering methods for multivariate outlier mining and demonstrate their usefulness in a medical setting. Specifically, I discuss hierarchical clustering and multivariate methods for determining the appropriate cluster(s). After mining the multivariate outliers, I examine and describe the characteristics of the variables for those outliers. Finally, I demonstrate the application of these methods using the National Health Interview Survey (NHIS) 2008 database to study adolescent obesity.
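As a rough illustration of the idea of mining outliers with distance-based hierarchical clustering, the sketch below flags observations that end up in very small clusters. It is not the procedure used in the thesis: the average linkage, the Euclidean metric, the column standardization, the cluster count, the size threshold, the function name `small_cluster_outliers`, and the synthetic data (which stands in for NHIS records) are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def small_cluster_outliers(X, n_clusters=3, max_size=2, method="average"):
    """Flag rows that land in clusters with at most `max_size` members.

    Hypothetical helper: the thesis abstract does not specify linkage,
    metric, or cut rule, so these defaults are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)

    # Standardize each column so no single variable dominates the distance.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - X.mean(axis=0)) / std

    # Build the agglomerative tree on Euclidean distances.
    tree = linkage(Z, method=method, metric="euclidean")

    # Cut the tree into at most `n_clusters` flat clusters (labels start at 1).
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # Points in tiny clusters are treated as candidate outliers.
    sizes = np.bincount(labels)
    return sizes[labels] <= max_size, labels


# Synthetic stand-in data: 100 ordinary points plus two planted extremes.
rng = np.random.default_rng(0)
bulk = rng.normal(0.0, 1.0, size=(100, 3))
planted = np.array([[8.0, 8.0, 8.0], [-9.0, 9.0, -9.0]])
X = np.vstack([bulk, planted])

mask, labels = small_cluster_outliers(X, n_clusters=3, max_size=2)
print("candidate outlier rows:", np.flatnonzero(mask))
```

On real survey data the cluster count and size threshold would need to be chosen with care, for example by inspecting the dendrogram, and the flagged rows would then be examined variable by variable as the abstract describes.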
Keywords/Search Tags:Multivariate, Mining, Outliers, Methods, Data