Feature Selection Based On Multivariate Mixed Erlang Model

Posted on: 2022-01-22

Degree: Master

Type: Thesis

Country: China

Candidate: Z H Huang

GTID: 2568306323471744

Subject: Probability theory and mathematical statistics

Abstract/Summary:

Clustering is a common unsupervised learning technique used to discover the category structure in a set of data. Although many clustering algorithms exist, they rarely address feature selection, that is, which features of the data the clustering algorithm should use. Feature selection for clustering is harder than for supervised learning: the data carry no category labels, so there is no obvious criterion to guide the search. The number of clusters must also be determined at the same time, and this choice interacts with the feature selection problem. In this thesis, we use a multivariate mixed Erlang model with irrelevant features for feature selection. We first use the CMM algorithm to select high-quality initial values, and then use the GECM algorithm with feature saliency to fit the model parameters. A minimum message length (MML) criterion is then added, which drives the feature saliency of irrelevant features toward 0, in line with the purpose of feature selection. The resulting algorithm estimates the feature saliencies and the number of clusters simultaneously. Finally, the GECM-MML algorithm is applied to simulated and real data to verify it, and its feature selection results are compared with those of other models. The results show that both the fitting performance and the clustering performance of the model improve after feature selection with this algorithm, which effectively reduces the model's prediction error rate.
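The role of feature saliency can be illustrated with a small sketch. The abstract does not give the thesis's exact parametrization, so the code below assumes a commonly used feature-saliency mixture form: each feature follows a component-specific Erlang density with probability equal to its saliency, and otherwise a density shared by all components. The function names, the single shared scale parameter, and the `common_shapes` argument are all hypothetical choices made for illustration; this is not the GECM-MML algorithm itself.

```python
import math

def erlang_logpdf(x, shape, scale):
    """Log-density of an Erlang(shape, scale) distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - shape * math.log(scale) - math.lgamma(shape))

def saliency_mixture_loglik(data, weights, shapes, scale, saliency, common_shapes):
    """Log-likelihood of an assumed feature-saliency Erlang mixture.

    data          : list of rows, each a list of positive feature values
    weights       : mixing weight of each component (sums to 1)
    shapes        : shapes[j][l] = Erlang shape of feature l in component j
    scale         : scale shared by all densities (an assumption)
    saliency      : saliency[l] in [0, 1], relevance of feature l
    common_shapes : common_shapes[l] = shape of the shared density
                    used when feature l is irrelevant (an assumption)
    """
    total = 0.0
    for row in data:
        comp_terms = []
        for alpha, comp_shapes in zip(weights, shapes):
            log_term = math.log(alpha)
            for x, m, rho, m0 in zip(row, comp_shapes, saliency, common_shapes):
                # Relevant part depends on the component, irrelevant part does not.
                relevant = rho * math.exp(erlang_logpdf(x, m, scale))
                irrelevant = (1.0 - rho) * math.exp(erlang_logpdf(x, m0, scale))
                log_term += math.log(relevant + irrelevant)
            comp_terms.append(log_term)
        # Log-sum-exp over components for numerical stability.
        peak = max(comp_terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in comp_terms))
    return total
```

Under this assumed form, a feature whose saliency reaches 0 contributes the same factor to every component, so it no longer influences the cluster assignment. That is the sense in which driving saliencies toward 0 performs feature selection.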

Keywords/Search Tags:

Feature selection, Clustering, Multivariate Mixtures Erlang, Minimum message length, EM algorithm

Related items

1	Finite Mixtures Of Models, Nonlinear Two-Dimensional Principal Component Analysis And Their Applications To Pattern Classification
2	Research On Feature Selection Algorithm Of High-dimensional Data Based On Intelligent Optimization
3	Research On Text Clustering And Its Application In Topic Detection Analysis
4	The Research And Application Of Clustering Feature Selection Methods
5	Research On Feature Selection Algorithm Based On Maximum Weight And Minimum Redundancy
6	A Study On An Optimal Feature Selection Algorithm Using Minimum Joint Mutual Information Loss Criterion
7	Human appearance modeling in visual surveillance
8	Message Text Clustering Based On Frequent Patterns
9	Implementation And Improvement Of Multivariate Selected-subset Mean Feature Algorithm
10	Ib Theory And Parameter Validation Study