Research On Fisher Score-based Algorithms For Feature Selection

Posted on:2023-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:M Gan

GTID:2558306629974579

Subject:Computer technology

Abstract/Summary:

With the advent of the big data era, data is updated quickly and is highly diverse. Extracting effective features from data is an essential step in data preprocessing. As one of the main methods for dimensionality reduction, feature selection aims to select useful features from the original data and form an optimal feature subset with good discriminative ability for subsequent learning tasks. Fisher score is a filter method for supervised feature selection that seeks the optimal feature subset by minimizing the within-class scatter and maximizing the between-class scatter. Fisher score does not depend on any learning model; thus, it can be used as a data preprocessing method for any classifier. However, Fisher score calculates the within-class and between-class scatters from a global perspective and ignores the local structure of the data, so it performs poorly on manifold data. In addition, Fisher score cannot handle incremental learning. To address these issues, this thesis studies Fisher score for feature selection. The main work is summarized as follows:

(1) This thesis proposes an iteratively local Fisher score (ILFS) for feature selection. To address the issue that Fisher score ignores the local structure of data, ILFS calculates the within-class and between-class local scatters of the data by constructing two local nearest-neighbor graphs. At the same time, ILFS iteratively searches for the optimal feature subset by considering the correlation between features. In each iteration, ILFS selects from the candidates the feature that is most correlated with the current feature subset. Experimental results show that the feature subset obtained by ILFS achieves better classification performance in subsequent classification tasks.

(2) This thesis presents a Q-learning-based Fisher score (QLFS) algorithm for feature selection. Fisher score cannot be applied to incremental learning, which means that it must be rerun whenever new samples are collected. Therefore, this thesis introduces Q-learning, a reinforcement learning algorithm, and combines it with Fisher score to propose QLFS, a new reinforced feature selection algorithm that can handle incremental learning. In each iteration (episode), QLFS learns from a small number of samples in batch mode, so it does not process all samples at once. When new samples arrive, QLFS can directly update its existing strategy based on them. Experimental results show that QLFS maintains a stable learning speed and achieves better classification performance when the same number of features is selected.

(3) This thesis proposes a dynamic Q-learning-based Fisher score (DQLFS) for feature selection. Because QLFS cannot control the size of the feature subset well, this thesis proposes DQLFS, which addresses this problem by combining QLFS with an arbitrary classifier. DQLFS uses two agents, one managing feature selection and one deciding when to stop selecting, so as to dynamically choose the optimal feature subset in various situations. Experimental results show that DQLFS selects fewer features and achieves better classification results in subsequent classification tasks.
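
For orientation, the plain (global) Fisher score that the thesis builds on can be sketched as follows. The abstract gives no formula, so this assumes the commonly used per-feature form: the class-size-weighted squared distance of each class mean from the overall mean, divided by the class-size-weighted within-class variance. The function names `fisher_score` and `select_top_k` and the small `eps` guard are illustrative choices, not taken from the thesis, and the sketch covers neither ILFS, QLFS, nor DQLFS.

```python
import numpy as np


def fisher_score(X, y, eps=1e-12):
    """Per-feature Fisher score (assumed textbook form, not the thesis's code).

    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])  # between-class scatter per feature
    within = np.zeros(X.shape[1])   # within-class scatter per feature
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # eps avoids division by zero for features that are constant within classes
    return between / (within + eps)


def select_top_k(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_score(X, y)
    return np.argsort(scores)[::-1][:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    X = np.array([[0.0, 5.0], [0.1, 1.0], [1.0, 5.0], [1.1, 1.0]])
    y = np.array([0, 0, 1, 1])
    print(fisher_score(X, y))
    print(select_top_k(X, y, 1))
```

Because each feature is scored independently from class-level means and variances over the whole data set, this form reflects the two limitations named in the abstract: it takes no account of local neighborhood structure, and it has to be recomputed from scratch when new samples arrive.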

Keywords/Search Tags:

Feature Selection, Fisher Score, Q-learning, Reinforced Feature Selection

Related items

1	Multi-Label Feature Selection Algorithms Based On Fisher Score
2	A Research Of Feature Selection Methods Based On Fisher Score And Genetic Algorithm
3	Improved Fisher Score And Hyper-heuristic Differential Evolution Feature Selection Method
4	Research On Feature Selection Method Of Radar Signal Recognition
5	Research On New Feature Selection Algorithm
6	Research On Feature Selection Based On Multi-Objective Optimization
7	Application Of Image Classification Based On Fisher Vector With Feature Selection
8	Feature Selection Methods For Image Classification
9	Feature Selection Mechanism For Multimodal Social Media Data With Privacy Protection
10	Research On Optimization Algorithm Based On Machine Learning In Intrusion Detection