
Research And Application Of Cost Sensitive Feature Selection Algorithm

Posted on: 2020-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: C C Li
Full Text: PDF
GTID: 2404330596987328
Subject: Biomedical engineering
Abstract/Summary:
Feature selection is an important step in processing disease data. In practice, however, such data often suffer from small sample sizes, high feature dimensionality, imbalanced data sets, and a lack of distinction between subtypes. Because commonly used feature selection methods do not take these characteristics of disease data sets into account, they can overlook useful features. To address these issues, this thesis carries out the following work:

1. At the theoretical level, the statistics-based evaluation indicators in commonly used algorithms are not adapted to the characteristics of disease data. To solve this problem, a filter feature selection method based on a cost-sensitive model and the characteristics of disease data is proposed. Comparison with other commonly used algorithms on public data sets shows that the proposed algorithm generates a feature subset that improves classifier performance by choosing effective features and excluding redundant ones.

2. At the application level, the proposed algorithm is applied to our own speech data sets in order to find key features for recognizing depression in speech. The feature selection results indicate that the speech of patients with depression is characterized by slowness and hoarseness. Speech also differs across tasks, which leads to the conclusion that speech from interview and text-reading tasks is more effective for recognizing depression.

3. Based on the conclusions above, a new model for depression recognition is designed. By taking both features and speech into consideration, the classification rate of the new model reached 80.7%/74.7% (male/female) on the first-stage data set and 66.7%/67.8% (male/female) on the second-stage data set.

In summary, this thesis focuses on feature selection and proposes a method based on a cost-sensitive model and the characteristics of disease data. The algorithm is then applied to research on depression detection in speech. After feature selection and analysis of the speech data, a new model that takes both speech and features into consideration is proposed.
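The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of what a cost-sensitive filter with redundancy removal could look like, not the thesis's actual method. It assumes a Fisher-style score in which each class is weighted by a user-supplied misclassification cost, followed by a greedy pass that skips features highly correlated with ones already chosen. The function names, the `cost` dictionary, and the `max_corr` threshold are all hypothetical.

```python
import numpy as np


def cost_sensitive_scores(X, y, cost):
    """Score each feature with a class-cost-weighted Fisher-style ratio.

    `cost` maps each class label to a hypothetical misclassification cost;
    a larger cost makes that class count more in the score (an assumption
    made for illustration, not taken from the thesis).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        w = cost[c]
        # Weighted spread of the class mean around the overall mean.
        between += w * (Xc.mean(axis=0) - overall_mean) ** 2
        # Weighted variance inside the class.
        within += w * Xc.var(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


def select_features(X, y, cost, k, max_corr=0.9):
    """Pick up to k high-scoring features, skipping redundant ones.

    A feature is treated as redundant when its absolute Pearson
    correlation with an already selected feature exceeds `max_corr`.
    """
    X = np.asarray(X, dtype=float)
    scores = cost_sensitive_scores(X, y, cost)
    selected = []
    for j in np.argsort(scores)[::-1]:  # best score first
        if len(selected) == k:
            break
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) > max_corr
            for s in selected
        )
        if not redundant:
            selected.append(int(j))
    return selected
```

With an imbalanced data set, one might pass a higher cost for the minority (patient) class, for example `cost={0: 1.0, 1: 4.0}`, so that features separating that class are ranked higher. How the costs should actually be chosen is left open here.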
Keywords/Search Tags: Feature Selection, Cost Sensitive, Depression, Speech