| Rule extraction from high-dimensional uncertain data and classification of high-dimensional unbalanced data are two challenging problems in information technology. As a typical dimensionality reduction method, attribute reduction is one of the hot topics in rough-set-based granular computing. Attribute reduction can effectively remove redundant attributes from an information system and shrink the knowledge space, and it can be applied to high-dimensional fuzzy rule extraction and unbalanced data classification in data mining. Building on rough set attribute reduction, this paper focuses on two kinds of reduction methods: one obtains a single reduct in high-dimensional fuzzy systems, and the other dynamically obtains multiple reducts using a binary discernibility matrix. We apply them to the extraction of non-redundant fuzzy decision rules and of multiple brain functional connection pathways, respectively. Finally, the two kinds of reduction methods are combined with random forests to construct an ensemble reduced forest, which is used to classify high-dimensional unbalanced data. The main contents of this paper are summarized as follows: (1) An interval type-2 fuzzy rough set single-reduction algorithm and an associated fuzzy rule extraction algorithm are proposed. The Gaussian kernel function is introduced into the interval type-2 fuzzy rough set to construct the fuzzy similarity relation. Key concepts of the Gaussian-kernel-based interval type-2 fuzzy rough set, such as the upper and lower approximations and the positive region, are defined, and a single-reduction algorithm is designed. Three theorems for extracting non-redundant rules are proved, which guarantee that the decision rules extracted by the proposed fuzzy rule extraction algorithm are non-redundant. The experimental results show that the proposed algorithm outperforms typical algorithms in both the size of the reduction subset and classification accuracy, and can extract
non-redundant decision rules. (2) A dynamic multi-reduction algorithm based on the binary discernibility matrix is proposed. A reduction equivalence theorem is stated and proved to guarantee the correctness of the algorithm. A dynamic update mechanism shrinks the binary discernibility matrix during attribute reduction, which reduces the amount of computation. The proposed multi-reduction algorithm is also applied to brain data, and three functional connection pathways related to text-image cognition are successfully extracted from brain cognitive functional magnetic resonance imaging data. (3) An ensemble reduced forest algorithm for unbalanced data classification is proposed. Attribute reduction is combined with random forests: the full attribute set is replaced by the reduction results, and a preferred selection strategy is used to improve the classification accuracy and the true negative rate. With multiple reduction results, the ensemble reduced forest can classify in different knowledge granularity spaces. Combined with the SMOTE over-sampling algorithm, the classification performance of the ensemble reduced forest is improved at both the algorithm level and the data level. Finally, a rectified combination voting mechanism is proposed for the voting stage to counter the drop in accuracy after SMOTE. The experimental results show that the proposed ensemble reduced forest significantly improves classification performance compared with typical methods. |
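
To make the idea of reduction over a binary discernibility matrix in contribution (2) more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a small discrete decision table and uses a plain greedy covering heuristic, so the attribute set it returns discerns all object pairs but is not guaranteed to be a minimal reduct. The function names `binary_discernibility_matrix` and `greedy_reduct` and the toy data are hypothetical. Dropping covered rows after each selection only illustrates, in spirit, how the matrix can shrink as reduction proceeds.

```python
from itertools import combinations


def binary_discernibility_matrix(rows, decisions):
    """Build one matrix row per object pair with different decisions.

    Entry k is 1 when condition attribute k distinguishes the pair.
    """
    matrix = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            bits = [int(a != b) for a, b in zip(rows[i], rows[j])]
            if any(bits):  # all-zero rows come from inconsistent pairs; skip
                matrix.append(bits)
    return matrix


def greedy_reduct(matrix, n_attrs):
    """Greedy heuristic (an assumption, not the paper's method).

    Repeatedly pick the attribute that discerns the most remaining pairs,
    then delete the rows it covers so the matrix shrinks each round.
    """
    selected = []
    remaining = matrix
    while remaining:
        counts = [sum(row[k] for row in remaining) for k in range(n_attrs)]
        best = max(range(n_attrs), key=counts.__getitem__)
        selected.append(best)
        remaining = [row for row in remaining if row[best] == 0]
    return sorted(selected)


# Hypothetical toy decision table: three condition attributes per object.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
decisions = [0, 1, 0, 1]  # here the decision simply equals attribute 1

matrix = binary_discernibility_matrix(rows, decisions)
reduct = greedy_reduct(matrix, n_attrs=3)
print(reduct)  # [1]
```

In this toy table attribute 1 alone separates every pair of objects with different decisions, so the heuristic stops after one selection and the remaining matrix is empty.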