| With the growing scale of data and the speed at which it is generated, feature selection techniques can effectively alleviate the curse of dimensionality, but they still face the following challenges: (1) The number of samples is large. A single dataset may contain thousands or tens of thousands of samples, which makes model training time-consuming. (2) The feature environment is dynamic. The feature space changes over time. (3) The semantic relationships between labels are complicated. Categories are interconnected and have hierarchical relationships such as parent-child and sibling, which makes it difficult for traditional techniques to handle hierarchical data. To address these problems, this thesis investigates online feature selection models for hierarchically categorized data in a streaming feature environment. The main research work is as follows. (1) A single-granularity hierarchical feature selection algorithm for the streaming feature environment. We use a sibling strategy and an exclusion strategy to divide the leaf-node categories into similar and dissimilar ones, and combine this division with the hierarchical tree structure of the categories to propose a feature weight calculation method for hierarchical classification problems. To handle the changing feature environment, the magnitude of the weights is used to measure each feature's ability to partition the decision space and thus to select important features, and the covariance between features is used to assess the independence of the feature set. On this basis, we design a hierarchical feature selection framework for the streaming feature environment. Experiments show that the proposed algorithm adapts to hierarchical data more effectively than traditional streaming feature selection algorithms.
(2) A multi-granularity hierarchical feature selection algorithm for the streaming feature environment. Work (1) considers sibling relationships among leaf nodes, but other hierarchical relationships also exist among the internal nodes of the hierarchy. We therefore define an affinity relationship through the sets of sibling nodes and common ancestor nodes, and redefine the feature weight calculation method accordingly. To handle features that arrive over time, a different feature subset is selected for each node of the hierarchy by checking whether its parent and child nodes retain the incoming online features, and the covariance between each new feature and the existing candidate set is calculated to remove redundant and irrelevant features. Experiments show that, compared with the single-granularity hierarchical streaming feature algorithm, the proposed algorithm is better able to form a dynamic model with high execution efficiency and high partitioning capability. |
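
The general shape of such a streaming selection loop can be illustrated with a minimal sketch. It is not the thesis's algorithm: the Relief-style weight, the use of absolute Pearson correlation (covariance scaled to [0, 1]) as the redundancy measure, the thresholds `min_weight` and `max_corr`, the `PARENT` hierarchy, and all function names are assumptions introduced here for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical class hierarchy (an assumption): leaf label -> parent label.
PARENT = {"cat": "animal", "dog": "animal", "car": "vehicle", "bus": "vehicle"}


def siblings(label):
    """Leaf labels that share a parent with `label` (sibling strategy)."""
    return {c for c, p in PARENT.items() if p == PARENT[label] and c != label}


def feature_weight(values, labels):
    """Relief-style stand-in for the feature weight: distance to the nearest
    sibling-class sample minus distance to the nearest same-class sample."""
    diffs = []
    for i, (v, y) in enumerate(zip(values, labels)):
        same = [abs(v - u) for j, (u, z) in enumerate(zip(values, labels))
                if j != i and z == y]
        sib = [abs(v - u) for u, z in zip(values, labels) if z in siblings(y)]
        if same and sib:
            diffs.append(min(sib) - min(same))
    return mean(diffs) if diffs else 0.0


def abs_correlation(a, b):
    """Absolute Pearson correlation between two feature columns."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    cov = mean([(x - ma) * (y - mb) for x, y in zip(a, b)])
    return abs(cov / (sa * sb))


def select_streaming(stream, labels, min_weight=0.1, max_corr=0.9):
    """Process features one at a time; keep relevant, non-redundant ones."""
    selected = {}
    for name, values in stream:
        if feature_weight(values, labels) <= min_weight:
            continue  # irrelevant: weak at separating sibling classes
        if any(abs_correlation(values, kept) >= max_corr
               for kept in selected.values()):
            continue  # redundant: nearly collinear with a kept feature
        selected[name] = values
    return list(selected)


labels = ["cat", "cat", "dog", "dog", "car", "car", "bus", "bus"]
f1 = [0.0, 0.1, 1.0, 1.1, 0.0, 0.1, 1.0, 1.1]
stream = [
    ("f1", f1),                                         # informative
    ("f2", [2 * v for v in f1]),                        # rescaled copy of f1
    ("f3", [0.5] * 8),                                  # constant
    ("f4", [0.0, 0.1, 1.0, 1.1, 1.0, 1.1, 0.0, 0.1]),   # informative
]
print(select_streaming(stream, labels))
```

In this toy run, `f2` is rejected as redundant with `f1`, `f3` is rejected because its weight is zero, and `f1` and `f4` are kept. Each arriving feature is evaluated once against the current selected set, so the feature space never needs to be known in advance.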