
Enhancing Attribute Reduction Searching Efficiency And Generalization Performance With Random Partitioning

Posted on: 2024-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z Chen
Full Text: PDF
GTID: 2568307157452834
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
With the accelerating pace of global informatization, we face the challenge of massive, high-dimensional, complex, and heterogeneous data. To cope with this, attribute reduction is widely applied during data preprocessing in fields such as pattern recognition and data mining. It searches for an attribute subset with minimal redundancy and correlation under given constraints, thereby addressing the attribute selection problem for large-scale, high-dimensional data. Attribute reduction not only yields a more concise and effective data representation, but also helps uncover latent patterns and knowledge in the data, improving both the accuracy and the efficiency of subsequent analysis. As datasets grow ever more complex, it has become an indispensable tool for big data processing and intelligent analysis. Nevertheless, attribute reduction still faces challenges, which can be broadly grouped into two categories: the efficiency and the effectiveness of reduction. To address the problems derived from these two categories, this research systematically investigates random partitioning techniques. On the one hand, random partitioning is applied at the sample level, splitting large-scale data into more manageable subsets, and a guided-learning mechanism is introduced to progressively simplify the reduct-derivation process. On the other hand, random partitioning is applied at the attribute level, where deliberate perturbations are injected to generate more diverse reducts and thereby improve the effectiveness of attribute reduction. Specifically, the research content and main contributions of this thesis cover the following two points.

Firstly, random partitioning techniques are used to substantially improve the efficiency of solving attribute reduction problems. Previous work on accelerating attribute reduction, especially from the sample perspective, has often overlooked an underlying issue: attribute reduction algorithms depend heavily on the sample distribution. This reliance can lead to overfitting, underfitting, and high time and computational costs on large-scale datasets, all of which limit the applicability of feature selection algorithms in certain situations. To address this problem, this thesis proposes a sample-level random partitioning algorithm framework. Within this framework, deriving reducts no longer requires covering the sample information of the entire universe, which reduces the dependence on the sample distribution to a certain extent without sacrificing the ability to learn from distinctive data. Furthermore, the introduction of guided learning further improves the efficiency of deriving reducts: the reduct obtained on earlier partitions guides the search on later ones, as sketched below.
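The abstract gives no implementation details, so the following is only a minimal Python sketch of how sample-level random partitioning with guided learning could look. The helper names (dependency, greedy_reduct, guided_sample_partition_reduct) and the toy rough-set dependency measure are illustrative assumptions, not the thesis's actual algorithms.

```python
import numpy as np

def dependency(X, y, attrs):
    """Toy rough-set-style dependency: fraction of samples lying in
    equivalence classes (induced by `attrs`) that are pure in the decision."""
    if not attrs:
        return 0.0
    buckets = {}
    for row, label in zip(X, y):
        buckets.setdefault(tuple(row[list(attrs)]), []).append(label)
    pure = sum(len(v) for v in buckets.values() if len(set(v)) == 1)
    return pure / len(y)

def greedy_reduct(X, y, seed=()):
    """Greedy forward search that starts from the attributes in `seed`
    and keeps adding the attribute with the largest dependency gain."""
    reduct, best = list(seed), dependency(X, y, list(seed))
    while True:
        candidates = [a for a in range(X.shape[1]) if a not in reduct]
        if not candidates:
            return reduct
        gain, a = max((dependency(X, y, reduct + [a]), a) for a in candidates)
        if gain <= best + 1e-12:
            return reduct
        reduct.append(a)
        best = gain

def guided_sample_partition_reduct(X, y, n_blocks=4, rng=None):
    """Randomly partition the samples into blocks; the reduct found on the
    blocks processed so far seeds (guides) the search on the next block,
    so later blocks only refine instead of searching from scratch."""
    rng = np.random.default_rng(rng)
    blocks = np.array_split(rng.permutation(len(y)), n_blocks)
    reduct = []
    for block in blocks:
        reduct = greedy_reduct(X[block], y[block], seed=reduct)
    return sorted(reduct)
```

On a discrete decision table X with labels y, a call such as guided_sample_partition_reduct(X, y, n_blocks=4, rng=0) returns one candidate reduct; the point of the sketch is only that no single search has to cover the whole universe of samples.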
Secondly, random partitioning techniques are used to enhance the effectiveness of attribute reduction. Previous work on improving effectiveness has often been hampered by two issues: a single constraint and identical search starting points. The former narrows the perspective from which data information is mined, while the latter frequently causes search strategies to become trapped in local optima. To alleviate these two problems, this thesis introduces an attribute-level random partitioning algorithm framework. In this framework, controlled attribute perturbations are deliberately added so that multiple reducts are generated from different search starting points; these reducts are then evaluated jointly through a combination strategy, which resolves the issue of identical search starting points to a certain extent. Moreover, because the framework synthesizes and evaluates reducts obtained from multiple starting points, it also mitigates the single-constraint problem to a certain extent. A simple sketch of this idea is given below.
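Again purely as an illustration, and reusing the hypothetical dependency and greedy_reduct helpers sketched above rather than the thesis's actual method, the attribute-level idea can be approximated by seeding each greedy search with a different randomly perturbed attribute subset and then combining the resulting reducts.

```python
def perturbed_reducts(X, y, n_runs=10, start_size=2, rng=None):
    """Generate diverse reducts by perturbing the search starting point:
    each run seeds the greedy search with a different random attribute subset."""
    rng = np.random.default_rng(rng)
    n_attr = X.shape[1]
    results = []
    for _ in range(n_runs):
        start = list(rng.choice(n_attr, size=min(start_size, n_attr), replace=False))
        results.append(tuple(sorted(greedy_reduct(X, y, seed=start))))
    return results

def combine_reducts(reducts, X, y):
    """Jointly evaluate the candidate reducts: keep the one with the highest
    dependency, breaking ties in favour of the smaller attribute set."""
    return min(set(reducts), key=lambda r: (-dependency(X, y, list(r)), len(r)))
```

Here the combination step simply selects a single best candidate; in practice a framework of this kind could also aggregate the candidates (for example, by attribute-frequency voting), which is one way the single-constraint issue can be relaxed.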
Keywords/Search Tags: Attribute reduction, Rough set, Random mechanism, Generalization performance, Random partitioning techniques