
Combinatorial enhancements of logical analysis of data: Applications to genomics, proteomics and biomaterial design

Posted on: 2005-05-03
Degree: Ph.D.
Type: Dissertation
University: Rutgers The State University of New Jersey - New Brunswick
Candidate: Alexe, Gabriela
Full Text: PDF
GTID: 1450390008499987
Subject: Operations Research
Abstract/Summary:
Extracting knowledge from data has received considerable attention in recent years from scientists and practitioners, and has strongly motivated contributions from computer scientists and mathematicians.

Among data mining methods, Logical Analysis of Data (LAD) has proved promising for data analysis. Numerous computational studies have shown that LAD compares favorably with other machine learning methods. Initially designed for the analysis of binary datasets, LAD was further developed, allowing its application to areas including economics, industry and medicine. Using concepts and techniques from combinatorics, optimization, and Boolean logic, LAD extracts from datasets of past observations large collections of patterns characterizing the positive or negative character of the observations.

The goal of this study was to design efficient algorithms to enhance the LAD method, to find new applications of patterns besides classification, and to add new modules for feature selection, making LAD capable of handling large datasets such as those occurring in genomics and proteomics.

This study presents two new combinatorial algorithms, both running in total polynomial time: one for the generation of all maximal bicliques of a graph, and one for the generation of all spanned patterns in a dataset. Extensive computational experiments provide evidence for their efficiency and usefulness. For example, spanned patterns can be used for classification, as well as for providing new information about the dataset, e.g., the importance of attributes and new class discovery.

This study also introduces a pattern-based method for feature selection for genomic and proteomic datasets, which identifies collective biomarkers capable of distinguishing different classes of patients (e.g., poor prognosis vs. good prognosis breast cancer cases, ovarian cancer cases vs. controls). The novelty of this method consists of the use of patterns to detect combinatorial biomarkers. These combinatorial biomarkers are shown to provide additional information compared to that provided by individual ones.

This study concludes with three case studies presenting the first application of the enhanced LAD method in genomics, proteomics and biomaterial design. The findings of these case studies provide valuable information for field experts, and some of them are currently being implemented in biomedical research laboratories.
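To make the notion of a spanned pattern concrete, the following is a minimal sketch on a toy binary dataset. It assumes the usual LAD definitions, which the abstract does not spell out: a positive pattern is a conjunction of literals that covers at least one positive observation and no negative one, and the span of a set of observations is the conjunction of all literals on which they agree. The function names and the toy data are hypothetical, and the brute-force enumeration over subsets is exponential; it is not the total polynomial time algorithm developed in the dissertation.

```python
from itertools import combinations


def span(observations):
    """Conjunction of literals shared by all observations, as {attribute_index: value}."""
    first = observations[0]
    return {i: v for i, v in enumerate(first) if all(o[i] == v for o in observations)}


def covers(term, observation):
    """True if the observation satisfies every literal of the term."""
    return all(observation[i] == v for i, v in term.items())


def spanned_positive_patterns(positives, negatives):
    """Brute force (illustration only): span every non-empty subset of positive
    observations and keep the non-empty spans that cover no negative observation."""
    patterns = set()
    for k in range(1, len(positives) + 1):
        for subset in combinations(positives, k):
            term = span(subset)
            if term and not any(covers(term, n) for n in negatives):
                patterns.add(frozenset(term.items()))
    return patterns


# Hypothetical toy data: three binary attributes per observation.
positives = [(1, 1, 0), (1, 0, 0)]
negatives = [(0, 1, 0), (0, 0, 1)]
patterns = spanned_positive_patterns(positives, negatives)
```

In this toy example the span of both positive observations is the conjunction "attribute 0 equals 1 and attribute 2 equals 0", which no negative observation satisfies, so it is kept as a pattern alongside the two single-observation spans.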
Keywords/Search Tags: Data, Provide, Combinatorial, Proteomics, Genomics