| As a branch of machine learning, multi-label learning deals with problems in which a single input sample is associated with multiple output labels (semantic topics). Objects with multiple attributes are very common in the real world: a picture may cover several topics at once; after entering the cell, a protein molecule often localizes to several specific subcellular organelles; a gene tends to have a variety of functions at the same time; and a patient may suffer from a variety of complications. To solve such problems, researchers have proposed many multi-label learning algorithms. Ensemble learning and deep learning are among the most popular machine learning methods of recent years, and this thesis combines them with multi-label learning to design novel multi-label learning algorithms and applies them to mining biomedical data. The thesis first surveys the theory, methods, and applications of multi-label learning broadly and in depth, and then presents a series of new algorithms and applications. The main contributions are as follows: 1. A hybrid multi-label learning algorithm is proposed. Effective ensemble learning requires base learners that are “good but different”. Based on this idea, this thesis designs a new hybrid multi-label learner composed of two heterogeneous base learners. One is feature-driven, while the other is neighbor-driven; one uses the global information of all the data, while the other uses only the local information of the neighbors; one is a linear algorithm, while the other is a nonlinear algorithm in which label correlation is also considered. The two methods complement each other and work well as a whole. Experiments on several multi-label datasets show that the hybrid model outperforms either base learner alone and has a significant advantage over existing methods. 2. A novel deep multi-label learning method based on the ReLU activation function is proposed. The 
loss functions used in single-label learning, such as cross-entropy or hinge loss, are changed into multi-label loss functions to handle multiple outputs. The iterative optimization procedure for the proposed method is derived, and the relationship between the multi-label loss and the log loss is also studied in this thesis. Experiments on several biomedical multi-label datasets show that the proposed method is superior to traditional methods, including the best known ensemble methods. 3. Multi-label learning for the prediction of antimicrobial peptide activities. Antimicrobial peptides are small biological molecules with broad-spectrum antimicrobial activity that show potential as novel therapeutic agents. This thesis predicts antimicrobial peptide activities with machine learning, which is a typical multi-label learning problem because any natural antimicrobial peptide may have multiple activities. A new dataset of antimicrobial peptide activities is built from the latest antimicrobial peptide database, and several feature extraction and multi-label learning methods are compared. Detailed experimental comparison shows that the best performance is obtained by the proposed multi-label learning algorithm together with amino acid composition and dipeptide composition features. 4. Multi-label learning for the prediction of chronic diseases. Patients with chronic diseases tend to have a variety of complications, so multi-label learning can be used to model them. Experiments on a chronic disease dataset of 19733 patients and 10 kinds of chronic diseases drawn from the MIMIC-II database show that the proposed deep multi-label learning method is significantly better than fourteen traditional algorithms. A possible reason is that this dataset is relatively large, so deep learning can fit it well. Moreover, the proposed method is also very competitive in running time, which is mainly due to the 
efficiency of the multi-label loss function and the ReLU activation function. |
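
The abstract names amino acid composition and dipeptide composition as the best-performing peptide features but does not say how they are computed. The sketch below uses their standard definitions (residue frequencies over the sequence, and adjacent-pair frequencies over the `length - 1` overlapping pairs) and is not necessarily the exact procedure used in the thesis. The function names `amino_acid_composition`, `dipeptide_composition`, and `peptide_features` are hypothetical, as is the choice to concatenate the two vectors into one 420-dimensional input.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so feature indices are stable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 values: count of each standard residue / sequence length."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Non-standard letters are ignored in the counts but still count toward
    # the length, so the fractions may then sum to less than 1.
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: count of each adjacent pair / (sequence length - 1)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]  # overlapping pairs
    counts = Counter(pairs)
    return [counts[dp] / len(pairs) for dp in DIPEPTIDES]


def peptide_features(seq):
    """Concatenate both compositions into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A vector built this way has a fixed length regardless of peptide length, which is what lets peptides of different sizes share one multi-label classifier input.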