Research On Machine-Learning-Based Cell Microscopic Image Analysis

Posted on: 2019-09-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Shao
Full Text: PDF
GTID: 1360330590466695
Subject: Software engineering
Abstract/Summary:
The cell is the basic structural and functional unit of the human body, and optical microscopy is an important tool for analyzing cells. With the rapid development of optical imaging technology, many kinds of microscopy equipment can now collect cell images in a high-throughput way. Faced with this large volume of cell imaging data, how to analyze it more effectively has become a research hotspot. Machine learning methods, regarded as powerful tools for data-driven association studies, can build mathematical models from existing cell images to help biologists analyze the morphology and function of cells and the distribution of their inner components (e.g., proteins). This can in turn help uncover disease mechanisms and strengthen biologists' understanding of life information. Generally, microscopic image analysis can be divided into three steps: image preprocessing, image pattern analysis, and clinical outcome prediction. Using machine-learning-driven methods, this dissertation uses microscopic cell images to address three problems within these steps: (1) neuron cell segmentation, (2) determining the subcellular localization of proteins, and (3) the prognosis of early-stage cancer. The main contributions are summarized as follows.

(1) We propose a new active-learning-based neuron segmentation method that selects only the most valuable super-pixels from un-annotated neuron images for labelling, and thus reduces the annotation effort. Specifically, we first apply the simple linear iterative clustering (SLIC) algorithm to aggregate the pixels in neuron images into super-pixels. We then propose a novel query strategy that selects the most representative and informative super-pixels from the un-annotated super-pixel set for pixel-level annotation. Finally, based on the annotated pixels, we build a manifold-regularized Gaussian mixture model (LapGMM) to accomplish the neuron segmentation task. Experimental results on both 2D and 3D neuron image datasets demonstrate that the proposed method achieves segmentation performance comparable to the state-of-the-art method while saving experts 40 percent of the annotation effort.

(2) We propose a novel protein subcellular localization algorithm driven by error-correcting output codes (ECOC) that introduces prior hierarchical information among different cellular compartments. Specifically, we first define the ECOC codeword matrix according to the hierarchical structure of the cellular compartments, which is defined by their functional similarity or spatial distribution. Then, according to the pre-defined codeword matrix, we transform the subcellular localization problem (a multi-class classification problem) into a series of binary classification problems. Next, we solve these binary classification problems with a multi-kernel support vector machine (SVM) that combines different types of image features. We compare the proposed algorithm with other state-of-the-art methods on the Human Protein Atlas dataset, and the experimental results demonstrate the effectiveness of our method.

For predicting multi-label protein subcellular localizations, we make use of the important structural correlation among different cellular compartments and propose an organelle structural correlation regularized feature selection method. Specifically, our method treats the prediction of each cellular compartment as a separate task and uses a group-sparsity term to ensure that features that are jointly important to different tasks can be identified together. In addition, considering the intrinsic relatedness among different cellular compartments, we introduce a Laplacian regularization term, induced by the prior hierarchical structure among the compartments, that helps select more discriminative features. We evaluate our method on the Human Protein Atlas dataset, and the results show that the features it identifies better predict image-based multi-label protein subcellular localizations.

(3) We propose an ordinal sparse canonical correlation analysis algorithm that simultaneously selects histopathological image features and genomic features for the prognosis of early-stage cancers. Specifically, we build our framework on sparse canonical correlation analysis to ensure that the projections of the imaging and genomic data are correlated. In addition, since the survival times of different patients are ordinal, and in order to preserve this ordinal information in the projected space, we add inequality constraints that require the average projection of long-survival patients to be larger than that of short-survival patients. Experimental results on several early-stage cancer datasets derived from The Cancer Genome Atlas (TCGA) demonstrate that the selected features correlate strongly with survival, which improves prognosis prediction for early-stage cancer patients.
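
As an illustration of contribution (2), the sketch below shows one way a codeword matrix could be derived from a compartment hierarchy and then used for decoding. It is a minimal sketch under stated assumptions, not the dissertation's implementation: the hierarchy `HIERARCHY`, the compartment groupings, the ternary (+1 / -1 / 0) coding, and the Hamming-style decoding are all illustrative choices, and the binary classifier outputs are passed in directly rather than produced by a multi-kernel SVM.

```python
# Hypothetical two-level compartment hierarchy (illustrative only,
# not taken from the dissertation). Each internal node has two children.
HIERARCHY = {
    "root": ["nuclear", "cytoplasmic"],
    "nuclear": ["nucleus", "nucleoli"],
    "cytoplasmic": ["mitochondria", "golgi"],
}


def leaves(node):
    """Return the leaf classes found under a node of the hierarchy."""
    children = HIERARCHY.get(node)
    if children is None:
        return [node]
    found = []
    for child in children:
        found.extend(leaves(child))
    return found


def build_codewords(root="root"):
    """Build one binary problem (column) per internal node.

    A class gets +1 if it lies under the node's first child, -1 if it lies
    under the second child, and 0 if the node does not concern it.
    """
    classes = leaves(root)
    # Breadth-first order over internal nodes fixes the column order.
    order, queue = [], [root]
    while queue:
        node = queue.pop(0)
        if node in HIERARCHY:
            order.append(node)
            queue.extend(HIERARCHY[node])
    codewords = {c: [] for c in classes}
    for node in order:
        first, second = HIERARCHY[node]
        pos, neg = set(leaves(first)), set(leaves(second))
        for c in classes:
            codewords[c].append(1 if c in pos else -1 if c in neg else 0)
    return order, codewords


def decode(outputs, codewords):
    """Pick the class whose codeword disagrees least with the binary outputs.

    Positions where the codeword is 0 are ignored. Ties go to the class
    listed first.
    """
    def disagreements(code):
        return sum(1 for o, b in zip(outputs, code) if b != 0 and o != b)
    return min(codewords, key=lambda c: disagreements(codewords[c]))


if __name__ == "__main__":
    order, codewords = build_codewords()
    print(order)
    for name, code in codewords.items():
        print(name, code)
    # Outputs of the three binary problems for one image (assumed values).
    print(decode([1, 1, -1], codewords))
```

In this sketch each column corresponds to one split in the hierarchy, so compartments that are close in the hierarchy share more codeword bits than distant ones. In practice, each column would be trained as its own binary classifier before decoding.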
Keywords/Search Tags:cell microscopic image, machine learning, feature selection, active learning, subcellular location, neuron segmentation, prognostic prediction