Font Size: a A A

Research On Self-supervised Learning Model And Its Application

Posted on:2023-03-20Degree:MasterType:Thesis
Country:ChinaCandidate:K LiFull Text:PDF
GTID:2568306794954999Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Supervised deep neural network models have performed well in many machine learning tasks,such as image classification and segmentation in computer vision,pre-trained language models in natural language processing,question answering systems,and sentiment analysis.However,when processing complex data,the problem of too large data dimension is often encountered.Although some existing deep models can perform feature extraction on complex image or text data to obtain low-dimensional representation,they cannot ensure its extracted low-dimensional features are semantic information useful for the downstream task it is to solve.In addition to the dimension problem,traditional supervised deep learning methods also face the problem of insufficient labeled data.Some real-world data is expensive to label.For example,medical image data needs to be labelled by professionals with rich clinical experience,and its expensive labeling cost limits the construction of models for medical image analysis.Self-supervised learning can obtain semantic information related to downstream tasks by constructing auxiliary tasks without manual annotation by additional professionals.It performs well in computer vision,natural language processing and other fields,and is a hot research area in the AI field in recent years.Cluster analysis is a common method for processing unlabeled data,which divides the data naturally by analyzing the inherent distribution structure of the data.However,due to the limitation of the data dimension,when dealing with high-dimensional complex data,it is often unable to divide the data well.Because the data has too many dimensions,it is difficult to classify by the distance used in the clustering in the original data space.The problem of image synthesis is also an important research field,especially the recent hot GAN model has brought great technological innovation to this field.However,existing generative models often need to generate real samples in the target image distribution,and it is difficult to obtain real images in some fields.Considering the above problems,this paper proposes a deep soft clustering self-supervised model without labeling by combining deep autoencoders and traditional soft clustering algorithms.Besides,a synthetic image model based on self-supervised learning is proposed for the cell imaging problem.The main contributions of this paper are as follows:(1)Combining DNN dimensionality reduction and soft clustering: We propose an optimization method based on the combination of autoencoder and soft clustering.We use the encoder part of the autoencoder to reduce the dimension of the data,and perform soft partition clustering on the reduced data.Replacing membership in soft assignments with discrete hard assignments is crucial for optimizing the parameters of the entire network at the same time.Its structure is also very flexible.The autoencoder can be replaced by other network structures,such as deep convolutional neural networks,and we can also use various membership-based algorithms in the soft clustering part,such as FCM and MEC.When hard clustering is combined with a deep neural network,the cluster centers and parameters of the deep neural network cannot be updated synchronously.The reason is that hard assignments are discrete and cannot be updated with gradients.The cluster centers of the clusters are set as variables so that they are no longer updated through iterations,but are updated along with the parameters of the deep neural network.The method is simple to implement and has high scalability.(2)Aiming at the problem of cell signal synthesis image,we propose an image synthesis method based on self-supervised learning,which can use one-dimensional signals to synthesize two-dimensional images.And no ground-truth images are needed during training.It solves the problem that traditional generative models require data sampled from the real distribution of the generated images.We set up a self-supervised task for image generation using traditional image modulation methods,enabling the network to learn semantic feature representations for signal-to-image mappings.We apply the proposed method to the field of cell image synthesis.The experimental results show that our method can not only restore cell images with more details and less noise from one-dimensional signals,but also are less time-consuming than traditional iterative image reconstruction algorithm.
Keywords/Search Tags:Self-supervised Learning, Clustering, Image Synthesis, Autoencoders
PDF Full Text Request
Related items