| Viral hepatitis B is a worldwide disease that severely threatens human health, and hepatitis B virus (HBV) has infected more than 86,007,000 people in China. Infected patients are difficult to cure completely and need to take drugs for life, so more effective antiviral drugs or therapeutic vaccines are urgently needed. As an important component of HBV, hepatitis B virus surface antigen (HBsAg) is not only an important clinical marker but also a major component of the hepatitis B vaccine. In-depth investigation of the working mechanism of HBsAg requires a high-resolution 3D structure, which has not yet been determined because of the structural heterogeneity of HBsAg. Single-particle cryo-electron microscopy (single-particle cryo-EM) can reconstruct molecular structures at high resolution and also has great advantages in sample preparation and in the range of molecular weights it can analyze, so it is an important tool for HBsAg structure determination. Single-particle analysis (SPA) includes particle picking, 2D clustering and 3D reconstruction. Of these, 2D clustering generates class averages with a high signal-to-noise ratio (SNR), which facilitates the observation of molecules and reveals morphological features, symmetry and other key structural information that may guide the 3D reconstruction. This thesis analyzed the difficulties in the structure determination of HBsAg and focused on the research, establishment and preliminary application of clustering algorithms. Details are as follows: (1) Establishment of a cryo-EM image dataset of HBsAg and preliminary study of HBsAg image clustering with the main clustering algorithms. Frozen samples of HBsAg were prepared, and a dataset of 67,797 HBsAg cryo-EM images was constructed through electron microscope data collection, micrograph preprocessing and particle picking; the HBsAg dataset was then clustered with four main clustering algorithms to gain a preliminary understanding of the morphological characteristics of HBsAg and to
verify the reliability of the dataset. Analysis of the clustering results revealed several problems, such as large intra-class errors, an unbalanced distribution of particles between classes (the first 10 classes accounted for more than 94% of the particles), and poor quality of the generated average images. (2) Based on the analysis of the aforementioned clustering results and on morphological analysis, the HBsAg Ring Feature (H-RF), Dynamic Denoising Variational Autoencoder (DDVAE) and Balanced K-Means (BK-Means) algorithms were proposed. On the basis of these methods, a clustering scheme for HBsAg cryo-EM images, H-RDK, was designed. (3) To evaluate the performance of H-RDK objectively, simulated datasets with SNRs of 0.5, 0.1 and 0.05 were generated by projecting known structures, so that the clustering results could be evaluated by accuracy and normalized mutual information. H-RDK performed better on all three datasets than the three commonly used clustering algorithms. (4) Applying H-RDK to the HBsAg dataset yielded denoised images and class averages with higher SNR. The results were quantitatively evaluated by combining image quality with morphological features, and the average values of clarity and spike coverage were better than those of the four main clustering algorithms. Combined with the clustering results of H-RDK, an HBsAg structure with 16.89 Å resolution was reconstructed with the software EMAN2, which offers better potential for further exploration. |
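
The two evaluation measures named in item (3), accuracy and normalized mutual information, can be illustrated with a minimal sketch. This is not the thesis's implementation. The function names are hypothetical, accuracy is assumed to mean the best one-to-one matching between predicted clusters and true classes, and NMI is assumed to be normalized by the geometric mean of the two entropies (one of several common conventions). The abstract does not say which definitions were used.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def clustering_accuracy(true_labels, pred_labels):
    """Accuracy under the best one-to-one mapping of clusters to classes.

    Assumption: brute force over permutations, so this is only practical
    for a handful of classes; it is an illustration, not a production tool.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(pred_labels))
    if len(clusters) > len(classes):
        raise ValueError("sketch assumes no more clusters than true classes")
    # Count how often each (cluster, class) pair occurs.
    pair_counts = Counter(zip(pred_labels, true_labels))
    best = 0
    for assigned in permutations(classes, len(clusters)):
        hits = sum(pair_counts[(c, t)] for c, t in zip(clusters, assigned))
        best = max(best, hits)
    return best / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, pred_labels):
    """NMI = I(T; P) / sqrt(H(T) * H(P)) (assumed normalization)."""
    n = len(true_labels)
    count_t = Counter(true_labels)
    count_p = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    mi = sum(
        (c / n) * log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    h_t, h_p = entropy(true_labels), entropy(pred_labels)
    if h_t == 0 or h_p == 0:
        # Degenerate case: at least one labeling has a single group.
        return 1.0 if h_t == h_p else 0.0
    return mi / sqrt(h_t * h_p)
```

For simulated data, `true_labels` would be the index of the known structure or projection direction that each particle image came from, and `pred_labels` would be the class assigned by the clustering scheme. Both measures ignore how the clusters happen to be numbered, which is why they suit comparisons between different clustering algorithms.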