| Hepatitis B virus (HBV) infection and its sequelae are now recognized as serious global health problems. Worldwide, over 350 million individuals are chronically infected with HBV, and 15-25% of them are at risk of developing and dying from HBV-related chronic liver disease, including cirrhosis and hepatocellular carcinoma. In China, the situation is even more serious. The infection rate is almost 57% throughout the country; more than 700 million individuals have been infected, and there are 130 million HBV carriers. At present, there are 23 million chronic hepatitis B patients in China and 230 thousand deaths every year. Each year, the Chinese government has to spend over 50 billion RMB on prevention and treatment. Unfortunately, there is still no effective means of eliminating this problem, which has become both a health and a national economic disaster. In view of this situation, it is critically important to find a new method to diagnose and classify hepatitis B accurately, and new biomarkers that can be used to detect hepatitis B and its associated diseases. Although many tests are currently available for diagnosing acute hepatitis B, chronic hepatitis B and their associated diseases, their diagnostic efficiency is poor. The purpose of this study is to find a new approach, a new method, and new biomarkers that can be used to diagnose hepatitis B and its associated diseases in clinical practice. Part One: Application of the Tree Structure Model in diagnosing chronic hepatitis B and associated diseases. Objective: Data mining based on the tree structure model is a procedure for extracting concealed, unknown, and potentially useful information and knowledge from massive, incomplete, noisy and fuzzy data. 
Because hepatitis B and its associated diseases, such as liver cirrhosis (LC) and hepatocellular carcinoma (HCC), have a complicated, changeable and prolonged pathogenesis, each patient accumulates a large volume of clinical test results and multi-dimensional data. Although these results contain much valuable information about the disease, doctors seldom use it in clinical diagnosis. As a consequence, valuable information about the disease is lost, incorrect diagnoses are made, and patients are not treated properly. In this study, we constructed tree structure models and tried to mine potential information about the disease from patients' clinical data. Materials: Four algorithms were available for constructing tree structure models: CRT (Classification and Regression Tree), QUEST (Quick, Unbiased, Efficient Statistical Tree), CHAID (Chi-squared Automatic Interaction Detection) and Exhaustive CHAID. In this study, we selected two of these algorithms (CHAID and CRT) as representatives to construct tree models for classifying CHB, LC and HCC. The tree structure models were evaluated by risk and accuracy analysis, and their practicability was validated by Principal Component Analysis (PCA). Result: The two tree structure models built with the CHAID and CRT algorithms to distinguish severe CHB from mild and moderate CHB yielded correct classification rates of 77.5% and 78.4% respectively; the risk estimate was 0.353 for both. 
The models' accuracy was evaluated by ROC (Receiver Operating Characteristic) analysis; the areas under the ROC curve were 0.787 (95% CI: 0.695-0.879) and 0.802 (95% CI: 0.714-0.890) respectively. For the two tree structure models built with the CHAID and CRT algorithms to distinguish LC and HCC from CHB, the correct classification rates were 71.4% and 74.2%, the risk estimates were 0.451 and 0.352, and the areas under the ROC curve were 0.785 (95% CI: 0.692-0.878) and 0.807 (95% CI: 0.720-0.894) respectively. The models' practicability was validated with Principal Component Analysis (PCA). Taking chronic hepatitis B as an example, PCA defined five principal components. The first principal component (AST, ALT) was the same as the first predictive variable in the tree structure model. Conclusion: The tree structure model is a promising method for data mining and data re-utilization. Part Two: Standardized approach to proteome profiling of human serum based on magnetic bead separation and MALDI-TOF MS. Objective: Magnetic bead purification of serum proteins facilitates the identification of potential new biomarkers by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). The aim of our study was to establish a proteome fractionation technique and to validate a standardized blood sampling, processing, and storage procedure for proteomic pattern analysis. Materials: Protein spectra of serum samples were acquired by MALDI-TOF MS after the samples were purified with IMAC-Cu magnetic beads. To evaluate the reproducibility of MALDI-TOF MS at different times, nine proteins in different mass ranges were selected and their CV% was calculated. The influence of freeze-thaw cycles and of the sample-to-matrix ratio was evaluated by comparing the resulting mass spectra. Other factors that might disturb the test results were also optimized. 
Before large-scale testing of patient sera, two kinds of magnetic beads (WCX and IMAC-Cu) were compared in terms of the peak number, peak area and peak intensity of their mass spectra. The better one was selected for the subsequent large-scale testing. During the process, flexControlMS3.0, ClinProToolsTM2.1 and flexAnalysis3.0 software were used for instrument control, data analysis and mass spectrum collection respectively. Results: Sample crystallization was best when the sample-to-matrix ratio was kept at 1:5 (µL) and the laboratory temperature and humidity were kept at 20-25℃ and 10%-30% respectively. Additional freeze-thaw cycles had an increasing influence on the mass spectrum, especially for proteins in the low mass range, so samples should undergo no more than 3 cycles to obtain good results. The reproducibility of the MALDI-TOF test in this study was satisfactory, with CV% between 1.26% and 30.79%. In the comparison, WCX magnetic beads performed better than IMAC-Cu magnetic beads. Conclusion: Because MALDI-TOF MS is a high-tech method for proteomic analysis, quality control of the operating sequence throughout the whole procedure is very important. WCX performed better than IMAC-Cu magnetic beads in protein purification and should be selected for the subsequent large-scale testing. Part Three: Comparison of the serum proteomics of chronic hepatitis B and associated diseases. Objective: To search for biomarkers and evaluate their diagnostic value by comparing the serum proteins of different patients, with the aim of providing information for clinical use. Materials: According to the experimental design, serum was collected from 14 patients with acute hepatitis B (AHB), 76 with chronic hepatitis B (CHB), 41 with liver cirrhosis (LC) and 14 with hepatocellular carcinoma (HCC). After separation and purification with WCX magnetic beads, their mass spectra were acquired by MALDI-TOF MS. During the process, flexControlMS3.0, ClinProToolsTM2.1 and flexAnalysis3.0 software were used for instrument control, data analysis and mass spectrum collection respectively. 
The serum protein profiles of the different patient groups were compared to search for differentially expressed proteins. In this study, we selected only the ten most significantly different proteins in each comparison and calculated their efficiency for classification or diagnosis. Result: Comparison between AHB and CHB revealed 49 differentially expressed proteins, of which the 4154Da and 4210Da proteins were the best for diagnosing AHB. When the two were combined to diagnose AHB, the sensitivity, specificity and correct classification rate in validation were 100% (14/14), 84.21% (64/76) and 86.67% (78/90) respectively. In a follow-up study of four AHB patients, the serum expression of the 2952Da, 4210Da, 5337Da and 5904Da proteins differed markedly according to patient outcome. Because there was no difference in serum proteins between the mild and moderate types of CHB, the two were combined into one group and compared with the severe type. The 1866Da protein was found to be the best for diagnosing severe CHB; its sensitivity, specificity and correct classification rate in validation were 100%, 88.89% and 93.42% respectively. Comparison of the CHB and HCC groups revealed 9 differentially expressed proteins, of which the 11243Da and 4210Da proteins were the best for diagnosing HCC. When the two were combined to diagnose HCC, the sensitivity, specificity and correct classification rate in validation were 92.86% (13/14), 77.63% (59/76) and 80% (72/90) respectively. Only one protein differed between the HCC and LC groups. Conclusion: The serum protein profiles of AHB and CHB patients differ significantly. The 4154Da and 4210Da proteins can be combined as a biomarker for diagnosing AHB and should be studied further. In addition, the 2952Da, 4210Da, 5337Da and 5904Da proteins also need further study. The 1866Da protein can be used as a biomarker for identifying the severe type among CHB patients. 
The 11243Da and 4210Da proteins are valuable for distinguishing HCC from CHB, where they show significantly more diagnostic value than in the comparison with LC. Part Four: Identification of the distinguished proteins. Objective: To identify the 4210Da and 1866Da proteins and analyze their mechanism of action in disease development after HBV infection. Materials: Liquid chromatography tandem mass spectrometry (LC-MS/MS) was used for protein identification. The data were analyzed with BioworksBrowser3.3.1SP1 and DataAnalysis3.4 and searched against the IPI Human (3.45) and Mascot databases. Result: Based on the results of Part Three, the 4154Da, 4210Da, 2952Da, 5337Da, 5904Da and 1866Da proteins were valuable for diagnosing HBV infection and its associated diseases; among them, the 4210Da and 1866Da proteins were the most important. The 1866Da and 4210Da proteins were identified as "DRC3f" and "Eukaryotic peptide chain release factor GTP-binding subunit ERF", respectively. Conclusion: The 1866Da protein was identified as "DRC3f", a derivative of complement C3. The expression of C3f or DRC3f in serum may reflect some mechanism of the liver inflammatory reaction. The 4210Da protein was identified as "Eukaryotic peptide chain release factor GTP-binding subunit ERF", a portion of eRF3b (36 peptides). The expression of the 4210Da protein in serum may also reflect some mechanism of liver infection. The complement system is a very important part of human immunity, and ERF plays an important role in protein synthesis. Although the mechanisms by which the two proteins act in the liver inflammatory reaction are still unknown, both show significant potential value as biomarkers in clinical diagnosis. |
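
The sensitivity, specificity and correct classification rates reported in Part Three can be recomputed from the counts given in parentheses there. The following is a minimal sketch, assuming that each numerator is the number of correctly classified patients in the corresponding group; the names `GroupCounts` and `diagnostic_rates` are hypothetical and do not come from the study or from any software it used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCounts:
    """Validation counts for a two-group comparison (hypothetical container)."""
    true_pos: int      # target-disease patients classified correctly
    diseased: int      # all patients with the target disease
    true_neg: int      # comparison-group patients classified correctly
    non_diseased: int  # all patients in the comparison group


def diagnostic_rates(c: GroupCounts) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, correct rate) as percentages."""
    sensitivity = 100 * c.true_pos / c.diseased
    specificity = 100 * c.true_neg / c.non_diseased
    correct = 100 * (c.true_pos + c.true_neg) / (c.diseased + c.non_diseased)
    return round(sensitivity, 2), round(specificity, 2), round(correct, 2)


# Counts as reported in Part Three (assumed to be validation counts).
ahb_vs_chb = GroupCounts(true_pos=14, diseased=14, true_neg=64, non_diseased=76)
hcc_vs_chb = GroupCounts(true_pos=13, diseased=14, true_neg=59, non_diseased=76)

print(diagnostic_rates(ahb_vs_chb))  # (100.0, 84.21, 86.67)
print(diagnostic_rates(hcc_vs_chb))  # (92.86, 77.63, 80.0)
```

Both comparisons reproduce the reported percentages, which suggests the correct classification rate was computed over all 90 patients in each two-group comparison.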