Theoretical Simulation And Artificial Intelligence Prediction Of Protein Photo-spectroscopy

Posted on: 2022-09-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Ye
GTID: 1481306323981939
Subject: Physical chemistry
Abstract/Summary:
Proteins are the cornerstone of life. The versatile functions of biological systems originate from the various forms and degrees of protein expression. Determining protein structure is the key to understanding protein function, and revealing the dynamic changes of protein structures is crucial for understanding their physical and chemical properties. How to obtain dynamic structural information in order to establish a structure-property relationship has become the "holy grail" of protein structure research. Molecular spectroscopy provides a powerful toolkit for deciphering molecular structure: the chemical structure of a molecular system can be deduced from the response signals of light-matter interaction, measured experimentally and interpreted by quantum chemistry calculations. However, for macromolecules such as proteins, which consist of thousands of atoms, simulating the optical response directly by quantum chemistry calculations is prohibitively expensive. Moreover, the biological functions of proteins depend mainly on the dynamic evolution of their structures, so understanding how they work requires quantum chemistry calculations on thousands of dynamic configurations in a fluctuating environment. The theoretical interpretation of spectroscopic signals and the ensuing assignment of structural details is an iterative, tedious and expensive task. It is therefore pressing to develop a cost-effective approach for simulating the optical properties of proteins to support rapid interpretation of protein spectra. In this dissertation, I present an alternative computational protocol that uses data-driven machine learning techniques in conjunction with density functional theory (DFT) calculations and molecular dynamics (MD) simulations. It addresses the computational chemistry challenges of spectra simulation and structure prediction, and establishes a structure-property relationship between spectral signals and molecular structures. My thesis comprises five chapters, as follows:

In Chapter 1, I outline the development of protein spectroscopy. Structure determination has always been at the heart of protein science, and experimental molecular spectroscopy is widely used to characterize protein structures. The theoretical simulation of protein spectroscopy, however, faces serious computational bottlenecks. Because the structure of a protein in solution reflects the overall effect of the interaction between the solute molecule and its surrounding environment, it is necessary to calculate the local photoelectric response of a large number of nanoscale peptide bonds. Repeating expensive quantum-mechanical calculations in a fluctuating environment poses a great challenge for spectroscopy simulation. Machine learning trained on precise and controllable scientific data has become a powerful tool for physical chemistry and protein research.

In Chapter 2, I introduce density functional theory (DFT) and time-dependent density functional theory (TDDFT). DFT is based on the Hohenberg-Kohn theorems and obtains the properties of many-electron systems by solving the Kohn-Sham equations. With a variety of approximate exchange-correlation functionals, DFT has become the most widely used and practical method in quantum chemistry. TDDFT aims to obtain the properties of excited states by introducing time-dependent perturbation theory into DFT, and has become the workhorse for studying excited-state properties.

In Chapter 3, I introduce molecular dynamics (MD) simulations. As the number of atoms in a system increases, calculations at the quantum mechanical level become increasingly difficult. For bio-macromolecules such as proteins in particular, it is virtually impossible to carry out quantum mechanical simulations of such large systems. MD simulation uses classical force fields to describe the intramolecular and intermolecular interactions and lets the system of interest evolve in time at the level of classical mechanics, so it can handle the dynamic behavior of a large system such as a protein relatively easily.

In Chapter 4, I introduce my recent work on a machine learning (ML) protocol for simulating the electronic properties of protein peptide bonds. A protein consists of a backbone of peptide bonds connected to various amino acid residues. The ultraviolet absorption spectrum of the peptide backbone is widely used to detect the secondary structure of a protein. However, the theoretical simulation of the ultraviolet spectrum of protein peptide bonds faces serious computational bottlenecks. Here, I use neural networks to construct a structure-property relationship between the ground-state information of a protein peptide bond model and its electronic excited-state properties, which allows the ultraviolet spectrum of the protein peptide bond to be predicted efficiently and accurately.

In Chapter 5, I introduce my recent work on a machine learning protocol for simulating infrared (IR) spectra of proteins. Building on the preceding work on the peptide bond model, I developed a cost-effective protocol that combines a model Hamiltonian method for spectra simulation with neural network models for predicting the Hamiltonian matrix elements. This ML protocol needs only a few key structural descriptors as inputs and can rapidly predict amide I IR spectra of proteins in good agreement with experiment. Compared with traditional quantum chemical calculations, the machine learning protocol developed here is about 4 orders of magnitude faster. Further, the ML protocol is transferable and applicable to a variety of proteins, which makes it possible to distinguish different protein secondary structures, detect the effect of temperature on protein infrared spectra and monitor the protein folding process.
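The model Hamiltonian step described for Chapter 5 can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a standard one-exciton picture in which each peptide bond contributes a local amide I frequency (diagonal element) and pairs of bonds are linked by couplings (off-diagonal elements). In the protocol described above these matrix elements would come from neural networks; here they are hand-set placeholder numbers. The function name `exciton_spectrum`, its parameters and the Lorentzian broadening are all assumptions made for illustration.

```python
import numpy as np


def exciton_spectrum(site_freqs, couplings, dipoles, grid, hwhm=5.0):
    """Broadened IR spectrum from a one-exciton model Hamiltonian.

    site_freqs: (N,) local-mode frequencies in cm^-1 (diagonal elements)
    couplings:  (N, N) symmetric inter-site couplings in cm^-1
                (its diagonal is ignored and overwritten)
    dipoles:    (N, 3) local transition dipole vectors, arbitrary units
    grid:       (M,) frequency axis in cm^-1
    hwhm:       Lorentzian half width at half maximum in cm^-1 (assumed)
    """
    site_freqs = np.asarray(site_freqs, dtype=float)
    dipoles = np.asarray(dipoles, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # Assemble the Hamiltonian: couplings off the diagonal,
    # site frequencies on the diagonal.
    h = np.array(couplings, dtype=float)
    np.fill_diagonal(h, site_freqs)

    # Diagonalize the real symmetric matrix. Eigenvalues come back in
    # ascending order; eigenvectors are the columns of `vecs`.
    freqs, vecs = np.linalg.eigh(h)

    # Each eigenstate's transition dipole is the eigenvector-weighted
    # sum of the local dipoles; its intensity is the squared norm.
    state_dipoles = vecs.T @ dipoles
    intensities = np.sum(state_dipoles**2, axis=1)

    # Broaden every stick with a normalized Lorentzian and sum.
    diff = grid[:, None] - freqs[None, :]
    lines = (hwhm / np.pi) / (diff**2 + hwhm**2)
    spectrum = lines @ intensities
    return freqs, intensities, spectrum


# Placeholder two-site example: equal site frequencies, one coupling,
# parallel unit dipoles. All numbers are illustrative, not fitted.
grid = np.linspace(1600.0, 1700.0, 1001)
freqs, intensities, spectrum = exciton_spectrum(
    site_freqs=[1650.0, 1650.0],
    couplings=[[0.0, 10.0], [10.0, 0.0]],
    dipoles=[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    grid=grid,
)
```

For this symmetric dimer the two eigenfrequencies split by twice the coupling, and with parallel dipoles all intensity goes to the in-phase combination. The cost of this step is a single small diagonalization per configuration, which is why replacing the quantum chemical evaluation of the matrix elements with a learned predictor can make the overall simulation much cheaper.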
Keywords/Search Tags: quantum chemistry, protein spectra, artificial intelligence, machine learning