
Research And Application Of Data Format Of Analytical Instrument And Searching System Of Mass Spectrum

Posted on: 2007-08-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Hu
GTID: 1102360185954799
Subject: Measuring and Testing Technology and Instruments
Abstract/Summary:
After several years of development, China's analytical instrument industry is now able to research, develop, and produce instruments, but its overall technical level still lags behind the international state of the art. Analytical instrument software in China lags further behind than the hardware. For example, some instrument users still run software that is ten years old, and no standard for analytical instrument data files has yet been established. In recent years, many management and game software products have been registered with China's software copyright registration department, but analytical instrument software is not found among them. To change this situation, and supported by the national "Tenth Five-Year" ("fifteen") key science and technology project "the research and development of mass spectrometers" (item number: 2004BA210A04), this work draws on the author's professional background in analytical chemistry and analytical instruments to study and resolve some of the key technical problems of a general data processing system for analytical instruments: decoding (exposition) of analytical instrument data formats, a standard file format for analytical instruments, and a mass spectrum searching system.

1 Decoding and translation of analytical instrument data formats

Most analytical instrument data processing systems save analytical data in private, undisclosed file formats. Because the formats are unknown, the data files cannot be analyzed or processed on an ordinary PC. To add more complicated data processing functions, or to process analytical data without the workstation software, a method for decoding analytical instrument data formats was put forward.
Thirty typical unknown data formats were decoded by this method (including analytical instrument data formats for chromatography, mass spectrometry, FTIR, UV, Raman, and so on), and translation of the decoded file formats into the SPC file format was implemented.

2 Study of the standard file format for analytical instruments

A uniform data exchange standard should be established to unify data formats so that analytical instrument data can be shared. This paper introduces the general standards for analytical instrument data, the JCAMP and SPC file formats, and studies GAML (Generalized Analytical Markup Language), which is based on XML. The storage modes and optimization of binary data in GAML were studied. A new method was put forward to improve the parsing speed of GAML: by indexing the storage addresses of key elements, the required content can be read directly and parsing becomes faster.

Considering its advantages for storing and exchanging analytical instrument data, GAML has been applied in the workstation of the quadrupole mass spectrometer, where data are stored in GAML format. Because analytical instrument formats are secret and varied, the sharing of analytical instrument data has been hindered. To promote data sharing, the file formats decoded in this paper were translated to GAML format.

3 Study of the mass spectral database system

Mass spectral database searching technology comprises three parts: simplification and coding of mass spectral data, construction of the mass spectral database, and implementation of the searching method. A mass spectral database was built in this paper (the data come from the NIST02 mass spectral database, which includes 147198 compounds with spectra). A coding rule based on the characteristics of mass spectral data was put forward, which reduces the storage volume of the database. An algorithm for library searching of mass spectra was also put forward; it can be used for similarity searching of mass spectra.
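As a rough illustration of similarity searching against a spectral library, the sketch below ranks library spectra by the cosine similarity of their peak intensities. The abstract does not state which scoring function the dissertation uses, so cosine similarity is swapped in here as one common choice. The function names, the representation of a spectrum as a dictionary from integer m/z to intensity, and the toy library entries are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    dot = sum(inten * b.get(mz, 0.0) for mz, inten in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)

def search_library(query, library, top_n=3):
    """Return up to top_n (name, score) pairs, best match first."""
    scored = [(name, cosine_similarity(query, spec))
              for name, spec in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Toy library with made-up peaks (hypothetical, not NIST02 data).
LIBRARY = {
    "compound_a": {43: 100.0, 58: 30.0, 71: 10.0},
    "compound_b": {77: 100.0, 51: 40.0, 105: 60.0},
    "compound_c": {43: 90.0, 58: 35.0, 85: 5.0},
}

query = {43: 100.0, 58: 32.0, 71: 8.0}
hits = search_library(query, LIBRARY)
```

Here `hits` lists the library entries ordered by decreasing similarity to `query`. A production system would typically restrict the candidates first and then score only the survivors.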
Because a pre-search based on the base peak alone might miss possibly similar compounds, pre-search methods based on the top ten peaks, molecular weight, and molecular formula were used to improve the speed of library searching.

The mass spectral database, coding rule, and library searching algorithm have been used in the workstation of the quadrupole mass spectrometer. This system has two advantages over other domestic mass spectral database systems. (1) It supports several file formats (currently, almost all mass spectral database systems support only one). It can now directly open Finnigan RAW files, HP/Agilent MS files, Waters DAT files, and Varian SMS files. (2) It saves storage space. The mass spectral database coded by the common coding rule (m/z and I each stored as a 2-byte short integer) occupies 62M, whereas with our coding rule it occupies 23M.

4 Study of the molecular structure database system

Almost all molecular structure database systems reported in China are based on Oracle or MS SQL Server. To lower software cost, reduce storage volume, and improve searching speed, a molecular structure database was built from binary files by coding the two-dimensional information of molecular structures (the molecular structure data come from the NIST02 mass spectral database, which includes structure information for 147198 molecules). A substructure searching system based on the VF algorithm was developed. A method was put forward for choosing substructures of higher or lower frequency as sifters (screens). The sifter technique and other pre-search methods were applied to speed up substructure searching.

An integrated mass spectrum searching system was built from the molecular structure database and the mass spectral database put forward in this paper.
This integrated system was used in the workstation of the quadrupole mass spectrometer. It has two advantages over other domestic molecular structure database systems. (1) The molecular structure database was built from binary files to reduce software cost and storage volume. (2) The sifter technique was applied in substructure searching to increase its speed.

Based on the above research findings, the problems that need further study are as follows:

(1) Only 30 kinds of file format have been decoded in this paper so far, but hundreds of file formats exist. Therefore further analytical instrument file formats should be studied. An analytical instrument data standard with independent intellectual property should also be established.

(2) The mass spectrum searching system should be improved in universality, graphical input of molecular structures, and so on. Spectrum searching covers mass spectra, FTIR, NMR, and so on, but only mass spectrum searching has been implemented in this paper, so searching of FTIR, NMR, and other spectra should be studied in future work.

(3) Chemometric algorithms were not studied in this paper because of time constraints, but they are an indispensable part of general analytical instrument data processing software, so they should be studied in future work.
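The sifter (screening) step used before substructure matching in section 4 can be pictured with a minimal sketch, assuming each molecule carries a precomputed bitmask of the fragments it contains. A molecule can only contain the query substructure if it contains every fragment of the query, so a cheap bitwise test discards most molecules before the expensive graph matching. The fragment vocabulary, function names, and toy database below are hypothetical, and the dissertation's actual sifter selection by fragment frequency is not reproduced.

```python
# Hypothetical fragment vocabulary; each fragment is assigned one bit.
FRAGMENTS = ["C=O", "O-H", "N-H", "benzene_ring", "C-Cl"]
BIT = {name: 1 << i for i, name in enumerate(FRAGMENTS)}

def make_mask(fragments):
    """Encode a collection of fragment names as an integer bitmask."""
    mask = 0
    for name in fragments:
        mask |= BIT[name]
    return mask

def screen(query_mask, database):
    """Keep only molecules whose mask contains every query fragment."""
    return [name for name, mask in database.items()
            if (mask & query_mask) == query_mask]

# Toy database: molecule name -> precomputed fragment mask (illustrative).
DATABASE = {
    "acetic_acid": make_mask(["C=O", "O-H"]),
    "benzoic_acid": make_mask(["C=O", "O-H", "benzene_ring"]),
    "chlorobenzene": make_mask(["benzene_ring", "C-Cl"]),
    "aniline": make_mask(["N-H", "benzene_ring"]),
}

query = make_mask(["C=O", "benzene_ring"])
candidates = screen(query, DATABASE)
# Only `candidates` would be passed on to exact subgraph matching
# (for example a VF-style matcher, which is not implemented here).
```

Screening is a necessary condition only: a molecule that passes may still fail the exact match, so the graph matcher remains the final check.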
Keywords/Search Tags: Analytical Instrument, Data Exchange Standard, JCAMP, SPC, GAML, Mass Spectral Database, Searching Technology of Mass Spectrum, Molecular Structure Database, Substructure Searching