Biological Sequences Sites Recognition Based On Filters And Time-Frequency Analysis

Posted on: 2009-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 2120360248454769
Subject: Circuits and Systems
Abstract/Summary:
Applying the theories and technologies of signal processing to molecular biology has produced a new field, genomic signal processing, which provides a new way to analyze the enormous volumes of data in genomics, proteomics and RNA research. It improves the efficiency of biological experiments and lowers their cost. In this thesis, filters and time-frequency analysis are used to recognize exons in gene sequences and hot spots in protein sequences, two fundamental structures in biological sequences.

The thesis first introduces the principles of filters and the design of FIR digital filters, which provide the theoretical basis for designing an exon prediction filter. It then introduces the definitions and calculation methods of the short-time Fourier transform and the pseudo Wigner-Ville distribution, which provide the means of identifying protein hot spots by time-frequency analysis. Some existing problems are also discussed.

Exons in gene sequences can be recognized using filters. The 'period-3' property of exons is used to design an FIR digital filter for exon recognition. After the gene sequence is coded numerically and filtered, the locations of exons can be observed in the plot of the squared filter output. For short sequences, or sequences with a weak 'period-3' property, however, the peaks are relatively small and cannot be observed clearly; this is the key point of research for the next step.

Hot spots in proteins can be recognized using time-frequency analysis. The character sequence of a protein is converted into a numerical sequence using the Resonant Recognition Model (RRM), from which the time-frequency transform and the characteristic frequency are calculated. Interference terms are reduced by multiplying the transform matrix by the characteristic frequency, and the hot spots can then be observed as peaks in the resulting cleaner spectrum.
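The hot-spot procedure described above (RRM coding, a time-frequency transform, and a characteristic frequency) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation. The EIIP table holds the values commonly cited for the RRM and should be checked against the thesis. The function names, the 16-residue Hamming window, and the cross-spectral product used to estimate the characteristic frequency are assumptions. The abstract does not detail how the characteristic frequency is multiplied with the transform matrix, so the sketch simply reads out the short-time Fourier transform row nearest that frequency; the pseudo Wigner-Ville distribution is not shown.

```python
import numpy as np

# EIIP values (in Rydberg) commonly cited for the Resonant Recognition Model.
# Assumption: the thesis uses this standard table; verify before relying on it.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def encode_rrm(protein):
    """Map a one-letter amino-acid string to its EIIP numerical sequence."""
    return np.array([EIIP[aa] for aa in protein.upper()], dtype=float)


def characteristic_frequency(signals, n_fft=256):
    """Estimate a shared frequency as the peak of the product of the
    magnitude spectra of several numerical sequences (hypothetical helper).
    Sequences longer than n_fft are truncated by the FFT call."""
    product = np.ones(n_fft // 2 + 1)
    for s in signals:
        s = np.asarray(s, dtype=float)
        product *= np.abs(np.fft.rfft(s - s.mean(), n=n_fft))
    freqs = np.fft.rfftfreq(n_fft)          # cycles per residue, 0 to 0.5
    k = 1 + int(np.argmax(product[1:]))     # skip the DC bin
    return freqs[k]


def stft_magnitude(x, window_len=16):
    """|STFT| with hop size 1; rows are frequency bins, columns are residues.
    Column i uses a Hamming window roughly centred on residue i."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    half = window_len // 2
    padded = np.pad(x, (half, window_len - half - 1))
    window = np.hamming(window_len)
    frames = np.stack(
        [padded[i:i + window_len] * window for i in range(len(x))]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def hot_spot_profile(x, char_freq, window_len=16):
    """Energy along the sequence at the STFT bin nearest char_freq."""
    mag = stft_magnitude(x, window_len)
    freqs = np.fft.rfftfreq(window_len)
    k = int(np.argmin(np.abs(freqs - char_freq)))
    return mag[k] ** 2
```

With a group of functionally related proteins encoded by `encode_rrm`, one would pass them to `characteristic_frequency` and then inspect the peaks of `hot_spot_profile` for the protein of interest. A longer window sharpens the frequency estimate but blurs the position of each peak along the sequence.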
Because protein sequences are highly random, cross-terms still remain in the pseudo Wigner-Ville distribution, and suppressing them is the problem to be resolved in the next step. This application of signal processing to site analysis in biological sequences is preliminary, and many problems remain. Nevertheless, it offers a new way to recognize these sites and can help guide biological experiments.
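The exon-recognition steps in the abstract (coding the gene sequence, filtering it, and inspecting the squared output) can be sketched as below. The abstract does not specify the coding scheme or the filter design, so everything here is an assumption made for illustration: the sequence is coded as four binary indicator sequences (one per base), and the band-pass FIR filter is a Hamming window modulated to 1/3 cycles per base, which gives complex coefficients. The function names and the 99-tap length are hypothetical.

```python
import numpy as np


def voss_indicators(dna):
    """Code a DNA string as four binary sequences, one per base.
    Characters other than A, C, G, T map to zero in every sequence."""
    seq = np.array(list(dna.upper()))
    return {base: (seq == base).astype(float) for base in "ACGT"}


def period3_fir(num_taps=99):
    """FIR band-pass filter centred at 1/3 cycles per base.
    Built by modulating a unit-sum Hamming window, so the gain at the
    centre frequency is exactly 1."""
    n = np.arange(num_taps)
    window = np.hamming(num_taps)
    window /= window.sum()
    return window * np.exp(2j * np.pi * n / 3)


def period3_power(dna, num_taps=99):
    """Sum over the four bases of the squared magnitude of the filtered
    indicator sequences; peaks suggest regions with period-3 behaviour."""
    h = period3_fir(num_taps)
    start = (num_taps - 1) // 2              # compensate the filter delay
    power = np.zeros(len(dna))
    for u in voss_indicators(dna).values():
        y = np.convolve(u, h)[start:start + len(u)]
        power += np.abs(y) ** 2
    return power
```

Plotting `period3_power(dna)` against position gives a curve of the kind the abstract refers to. The filter length is a trade-off: more taps suppress background noise but smear short regions, which is in line with the abstract's remark that short or weakly periodic sequences produce small peaks.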
Keywords/Search Tags: Genomic Signal Processing, Exons, Hot Spot, FIR Digital Filter, Time-Frequency Analysis