The role of signal processing concepts in genomics and proteomics. (English) Zbl 1094.92044

Summary: With the enormous amount of genomic and proteomic data now available in the public domain, it is becoming increasingly important to process this information in ways that are useful to humankind. Signal processing methods, some of which are reviewed in this paper, have played an important role in this context. First we review the role of digital filtering techniques in gene identification. We then discuss long-range correlation between base pairs in DNA sequences, which corresponds to a \(1/f\) type of power spectrum. We also describe some recent applications of Fourier methods in the study of proteins. Finally we mention the role of Karhunen-Loève-like transforms in the interpretation of DNA microarray data for gene expression.


92C55 Biomedical imaging and signal processing
92C40 Biochemistry, molecular biology


[2] Alberts, B.; Bray, D.; Johnson, A.; Lewis, J.; Raff, M.; Roberts, K.; Walter, P., Essential Cell Biology (1998), Garland Publishing Inc., New York
[3] Trifonov, E. N.; Sussman, J. L., The pitch of chromatin DNA is reflected in its nucleotide sequence, Proc. Natl. Acad. Sci. USA, 77, 3816-3820 (1980)
[4] Tiwari, S.; Ramachandran, S.; Bhattacharya, A.; Bhattacharya, S.; Ramaswamy, R., Prediction of probable genes by Fourier analysis of genomic sequences, CABIOS, 13, 3, 263-270 (1997)
[5] Li, W., The study of correlation structures of DNA sequences: a critical review, Computers Chem., 21, 4, 257-271 (1997)
[6] Fickett, J. W., The gene prediction problem: an overview for developers, Computers Chem., 20, 1, 103-118 (1996)
[9] Herzel, H.; Trifonov, E. N.; Weiss, O.; Große, I., Interpreting correlations in biosequences, Physica A, 249, 449-459 (1998)
[10] Crochiere, R. E.; Rabiner, L. R., Multirate Digital Signal Processing (1983), Prentice Hall, Inc., Englewood Cliffs, NJ
[11] Vaidyanathan, P. P., Multirate Systems and Filter Banks (1993), Prentice Hall, Inc., Englewood Cliffs, NJ · Zbl 0784.93096
[12] Regalia, P. A.; Mitra, S. K.; Vaidyanathan, P. P., The digital allpass filter: a versatile signal processing building block, Proc. IEEE, 76, 19-37 (1988)
[13] Oppenheim, A. V.; Schafer, R. W., Discrete-Time Signal Processing (1999), Prentice Hall, Inc., Englewood Cliffs, NJ
[15] Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E., Long-range correlations in nucleotide sequences, Nature, 356, 168-170 (1992)
[17] Voss, R. F., Evolution of long-range fractal correlations and \(1/f\) noise in DNA base sequences, Phys. Rev. Lett., 68, 25, 3805-3808 (1992)
[18] de Sousa Vieira, M., Statistics of DNA sequences: a low-frequency analysis, Phys. Rev. E, 60, 5, 5932-5937 (1999)
[20] Li, W., Expansion-modification systems: a model for spatial \(1/f\) spectra, Phys. Rev. A, 43, 10, 5240-5260 (1991)
[21] Wornell, G. W., A Karhunen-Loève-like expansion for \(1/f\) processes via wavelets, IEEE Trans. Inform. Theory, 36, 4, 859-861 (1990)
[22] Hausdorff, J. M.; Peng, C.-K., Multiscaled randomness: a possible source of \(1/f\) noise in biology, Phys. Rev. E, 54, 2, 2154-2157 (1996)
[23] Cosic, I., Macromolecular bioactivity: is it resonant interaction between macromolecules? Theory and applications, IEEE Trans. Biomed. Eng., 41, 12, 1101-1114 (1994)
[24] Pirogova, E.; Fang, Q.; Akay, M.; Cosic, I., Investigation of the structural and functional relationships of oncogene proteins, Proc. IEEE, 90, 12, 1859-1867 (2002)
[25] Murray, K. B.; Gorse, D.; Thornton, J. M., Wavelet transforms for the characterization and detection of repeating motifs, J. Mol. Biol., 316, 341-363 (2002)
[26] Brown, P. O.; Botstein, D., Exploring the new world of the genome with DNA microarrays, Nature America (Genetics supplement), 21, 33-37 (1999)
[27] Wang, Y.; Lu, J.; Lee, R.; Gu, Z.; Clarke, R., Iterative normalization of cDNA microarray data, IEEE Trans. Inform. Tech. Biomed., 6, 1, 29-37 (2002)
[28] Alter, O.; Brown, P. O.; Botstein, D., Singular value decomposition for genome-wide expression data processing and modeling, Proc. Natl. Acad. Sci. USA, 97, 18, 10101-10106 (2000)
[29] Alter, O.; Brown, P. O.; Botstein, D., Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms, Proc. Natl. Acad. Sci. USA, 100, 6, 3351-3356 (2003)
[30] Krogh, A.; Mian, I. S.; Haussler, D., A hidden Markov model that finds genes in E. coli DNA, Nucleic Acids Res., 22, 4768-4778 (1994)
[31] Salzberg, S. L.; Delcher, A. L.; Kasif, S.; White, O., Microbial gene identification using interpolated Markov models, Nucleic Acids Res., 26, 2, 544-548 (1998)
[32] Huang, W.; Fuhrmann, D. R.; Politte, D. G.; Thomas, L. J.; States, D. J., Filter matrix estimation in automated DNA sequencing, IEEE Trans. Biomed. Eng., 45, 4, 422-428 (1998)
[33] Davies, S. W.; Eizenman, M.; Pasupathy, S., Optimal structure for automatic processing of DNA sequences, IEEE Trans. Biomed. Eng., 46, 9, 1044-1056 (1999)
[35] Storz, G., An expanding universe of noncoding RNAs, Science, 296, 1260-1263 (2002)
[37] Eddy, S. R., Computational genomics of noncoding RNA genes, Cell, 109, 137-140 (2002)
[38] Oppenheim, A. V.; Willsky, A. S.; Nawab, S. H., Signals and Systems (1997), Prentice Hall, Inc., Englewood Cliffs, NJ