An accelerated computational approach in proteomics. (English) Zbl 1444.92074

Naik, Ganesh (ed.), Biomedical signal processing. Advances in theory, algorithms and applications. Singapore: Springer. Ser. BioEng., 389-432 (2020).
Summary: New technologies and research in computational bioinformatics have sharply increased the rate of biological data generation. As a result, the volume of data from proteomics and genomics has grown manyfold, doubling every 18 months, and the operations involved in proteomics studies have become significantly compute-intensive. Protein identification, a fundamental process in proteomics, requires identifying one or more proteins from a large protein database. It is used extensively in disease diagnosis and prognosis, where it assists in biomarker identification and discovery for future medical prescription. Nowadays, mass spectrometry is a widely used analytical tool in proteomics studies, with peak detection and database searching as its essential steps. To cope with the ever-increasing growth of biological data in proteomics, protein identification requires accelerated and efficient solutions. This chapter reviews various hardware-accelerated methodologies for peak detection in mass spectrometry data and for string searching in databases, from both an algorithmic and an architectural perspective, in the context of protein identification.
For the entire collection see [Zbl 1433.92003].


92D20 Protein sequences, DNA sequences
92-08 Computational methods for problems pertaining to biology

