Scalpel: extracting neurons from calcium imaging data. (English) Zbl 1412.62171

Summary: In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called “calcium imaging” data was made publicly available. The availability of this large-scale data resource opens the door to a host of scientific questions for which new statistical methods must be developed.
In this paper we consider the first step in the analysis of calcium imaging data – namely, identifying the neurons in a calcium imaging video. We propose a dictionary learning approach for this task. First, we perform image segmentation to develop a dictionary containing a huge number of candidate neurons. Next, we refine the dictionary using clustering. Finally, we apply the dictionary to select neurons and estimate their corresponding activity over time by solving a sparse group lasso optimization problem. We assess performance on simulated calcium imaging data and apply our proposal to three calcium imaging data sets.
Our proposed approach is implemented in the \(\mathtt R\) package \(\mathtt {scalpel}\), which is available on \(\mathtt {CRAN}\).
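
The final step of the summary (selecting neurons and estimating their activity with a sparse group lasso) can be illustrated with a minimal Python sketch. It does not use the \(\mathtt {scalpel}\) package or reproduce its API. The objective form (least squares plus a row-wise sparse group lasso penalty under a nonnegativity constraint), the proximal gradient solver, the function names `prox_nonneg_sparse_group` and `fit_activity`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def prox_nonneg_sparse_group(Z, step, lam, alpha):
    """Row-wise proximal operator (hypothetical helper, not from scalpel).

    Assumed penalty per row z: lam * (alpha * ||z||_1 + (1 - alpha) * ||z||_2),
    subject to z >= 0. Each row is the activity of one candidate neuron.
    """
    # Nonnegative soft-thresholding handles the l1 term and the constraint.
    U = np.maximum(Z - step * lam * alpha, 0.0)
    # Group shrinkage handles the l2 term; a row can be zeroed out entirely.
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    scale = np.maximum(
        1.0 - step * lam * (1.0 - alpha) / np.maximum(norms, 1e-12), 0.0
    )
    return U * scale

def fit_activity(Y, A, lam=0.1, alpha=0.9, n_iter=200):
    """Proximal gradient descent on 0.5 * ||Y - A Z||_F^2 + penalty (assumed form).

    Y: pixels x frames video matrix; A: pixels x candidates dictionary.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    Z = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = A.T @ (A @ Z - Y)
        Z = prox_nonneg_sparse_group(Z - step * grad, step, lam, alpha)
    return Z

# Toy data: 6 pixels, 3 candidate footprints with disjoint supports, 5 frames.
A = np.zeros((6, 3))
A[0:2, 0] = 1.0
A[2:4, 1] = 1.0
A[4:6, 2] = 1.0
Z_true = np.array([
    [0.0, 1.0, 0.0, 1.0, 0.0],   # candidate 0 fires in frames 1 and 3
    [1.0, 0.0, 0.0, 0.0, 1.0],   # candidate 1 fires in frames 0 and 4
    [0.0, 0.0, 0.0, 0.0, 0.0],   # candidate 2 is a spurious dictionary element
])
Y = A @ Z_true                   # noiseless video, for illustration only
Z_hat = fit_activity(Y, A)
```

In this toy setting the row of `Z_hat` for the spurious candidate stays exactly zero, which is the sense in which a group penalty can deselect dictionary elements, while the remaining rows are shrunken, nonnegative activity estimates.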


62P10 Applications of statistics to biology and medical sciences; meta analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
68T05 Learning and adaptive systems in artificial intelligence
62H12 Estimation in multivariate analysis

