zbMATH — the first resource for mathematics

Approximating the architecture of visual cortex in a convolutional network. (English) Zbl 1429.92027
Summary: Deep convolutional neural networks (CNNs) have certain structural, mechanistic, representational, and functional parallels with primate visual cortex, but also many differences. However, perhaps some of the differences can be reconciled. This study develops a cortex-like CNN architecture via (1) a loss function that quantifies the consistency of a CNN architecture with neural data from tract tracing, cell reconstruction, and electrophysiology studies; (2) a hyperparameter-optimization approach for reducing this loss; and (3) heuristics for organizing units into convolutional-layer grids. The optimized hyperparameters are consistent with neural data. The cortex-like architecture differs from typical CNN architectures. In particular, it has longer skip connections, larger kernels and strides, and qualitatively different connection sparsity. Importantly, layers of the cortex-like network have one-to-one correspondences with cortical neuron populations. This should allow unambiguous comparison of model and brain representations in the future and, consequently, more precise measurement of progress toward more biologically realistic deep networks.
92B20 Neural networks for/in biological studies, artificial life and related topics
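Illustrative note: the summary describes scoring a CNN architecture against neural data and then tuning hyperparameters to reduce that score. The sketch below is not the authors' loss; it is a minimal toy under stated assumptions. It compares the receptive-field width of each layer in a plain convolutional stack with a target width and searches over kernel sizes by brute force. The function names (`receptive_field`, `architecture_loss`, `best_kernels`), the squared log-ratio penalty, and the target values are all hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def receptive_field(kernels, strides):
    """Receptive-field width, in input pixels, of the last layer of a conv stack."""
    rf, jump = 1, 1
    for k, s in zip(kernels, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) input-space steps
        jump *= s             # stride multiplies the spacing between neighbouring units
    return rf


def architecture_loss(kernels, strides, target_rf):
    """Hypothetical loss: sum of squared log-ratios of model RF to target RF per layer."""
    loss = 0.0
    for depth, target in enumerate(target_rf, start=1):
        rf = receptive_field(kernels[:depth], strides[:depth])
        loss += math.log(rf / target) ** 2
    return loss


def best_kernels(strides, target_rf, choices=(3, 5, 7, 9)):
    """Exhaustive search over kernel sizes (a stand-in for a real optimizer)."""
    candidates = product(choices, repeat=len(strides))
    return min(candidates, key=lambda ks: architecture_loss(ks, strides, target_rf))


if __name__ == "__main__":
    # Made-up targets in pixels; real targets would come from electrophysiology data.
    print(best_kernels(strides=[1, 1], target_rf=[5, 9]))
```

In this toy setting, larger target receptive fields push the search toward larger kernels, which is loosely in the spirit of the summary's remark that the cortex-like architecture has larger kernels and strides than typical CNNs.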