High-throughput data analysis in behavior genetics. (English) Zbl 1194.62120

Summary: In recent years, a growing need has arisen in different fields for computational systems that automate the analysis of large amounts of data (high-throughput analysis). Nonstandard noise structure and outliers, which could have been detected and corrected in manual analysis, must now be handled within the system itself with the aid of robust methods. We discuss such problems and present insights and solutions in the context of behavior genetics, where the data consist of a time series of locations of a mouse in a circular arena. To estimate the location, velocity and acceleration of the mouse, and to identify stops, we use a nonstandard mix of robust and resistant methods: LOWESS and repeated running median. In addition, we argue that protection against small deviations from experimental protocols can be handled automatically using statistical methods. In our case, it is of biological interest to measure a rodent’s distance from the arena’s wall, but this measure is corrupted if the arena is not a perfect circle, as the protocol requires. The problem is addressed by robustly estimating the actual boundary of the arena and its center using a nonparametric regression quantile of the behavioral data, with the aid of a fast algorithm developed for that purpose.
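
To illustrate the repeated running median mentioned in the summary, the following is a minimal Python sketch, not the authors' implementation. The function names, the half-window sequence (3, 2, 1) and the tolerance used to flag stops are hypothetical choices made for this example only. The idea is that medians over short windows remove isolated tracking outliers while leaving flat stretches exactly flat, so a stop can be read off as a run of unchanged smoothed coordinates.

```python
from statistics import median


def running_median(values, half_window):
    """One pass of a running median with window 2*half_window + 1.

    Windows are truncated at both ends of the series.
    """
    n = len(values)
    return [
        median(values[max(0, i - half_window):min(n, i + half_window + 1)])
        for i in range(n)
    ]


def repeated_running_median(values, half_windows=(3, 2, 1)):
    """Apply running medians in sequence, one pass per half-window.

    The default half-window sequence is an assumption for illustration.
    """
    smoothed = list(values)
    for h in half_windows:
        smoothed = running_median(smoothed, h)
    return smoothed


def stop_flags(smoothed, tol=1e-9):
    """Flag samples whose smoothed coordinate equals the previous one.

    The first sample has no predecessor and is never flagged.
    """
    flags = [False]
    for prev, cur in zip(smoothed, smoothed[1:]):
        flags.append(abs(cur - prev) <= tol)
    return flags


# A stationary animal at coordinate 3 with one spurious tracker reading.
raw = [3, 3, 50, 3, 3]
clean = repeated_running_median(raw)
print(clean)
print(stop_flags(clean))
```

In this toy series the single outlier is replaced by the surrounding value, so every sample after the first is flagged as part of a stop. Real tracking data would need one such pass per coordinate and window sizes tuned to the sampling rate.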


62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)


EthoVision; fda (R)


[1] Archer, J. (1973). Tests for emotionality in rats and mice: A review. Animal Behaviour 21 205-235.
[2] Besson, M. and Martin, J.-R. (2005). Centrophobism/thigmotaxis, a new role for the mushroom bodies in Drosophila. Developmental Neurobiology 62 386-396.
[3] Bolivar, V., Cook, M. and Flaherty, L. (2000). List of transgenic and knockout mice: Behavioral profiles. Mamm. Genome 11 260-274.
[4] Branson, K., Robie, A. A., Bender, J., Perona, P. and Dickinson, M. H. (2009). High-throughput ethomics in large groups of Drosophila. Nature Methods 6 451-457.
[5] Brunner, D., Nestler, E. and Leahy, E. (2002). High-throughput technologies in need of high-throughput behavioral systems. Drug Discovery Today 7 S107-S112.
[6] Chan, Y. T., Elhalwagy, Y. Z. and Thomas, S. M. (2002). Estimation of circle parameters by centroiding. J. Optim. Theory Appl. 114 363-371. · Zbl 1030.62044
[7] Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc. 74 829-836. · Zbl 0423.62029
[8] Crabbe, J. C., Wahlsten, D. and Dudek, B. C. (1999). Genetics of mouse behavior: Interactions with laboratory environment. Science 284 1670-1672.
[9] Drai, D., Benjamini, Y. and Golani, I. (2000). Statistical discrimination of natural modes of motion in rat exploratory behavior. Journal of Neuroscience Methods 96 119-131.
[10] Drai, D. and Golani, I. (2001). SEE, a tool for the visualization and analysis of rodent exploratory behavior. Neuroscience and Biobehavioral Reviews 25 409-426.
[11] Finn, D. A., Rutledge-Gorman, M. T. and Crabbe, J. C. (2003). Genetic animal models of anxiety. Neurogenetics 4 109-135.
[12] Hall, C. S. (1936). Emotional behavior in the rat. III. The relationship between emotionality and ambulatory activity. J. Comp. Physiol. Psychol. 22 345-352.
[13] Hen, I., Sakov, A., Kafkafi, N., Golani, I. and Benjamini, Y. (2004). The dynamics of spatial behavior: How can robust smoothing techniques help? Journal of Neuroscience Methods 133 161-172.
[14] Golani, I., Benjamini, Y. and Eilam, D. (1993). Stopping behavior: Constraints on exploration in rats (Rattus norvegicus). Behavioural Brain Research 53 21-33.
[15] Kafkafi, N., Benjamini, Y., Sakov, A., Elmer, G. and Golani, I. (2005). Genotype-environment interactions in mouse behavior: A way out of the problems. Proc. Natl. Acad. Sci. USA 102 4619-4624.
[16] Karimaki, V. (1991). Effective circle fitting for particle trajectories. Nuclear Instrumentation Methods in Physics Research 305A 187-191.
[17] Kim, C. E. (1984). Digital disks. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 372-374. · Zbl 0531.68048
[18] Koenker, R. (2005). Quantile Regression. Cambridge Univ. Press, Cambridge. · Zbl 1111.62037
[19] Koenker, R. and Bassett, G. S. (1978). Regression quantiles. Econometrica 46 33-50. · Zbl 0373.62038
[20] Likhvar, N. K. and Honda, Y. (2008). Choice of degree of smoothing in fitting nonparametric regression models for temperature-mortality relation in Japan based on a priori knowledge. Journal of Health Science 54 143-153.
[21] Lind, N. M., Vinther, M., Hemmingsen, R. P. and Hansen, A. (2005). Validation of a digital video tracking system for recording pig locomotor behaviour. Journal of Neuroscience Methods 143 123-132.
[22] Lipkind, D., Sakov, A., Kafkafi, N., Elmer, G., Benjamini, Y. and Golani, I. (2004). New replicable anxiety-related measures of wall vs. center behavior of mice in the open field. Journal of Applied Physiology 97 347-359.
[23] Noldus, L. P. J. J., Spink, A. J. and Tegelenbosch, R. A. J. (2001). EthoVision: A versatile video tracking system for automation of behavioral experiments. Behavior Research Methods, Instruments, & Computers: A Journal of the Psychonomic Society, Inc. 33 398-414.
[24] Ramsay, J. and Silverman, B. (1997). Functional Data Analysis. Springer, New York. · Zbl 0882.62002
[25] Royer, F. and Lutcavage, M. (2008). Filtering and interpreting location errors in satellite telemetry of marine animals. Journal of Experimental Marine Biology and Ecology 359 1-10.
[26] Shapiro, S. D. (1978). Properties of transforms for the detection of curves in noisy pictures. Computer Vision Graphics and Image Processing 8 129-143. · Zbl 0379.68065
[27] Silverman, B. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall, London. · Zbl 0617.62042
[28] Spink, A. J., Tegelenbosch, R. A. J., Buma, M. O. S. and Noldus, L. P. J. J. (2001). The EthoVision video tracking system: A tool for behavioral phenotyping of transgenic mice. Physiology & Behavior 73 731-734.
[29] Steele, A. D., Jackson, W. S., King, O. D. and Lindquist, S. (2007). The power of automated high-resolution behavior analysis revealed by its application to mouse models of Huntington’s and prion diseases. Proc. Natl. Acad. Sci. USA 104 1983-1988.
[30] Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley, Reading, MA. · Zbl 0409.62003
[31] Valente, D., Golani, I. and Mitra, P. P. (2007). Analysis of the trajectory of Drosophila melanogaster in a circular open field arena. PLoS ONE 2(10) e1083.
[32] Vitelson, H. (2005). Spatial behavior of pre-walking infants: Patterns of locomotion in a novel environment. Ph.D. thesis, Tel Aviv Univ.
[33] Walsh, R. N. and Cummins, R. A. (1976). The open-field test: A critical review. Psychological Bulletin 83 482-504.
[34] Wang, H. G., Sung, E. and Venkateswarlu, R. (2005). Estimating the eye gaze from one eye. Computer Vision and Image Understanding 98 83-103.
[35] Yu, K. and Jones, M. C. (1998). Local linear quantile regression. J. Amer. Statist. Assoc. 93 228-237. · Zbl 0906.62038
[36] Zelniker, E. and Clarkson, I. V. L. (2006). A statistical analysis of the Delogne-Kasa method for fitting circles. Digital Signal Processing 16 498-522. · Zbl 0858.94001