Bayesian detection of embryonic gene expression onset in C. elegans. (English) Zbl 1454.62345

Summary: To study how a zygote develops into an embryo with different tissues, experimental biologists have recently produced large-scale 4D confocal movies of C. elegans embryos. However, the lack of principled statistical methods for these highly noisy data has hindered their comprehensive analysis. We introduce a probabilistic change point model on the cell lineage tree to estimate the embryonic gene expression onset time. A Bayesian approach is used to fit the model to the 4D confocal movie data. Subsequent classification methods are used to decide a model selection threshold and to refine the expression onset time from the branch level to the specific cell time level. Extensive simulations show that the method is highly accurate. Its application to real data yields both previously known results and new findings.
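
As a rough illustration of the change point idea only, the sketch below computes a posterior over the onset index for a single noisy intensity series. It is not the model of the paper, which is defined on the cell lineage tree. The function name `onset_posterior`, the Gaussian noise model, the known "off" and "on" means, the uniform prior, and the toy data are all assumptions made for this example.

```python
import math


def onset_posterior(signal, mu_off, mu_on, sigma):
    """Posterior over the onset index tau under a uniform prior (assumed).

    tau = k means observations 0..k-1 are "off" and k..n-1 are "on";
    tau = n means expression never starts within the series.
    """
    n = len(signal)

    def loglik(x, mu):
        # Gaussian log-likelihood up to an additive constant (assumed noise model).
        return -0.5 * ((x - mu) / sigma) ** 2

    log_post = []
    for tau in range(n + 1):
        ll = sum(loglik(x, mu_off) for x in signal[:tau])
        ll += sum(loglik(x, mu_on) for x in signal[tau:])
        log_post.append(ll)

    # Normalize in a numerically stable way (subtract the max before exp).
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Toy data (made up for illustration): expression switches on at index 4.
signal = [0.1, -0.2, 0.05, 0.0, 2.1, 1.9, 2.2, 2.0]
posterior = onset_posterior(signal, mu_off=0.0, mu_on=2.0, sigma=0.5)
map_tau = max(range(len(posterior)), key=posterior.__getitem__)
```

Here `map_tau` is the most probable onset index for the toy series. A tree-based model such as the one summarized above would additionally share information between mother and daughter cells, which this single-series sketch does not attempt.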


62P10 Applications of statistics to biology and medical sciences; meta analysis
62-08 Computational methods for problems pertaining to statistics

