SLEDGE: sequential labeling of image edges for boundary detection. (English) Zbl 1270.68353

Summary: Our goal is to detect boundaries of objects or surfaces occurring in an arbitrary image. We present a new approach that discovers boundaries by sequential labeling of a given set of image edges. A visited edge is labeled as on or off a boundary, based on the edge's photometric and geometric properties and on evidence of its perceptual grouping with already identified boundaries. We use both local Gestalt cues (e.g., proximity and good continuation) and the global Helmholtz principle of non-accidental grouping. A new formulation of the Helmholtz principle is specified as the entropy of a layout of image edges. For boundary discovery, we formulate a new policy iteration algorithm, called SLEDGE. Training of SLEDGE is iterative. In each training image, SLEDGE labels a sequence of edges, which induces a loss with respect to the ground truth. These sequences are then used as training examples for learning SLEDGE in the next iteration, such that the total loss is minimized. To extract the image edges that are input to SLEDGE, we use our new low-level detector, which finds salient pixel sequences that separate distinct textures within the image. On the benchmark Berkeley Segmentation Datasets 300 and 500, our approach proves robust and effective. We outperform the state of the art in both recall and precision for different input sets of image edges.
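
The sequential labeling and iterative training described in the summary can be illustrated with a minimal Python sketch. It is not the SLEDGE algorithm itself: the edge representation `(x, y, strength)`, the three-element feature vector, the strongest-first visiting order, and the perceptron-style weight update are all assumptions made for illustration, and the entropy-based Helmholtz term is omitted because the summary does not specify it.

```python
import math

def features(edge, boundary):
    """Hypothetical features: bias, edge strength (photometric cue),
    and proximity to the nearest edge already labeled as boundary."""
    x, y, strength = edge
    if boundary:
        d = min(math.hypot(x - bx, y - by) for bx, by, _ in boundary)
        proximity = 1.0 / (1.0 + d)
    else:
        proximity = 0.0
    return [1.0, strength, proximity]

def label_sequence(edges, weights):
    """Visit edges from strongest to weakest and label each one as
    on (True) or off (False) the boundary with a linear scorer."""
    boundary, labels, trace = [], {}, []
    for edge in sorted(edges, key=lambda e: -e[2]):
        f = features(edge, boundary)
        on = sum(w * v for w, v in zip(weights, f)) > 0.0
        labels[edge] = on
        trace.append((f, edge))
        if on:
            boundary.append(edge)  # later decisions see this edge
    return labels, trace

def train(images, weights, iterations=10, rate=0.5):
    """Assumed training loop: label every training image with the
    current weights, then correct the weights on each mistake."""
    for _ in range(iterations):
        for edges, truth in images:
            labels, trace = label_sequence(edges, weights)
            for f, edge in trace:
                # error is +1 for a missed boundary edge, -1 for a false one
                error = int(truth[edge]) - int(labels[edge])
                weights = [w + rate * error * v for w, v in zip(weights, f)]
    return weights
```

Here `images` is a list of `(edges, truth)` pairs, where `truth` maps each edge tuple to its ground-truth label. Because the features of an edge depend on which edges were labeled before it, the training examples change from one iteration to the next, which is the property the summary attributes to the iterative training of SLEDGE.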


68T45 Machine vision and scene understanding
68R10 Graph theory (including graph drawing) in computer science



