Real-time scale selection in hybrid multi-scale representations. (English) Zbl 1067.68753
Griffin, Lewis D. (ed.) et al., Scale space methods in computer vision. 4th international conference, Scale Space 2003, Isle of Skye, UK, June 10–12, 2003. Proceedings. Berlin: Springer (ISBN 3-540-40368-X/pbk). Lect. Notes Comput. Sci. 2695, 148-163 (2003).
Summary: Local scale information extracted from visual data in a bottom-up manner constitutes an important cue for a large number of visual tasks. This article presents a framework for how the computation of such scale descriptors can be performed in real time on a standard computer.
The proposed scale selection framework is expressed within a novel type of multi-scale representation, referred to as hybrid multi-scale representation, which aims at integrating and providing variable trade-offs between the relative advantages of pyramids and scale-space representation, in terms of computational efficiency and computational accuracy. Starting from binomial scale-space kernels of different widths, we describe a family of pyramid representations, in which the regular pyramid concept and the regular scale-space representation constitute limiting cases. In particular, the steepness of the pyramid as well as the sampling density in the scale direction can be varied.
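The idea of combining binomial smoothing with optional sub-sampling can be illustrated with a minimal 1-D sketch. It is not the paper's implementation: the function names `binomial_smooth` and `hybrid_pyramid`, the parameters `passes_per_level` and `subsample`, and the edge-replicating border handling are all assumptions made for illustration. The sketch relies only on the facts that the kernel (1/4, 1/2, 1/4) has variance 1/2 and that variances add under repeated convolution.

```python
import numpy as np


def binomial_smooth(signal, passes=1):
    """Apply the 3-tap binomial kernel (1/4, 1/2, 1/4) `passes` times."""
    kernel = np.array([0.25, 0.5, 0.25])
    out = np.asarray(signal, dtype=float)
    for _ in range(passes):
        # Replicate border samples so the output keeps the input length
        # (border handling is an assumption of this sketch).
        padded = np.pad(out, 1, mode="edge")
        out = np.convolve(padded, kernel, mode="valid")
    return out


def hybrid_pyramid(signal, levels=3, passes_per_level=2, subsample=True):
    """Return a list of (data, nominal_variance, grid_spacing) per level.

    nominal_variance is measured in units of the original grid.
    subsample=False keeps full resolution at every level (scale-space-like);
    subsample=True halves the resolution after each level (pyramid-like).
    """
    current = np.asarray(signal, dtype=float)
    spacing = 1
    variance = 0.0
    result = [(current, variance, spacing)]
    for _ in range(levels):
        current = binomial_smooth(current, passes_per_level)
        # Each pass adds variance 1/2 in units of the current grid spacing.
        variance += passes_per_level * 0.5 * spacing ** 2
        if subsample:
            current = current[::2]
            spacing *= 2
        result.append((current, variance, spacing))
    return result
```

In this sketch, `passes_per_level` plays the role of the sampling density in the scale direction and `subsample` switches between the two limiting behaviours; intermediate sub-sampling rates would need a more general scheme than the factor-of-two decimation used here.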
It is shown how the definition of \(\gamma\)-normalized derivative operators underlying the automatic scale selection mechanism can be transferred from a regular scale-space to a hybrid pyramid, and two alternative definitions are studied in detail, referred to as variance normalization and \(l_{p}\)-normalization. The computational accuracy of these two schemes is evaluated, and it is shown how the choice of sub-sampling rate provides a trade-off between the computational efficiency and the accuracy of the scale descriptors. Experimental evaluations are presented for both synthetic and real data. In a simplified form, this scale selection mechanism has been running for two years in a real-time computer vision system.
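For orientation, in the continuous scale-space setting the \(\gamma\)-normalized derivative is usually written as follows, where \(L(x;\,t)\) denotes the scale-space representation and \(t\) the variance of the Gaussian smoothing kernel. This is the standard continuous definition only; the discrete variance-normalized and \(l_{p}\)-normalized variants for hybrid pyramids are not spelled out in the summary and are not reproduced here.

\[
\partial_{\xi} = t^{\gamma/2}\,\partial_{x}, \qquad L_{\xi^{m}}(x;\,t) = t^{m\gamma/2}\,L_{x^{m}}(x;\,t).
\]

Automatic scale selection then looks for scales \(t\) at which such normalized derivative responses assume local extrema over scale.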
For the entire collection see [Zbl 1031.68003].

68T45 Machine vision and scene understanding