zbMATH — the first resource for mathematics

Temporal scale selection in time-causal scale space. (English) Zbl 1391.94092
Summary: When designing and developing scale selection mechanisms for generating hypotheses about characteristic scales in signals, it is essential that the selected scale levels reflect the extent of the underlying structures in the signal. This paper presents a theory and an in-depth theoretical analysis of the scale selection properties of methods for automatically selecting local temporal scales in time-dependent signals, based on local extrema over temporal scales of scale-normalized temporal derivative responses. Specifically, this paper develops a novel theoretical framework for performing such temporal scale selection over a time-causal and time-recursive temporal domain, as is necessary when processing continuous video or audio streams in real time or when modelling biological perception. For a recently developed time-causal and time-recursive scale-space concept defined by convolution with a scale-invariant limit kernel, we show that it is possible to transfer a large number of the desirable scale selection properties that hold for the Gaussian scale-space concept over a non-causal temporal domain to this temporal scale-space concept over a truly time-causal domain. In particular, we show that for this temporal scale-space concept, it is possible to achieve true temporal scale invariance although the temporal scale levels have to be discrete, which is a novel theoretical construction. The analysis starts from a detailed comparison of different temporal scale-space concepts and their relative advantages and disadvantages, which leads the focus to a class of recently extended time-causal and time-recursive temporal scale-space concepts based on first-order integrators or, equivalently, truncated exponential kernels coupled in cascade.
Because of the discrete nature of the temporal scale levels in this class of time-causal scale-space concepts, we study two special cases of distributing the intermediate temporal scale levels: either a uniform distribution in terms of the variance of the composed temporal scale-space kernel or a logarithmic distribution. In the case of a uniform distribution of the temporal scale levels, we show that scale selection based on local extrema of scale-normalized derivatives over temporal scales makes it possible to estimate the temporal duration of sparse local features defined in terms of temporal extrema of first- or second-order temporal derivative responses. For dense features modelled as a sine wave, the lack of temporal scale invariance does, however, constitute a major limitation for handling dense temporal structures of different temporal duration in a uniform manner. In the case of a logarithmic distribution of the temporal scale levels, specifically taken to the limit of a time-causal limit kernel with an infinitely dense distribution of the temporal scale levels towards zero temporal scale, we show that it is possible to achieve true temporal scale invariance. This makes it possible to handle dense features modelled as a sine wave in a uniform manner over different temporal durations of the temporal structures, as well as to achieve more general temporal scale invariance for any signal under any temporal scaling transformation whose scaling factor is an integer power of the distribution parameter of the time-causal limit kernel. It is shown how these temporal scale selection properties developed for a pure temporal domain carry over to feature detectors defined over time-causal spatio-temporal and spectro-temporal domains.
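
The cascade of first-order integrators with logarithmically distributed scale levels described in the summary can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the parameters `tau_max`, `c` and `num_levels`, the assumed parameterization tau_k = c^(2(k-K)) tau_max, and the choice of a discrete first-order recursive filter with a geometric impulse response are all assumptions made here for illustration. The sketch also uses a finite number of levels, whereas the limit kernel in the summary involves infinitely many.

```python
import math

def log_scale_levels(tau_max, c, num_levels):
    """Temporal scale levels (kernel variances) in geometric progression.

    Assumed parameterization: tau_k = c**(2*(k - K)) * tau_max for k = 1..K,
    so neighbouring levels differ by a factor c**2 in variance.
    """
    K = num_levels
    return [tau_max * c ** (2 * (k - K)) for k in range(1, K + 1)]

def discrete_time_constants(taus):
    """Time constants mu_k of the first-order recursive filters in the cascade.

    The discrete filter used below has a geometric impulse response with
    variance mu**2 + mu. Variances add under convolution, so mu_k solves
    mu_k**2 + mu_k = tau_k - tau_{k-1}, with tau_0 = 0.
    """
    mus, prev = [], 0.0
    for tau in taus:
        delta = tau - prev
        mus.append((math.sqrt(1.0 + 4.0 * delta) - 1.0) / 2.0)
        prev = tau
    return mus

def cascade_filter(signal, mus):
    """Feed the signal through the cascade; return the output at every level.

    Each stage is time-causal and time-recursive: it needs only the current
    input sample and its own previous output, never future samples.
    """
    outputs, current = [], list(signal)
    for mu in mus:
        state, smoothed = 0.0, []
        for x in current:
            state += (x - state) / (1.0 + mu)
            smoothed.append(state)
        outputs.append(smoothed)
        current = smoothed
    return outputs

def kernel_variance(kernel):
    """Variance of a non-negative kernel over sample indices 0, 1, 2, ..."""
    mass = sum(kernel)
    mean = sum(n * h for n, h in enumerate(kernel)) / mass
    return sum((n - mean) ** 2 * h for n, h in enumerate(kernel)) / mass

if __name__ == "__main__":
    taus = log_scale_levels(tau_max=16.0, c=2.0, num_levels=4)
    mus = discrete_time_constants(taus)
    impulse = [1.0] + [0.0] * 399
    responses = cascade_filter(impulse, mus)
    # Print the target variance next to the measured impulse-response variance.
    for tau, response in zip(taus, responses):
        print(tau, kernel_variance(response))
```

In this sketch, multiplying `tau_max` by `c**2`, which corresponds to stretching the time axis by the factor `c`, maps every scale level onto its coarser neighbour. That is a finite-level counterpart of the property stated in the summary, where scale invariance holds for scaling factors that are integer powers of the distribution parameter.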

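For comparison, the non-causal Gaussian reference case mentioned in the summary can be worked out for a sine wave in a few lines. This is a sketch under assumptions that the summary does not state: the temporal scale parameter τ is taken as the variance of the Gaussian kernel, and the scale-normalized derivative of order n is taken as the ordinary derivative multiplied by τ^(nγ/2). The symbols ω_0, γ, n and S are introduced here for illustration.

```latex
% Gaussian temporal scale-space representation of a sine wave
f(t) = \sin(\omega_0 t), \qquad
L(t;\, \tau) = e^{-\omega_0^2 \tau / 2} \sin(\omega_0 t)

% amplitude of the scale-normalized derivative of order n
L_{\zeta^n,\mathrm{ampl}}(\tau) = \tau^{n\gamma/2}\, \omega_0^n\, e^{-\omega_0^2 \tau / 2}

% extremum over temporal scales
\partial_\tau \log L_{\zeta^n,\mathrm{ampl}}(\tau)
  = \frac{n\gamma}{2\tau} - \frac{\omega_0^2}{2} = 0
\quad\Longrightarrow\quad
\hat{\tau} = \frac{n\gamma}{\omega_0^2}
```

Under a temporal rescaling t' = S t the angular frequency becomes ω_0 / S, so the selected scale becomes S² times the original one: the selected scale level follows the temporal duration of the structure. This is the kind of property that, according to the summary, carries over to the time-causal limit kernel when S is an integer power of its distribution parameter.
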
MSC:
94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
Software:
SIFT; SURF