# zbMATH — the first resource for mathematics

Least square regression with indefinite kernels and coefficient regularization. (English) Zbl 1225.65015
Let $$(y_i,x_i)_{i=1}^m$$ be i.i.d. observations with $$y_i\in\mathbb R$$, $$x_i\in X$$, $$X$$ being some compact metric space. The authors consider estimates $$f_z$$ for the regression function $$f_\rho(x)=E(y_i|x_i)$$, where $$f_z=f_{\alpha^z}$$, $$f_\alpha(x)=\sum_{i=1}^m \alpha_i K(x,x_i)$$,
$\alpha^z=\arg\min_{\alpha\in R^m} {1\over m}\,\sum_{i=1}^m (y_i - f_\alpha(x_i))^2+\lambda m\sum_{i=1}^m\alpha_i^2,$
$$K:X\times X\to\mathbb R$$ is a continuous bounded function (kernel), $$\lambda$$ is a regularization parameter.
Consistency of $$f_z$$ is demonstrated under the assumptions that $$\lambda=\lambda(m)\to 0$$, $$\lambda^{3/2}\sqrt{m}\to\infty$$ and the true regression function belongs to the closure of $$\{f_\alpha\}$$ in some suitable reproducing kernel Hilbert space.
To analyze the rates of convergence the authors make assumptions of the form $$E\|L^{-r}f_\rho(x_i)\|^2<\infty$$ for some $$r>0$$, where $$L f(x)=E \tilde K(x,x_i)f(x_i)$$, $$\tilde K(x,t)=E_u K(x,u)K(t,u)$$. E.g. if $$r>1$$ then choosing $$\lambda=m^{1/5}$$ they get $$\|f_z-f_\rho\|_{L^2}=O(m^{-1/5})$$.
Results of simulations are presented for $$X=[0,1]$$ and the Gaussian kernel $$K$$.

##### MSC:
 65C60 Computational problems in statistics (MSC2010) 62J02 General nonlinear regression 46E22 Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces)
Full Text:
##### References:
 [1] Aronszajn, N., Theory of reproducing kernels, Trans. amer. math. soc., 68, 337-404, (1950) · Zbl 0037.20701 [2] Bauer, F.; Pereverzev, S.; Rosasco, L., On regularization algorithms in learning theory, J. complexity, 23, 52-72, (2007) · Zbl 1109.68088 [3] Cannon, A.; Ettinger, J.M.; Hush, D.; Scovel, C., Machine learning with data dependent hypothesis classes, J. Mach. learn. res., 2, 335-358, (2002) · Zbl 1007.68156 [4] Cucker, F.; Smale, S., On the mathematical foundations of learning, Bull. amer. math. soc., 39, 1-49, (2001) · Zbl 0983.68162 [5] Cucker, F.; Smale, S., Best choices for regularization parameters in learning theory: on the bias-variance problem, Found. comput. math., 2, 413-428, (2002) · Zbl 1057.68085 [6] Evgeniou, T.; Pontil, M.; Poggio, T., Regularization networks and support vector machines, Adv. comput. math., 13, 1-50, (2000) · Zbl 0939.68098 [7] Lanckriet, G.R.G.; Cristianini, N.; Bartlett, P.; El Ghaoui, L.; Jordan, M.I., Learning the kernel matrix with semidefinite programming, J. Mach. learn. res., 5, 27-72, (2004), (electronic) · Zbl 1222.68241 [8] Liu, C., Gabor-based kernel PCA with fractional power polynomial models for face recognition, IEEE trans. pattern anal. Mach. intell., 26, 572-581, (2004) [9] Lo Gerfo, L.; Rosasco, L.; Odone, F.; De Vito, E.; Verri, A., Spectral algorithms for supervised learning, Neural comput., 20, 1873-1897, (2008) · Zbl 1147.68643 [10] Luss, R.; d’Aspremont, A., Support vector machine classification with indefinite kernels, Math. program. comput., 1, 97-118, (2009) · Zbl 1191.68511 [11] Micchelli, C.A.; Pontil, M., Learning the kernel function via regularization, J. Mach. learn. res., 6, 1099-1125, (2005), (electronic) · Zbl 1222.68265 [12] Poggio, T.; Smale, S., The mathematics of learning: dealing with data, Notices amer. math. soc., 50, 537-544, (2003) · Zbl 1083.68100 [13] Saigo, H.; Vert, J.; Ueda, N.; Akutsu, T., Protein homology detection using string alignment kernels, Bioinformatics, 20, 1682-1689, (2004) [14] Scholkopf, B.; Smola, A., Learning with kernels: support vector machines, regularization, optimization and beyond, (2002), MIT Press [15] Smale, S.; Zhou, D.-X., Shannon sampling and function reconstruction from point values, Bull. amer. math. soc. (N.S.), 41, 279-305, (2004), (electronic) · Zbl 1107.94007 [16] Smale, S.; Zhou, D.-X., Shannon sampling. II. connections to learning theory, Appl. comput. harmon. anal., 19, 285-302, (2005) · Zbl 1107.94008 [17] Smale, S.; Zhou, D.X., Learning theory estimates via integral operators and their approximations, Constr. approx., 26, 153-172, (2007) · Zbl 1127.68088 [18] Sun, H.; Wu, Q., Regularized least square regression with dependent samples, Adv. comput. math., 32, 2, 175-189, (2010) · Zbl 1191.68535 [19] Sun, H.; Wu, Q., Application of integral operator for regularized least square regression, Math. comput. modelling, 49, 276-285, (2009) · Zbl 1165.45310 [20] Sun, S.; Wu, Q., A note on application of integral operator in learning theory, Appl. comput. harmon. anal., 26, 416-421, (2009) · Zbl 1165.68059 [21] Vapnik, V., Statistical learning theory, (1998), Wiley New York · Zbl 0935.62007 [22] Wu, Q.; Ying, Y.; Zhou, D.X., Learning rates of least-square regularized regression, Found. comput. math., 6, 171-192, (2006) · Zbl 1100.68100 [23] Wu, Q.; Ying, Y.; Zhou, D.-X., Multi-kernel regularized classifiers, J. complexity, 23, 108-134, (2007) · Zbl 1171.65043 [24] Wu, Q.; Zhou, D.-X., SVM soft margin classifiers: linear programming versus quadratic programming, Neural comput., 17, 1160-1187, (2005) · Zbl 1108.90324 [25] Wu, Q.; Zhou, D.-X., Learning with sample dependent hypothesis spaces, Comput. math. appl., 56, 2896-2907, (2008) · Zbl 1165.68388 [26] Q.W. Xiao, D.X. Zhou, Learning by nonsymmetric kernels with data dependent spaces and $$\ell_1$$-regularizer, preprint. · Zbl 1221.68204 [27] Xu, Y.; Zhang, H., Refinable kernels, J. Mach. learn. res., 8, 2083-2120, (2007) · Zbl 1222.68337 [28] Xu, Y.; Zhang, H., Refinement of reproducing kernels, J. Mach. learn. res., 10, 107-140, (2009) · Zbl 1235.68211 [29] Zhang, T., Leave-one-out bounds for kernel methods, Neural comput., 15, 1397-1437, (2003) · Zbl 1085.68144 [30] Zhang, T., Learning bounds for kernel regression using effective data dimensionality, Neural comput., 17, 2077-2098, (2005) · Zbl 1080.68044
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.