# zbMATH — the first resource for mathematics

Unified approach to coefficient-based regularized regression. (English) Zbl 1228.62044
Summary: We consider the coefficient-based regularized least-squares regression problem with the $\ell^q$-regularizer ($1\le q\le 2$) and data-dependent hypothesis spaces. Algorithms over data-dependent hypothesis spaces perform well owing to their flexibility. We carry out a unified error analysis via a stepping-stone technique, and also employ an empirical covering number technique to improve the sample error estimates. Compared with existing results, we make several improvements. First, we obtain a significantly sharper learning rate, which can be arbitrarily close to $O(m^{-1})$ under reasonable conditions and is regarded as the best learning rate in learning theory. Second, our results cover the case $q=1$, which is novel. Finally, our results hold under very general conditions.
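To illustrate coefficient-based regularization in a data-dependent hypothesis space, here is a minimal NumPy sketch for the $q=2$ case, which admits a closed-form solution via the normal equations. The Gaussian kernel, the parameter names, and all numerical choices are illustrative assumptions, not taken from the paper; for $q<2$ (in particular the $q=1$ case the summary highlights) an iterative solver would be needed instead.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=0.5):
    """Pairwise Gaussian kernel matrix (illustrative kernel choice)."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def coef_regularized_ls(X, y, lam=1e-6, sigma=0.5):
    """Coefficient-based regularized least squares for q = 2:

        minimize_alpha  (1/m) * sum_i (f(x_i) - y_i)^2 + lam * sum_j alpha_j^2,

    where f(x) = sum_j alpha_j * K(x_j, x) lies in the data-dependent
    hypothesis space spanned by the kernel sections K(x_j, .).
    """
    m = len(y)
    K = gaussian_kernel(X, X, sigma)
    # Normal equations: (K^T K / m + lam * I) alpha = K^T y / m
    alpha = np.linalg.solve(K.T @ K / m + lam * np.eye(m), K.T @ y / m)
    return alpha

def predict(alpha, X_train, X_new, sigma=0.5):
    """Evaluate f(x) = sum_j alpha_j * K(x_j, x) at new points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha
```

Note that, unlike classical kernel ridge regression, the penalty here acts directly on the coefficient vector $\alpha$ rather than on the RKHS norm of $f$, so the kernel need not be symmetric or positive definite for the scheme to make sense.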
##### MSC:
- 62G08 Nonparametric regression
- 68T99 Artificial intelligence