Soft and hard classification by reproducing kernel Hilbert space methods. (English) Zbl 1106.62338

Summary: Reproducing kernel Hilbert space (RKHS) methods provide a unified context for solving a wide variety of statistical modelling and function estimation problems. We consider two such problems. In both, we are given a training set \(\{y_i,t_i,\ i=1,\dots,n\}\), where \(y_i\) is the response for the \(i\)th subject and \(t_i\) is a vector of attributes for this subject. The value of \(y_i\) is a label that indicates which category the subject came from. For the first problem, we wish to build a model from the training set that assigns to each \(t\) in an attribute domain of interest an estimate of the probability \(p_j(t)\) that a (future) subject with attribute vector \(t\) is in category \(j\). The second problem is in some sense less ambitious: it is to build a model that assigns to each \(t\) a label, which classifies a future subject with that \(t\) into one of the categories or possibly “none of the above”. The approach to the first of these two problems discussed here is a special case of what is known as penalized likelihood estimation. The approach to the second problem is known as the support vector machine. We also note some alternative but closely related approaches to the second problem. These approaches are all obtained as solutions to optimization problems in RKHS. Many other problems, in particular the solution of ill-posed inverse problems, can be formulated as optimization problems in RKHS and are mentioned in passing. We caution the reader that although a large literature exists on all of these topics, in this inaugural article we are selectively highlighting work of the author, former students, and other collaborators.
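
For orientation, the two optimization problems named in the summary can be sketched in their standard two-category forms. The notation below is an assumption and does not appear in the summary: \(\mathcal{H}_K\) is an RKHS with reproducing kernel \(K\), \(\lambda>0\) is a smoothing parameter, and \((u)_+=\max(u,0)\). For soft classification with labels coded \(y_i\in\{0,1\}\) and \(f(t)=\log[p(t)/(1-p(t))]\) the log odds, a penalized likelihood estimate of \(f\) minimizes

\[
\frac{1}{n}\sum_{i=1}^{n}\Bigl[-y_i f(t_i)+\log\bigl(1+e^{f(t_i)}\bigr)\Bigr]+\lambda\,\|f\|_{\mathcal{H}_K}^{2},
\]

and the probability estimate is recovered as \(p(t)=e^{f(t)}/(1+e^{f(t)})\). For hard classification with labels coded \(y_i\in\{-1,1\}\), a standard support vector machine minimizes

\[
\frac{1}{n}\sum_{i=1}^{n}\bigl(1-y_i f(t_i)\bigr)_{+}+\lambda\,\|f\|_{\mathcal{H}_K}^{2},
\]

and assigns to \(t\) the label \(\operatorname{sign} f(t)\). In both problems as written here, the representer theorem gives a minimizer of the form \(f(t)=\sum_{i=1}^{n}c_i K(t,t_i)\), so the search over \(\mathcal{H}_K\) reduces to a finite-dimensional problem in the coefficients \(c_1,\dots,c_n\). Many formulations also add an unpenalized constant term to \(f\); the multicategory versions discussed in the article are not reproduced in this sketch.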

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
46N30 Applications of functional analysis in probability theory and statistics

Software:

gss
