New concentration inequalities in product spaces.

*(English)*Zbl 0893.60001We consider that this paper is particularly rich in results, methods, comments explaining the results, outlines of history, accompanied by quotation of expository papers. In order not to increase the length of the review (author’s abstract would be a too short one), we present only the results entitled as theorems.

Let \((\Omega,\mu)\) be a probability space, \(P= \mu\otimes^n A\), \(A\subset\Omega^n\), and \(\nu\) be probabilities on \(\Omega^n\). If \(x= (x_i)\), \(y=(y_i)\in\Omega^n\), we define \(h_i(x, y)\) as 1 for \(x_i\neq y_i\) and 0 for \(x_i= y_i\). The first result (1.1) is \(\int e(A)dP\leq 1/P(A)\), where \(e(A, x)= \inf_{\nu(A)= 1}e(\nu, x)\), \(e(\nu, x)= \int c(y_1,y_2)d\nu(y_1) d\nu(y_2)\), \(c(y_1, y_2)= (5/4)^s\), \(s= \sum_i h_i(x, y_1) h_i(x, y_2)\). This result is deduced in Section 2, for \(\beta= 1/2\), from (2.1) stating the same inequality with the right member at the power \(2\beta^2/(1- 2\beta^2)\), \(\beta^2< 1/2\), where now \(e(\nu, x)= \int(\int \prod_i(1+ \beta h_i(x, y)\varepsilon_i) d\nu(y))^2 dP(\varepsilon)\), \(\Omega= \{\pm 1\}\) and \(\mu\) is the “uniform” one.

The second result (1.2), with \(\Omega\), \(\mu\) as in (2.1), consists in the existence of a universal constant \(K\) such that \(P(| Z-M|\geq t)\leq 2\exp(-\min(t^2/V^2,t/U)/K)\), where \(M\) is a median of \(Z\), \(Z(\varepsilon)= \|\sum_{i,j} x_{ij}\varepsilon_i\varepsilon_j\|\), \(x_{ij}= x_{ji}\) are elements of a Banach space \(W\), \(x_{ii}= 0\), \(U= \sup \sum_{i,j}\alpha_i \gamma_jx^*(x_{ij})\) over all \(\alpha\), \(\gamma\) with \(\|\cdot\|_2\leq 1\) and \(x^*\in W^*\) with \(\| x^*\|\leq 1\), \(V= E(\sup_{x^*} (\sum_j(\sum_i \varepsilon_ix^*(x_{ij}))^2))\).

In Theorem (1.3) we have \(n^2\) instead of \(n\), \(\Omega= \{0,1\}\), \(\mu\) is also uniform and \(x\), \(y\) are considered as \(n\times n\) matrices, \(d(x,y)\) being the operator norm of \(x-y\) when acting on \(\ell^2_n\). It states that \(P(d(A)\geq K_1 n^{1/4}(\log n)^{5/4})\leq 1/n^2\), for \(P(A)\geq 1/2\). In its proof (3.1) is used, establishing that for some \(L\), for every \(A\subset\{0,1\}^n\), \(x\in\{0, 1\}^n\) there exists a probability \(\nu\) on \(A\) and a \(p(x)\) such that for all \(\alpha_i\leq 1\), we have \[ \int\exp\Biggl(\Biggl(\sum_i \alpha_i h_i(x, y)\Biggr) \Biggl/ L\Biggr)d\nu(y)\leq \exp(\|\alpha\|_2(p(x)+ \log^{1/2}(en))) \] and also \(\int\exp(p(x)^2)dP(x)\leq 1/P(A)\).

(1.4) says that \[ P(| Z-EZ|\geq t)\leq K\exp(-t(KU)^{-1}\log(1+ tUV^{-1})), \] where \(Z(x)= \sup_k\sum_i f_k(x_i)\), \(f_k\), \(k=1,2,\dots\), are measurable on \(\Omega\), \(U= \sup_k\| f_k\|_\infty\), \(V= E_\mu(\sup_k \sum_i f(x_i)^2)\). It uses (4.2): \(\int\exp(m(A)/L)dP\leq 1/P(A)\), where \(\Omega\) is finite, \(m(A,x)= \inf_{\nu(A)= 1}m(\nu, x)\), \(m(\nu, x)= \sum_i\psi(d_i)d\mu\), \(\psi(x)= \tau x^2\), \(\tau= (\log 2)/2\) for \(x\leq 2\), \(\psi(x)= x\log x\) for \(x\geq 2\), \(d_i= d\nu_i/d\mu\), \(\nu_i\) is the image by \(y\to y_i\) of the restriction of \(\nu\) to \(\{y; y_i\neq x_i\}\).

(5.1): \(\int G_q(A)dP\leq 1/P(A)^q\), where \(G_q(A, x)= \inf_{\nu(A)= 1}G(\nu,\dots, \nu,x)\), \(\nu\) atomic, \[ G(\nu_1,\dots, \nu_q,x)= \int(a(q)+ 1)^{U(y_1,\dots,y_q;x)}d\nu_1(y_1)\cdots d\nu_q(y_q), \] \(a(q)= (1+ qt_q)^q/(1+ q)^{q-1}\) and \(t_q\) is the largest root of \((1-t)(1+ tq)^q= (1+ t(q+1))^{q-1}\).

(5.4): \(\int\exp(\tau V(A_1,\dots, A_q)/q)dP\leq \prod_r P(A_r)^{-1/q}\),

\(V(A_1,\dots, A_q; x)= \inf_{y_r\in A_r}V(y_1,\dots, y_q;\;x)\), \(V(y_1,\dots, y_q; x)= \text{card}\{i; \sum_r h_i(x,y_r)\geq 2\}\). Also (6.2): in (2.1) one may choose \(\beta\) such that \(\int (e(A)- 1)dP\leq (1- P(A))P(A)^{-1}(\log(e/(1- P(A))))^{-1}\) for all \(n\) and \(A\).

Let \((\Omega,\mu)\) be a probability space, \(P= \mu\otimes^n A\), \(A\subset\Omega^n\), and \(\nu\) be probabilities on \(\Omega^n\). If \(x= (x_i)\), \(y=(y_i)\in\Omega^n\), we define \(h_i(x, y)\) as 1 for \(x_i\neq y_i\) and 0 for \(x_i= y_i\). The first result (1.1) is \(\int e(A)dP\leq 1/P(A)\), where \(e(A, x)= \inf_{\nu(A)= 1}e(\nu, x)\), \(e(\nu, x)= \int c(y_1,y_2)d\nu(y_1) d\nu(y_2)\), \(c(y_1, y_2)= (5/4)^s\), \(s= \sum_i h_i(x, y_1) h_i(x, y_2)\). This result is deduced in Section 2, for \(\beta= 1/2\), from (2.1) stating the same inequality with the right member at the power \(2\beta^2/(1- 2\beta^2)\), \(\beta^2< 1/2\), where now \(e(\nu, x)= \int(\int \prod_i(1+ \beta h_i(x, y)\varepsilon_i) d\nu(y))^2 dP(\varepsilon)\), \(\Omega= \{\pm 1\}\) and \(\mu\) is the “uniform” one.

The second result (1.2), with \(\Omega\), \(\mu\) as in (2.1), consists in the existence of a universal constant \(K\) such that \(P(| Z-M|\geq t)\leq 2\exp(-\min(t^2/V^2,t/U)/K)\), where \(M\) is a median of \(Z\), \(Z(\varepsilon)= \|\sum_{i,j} x_{ij}\varepsilon_i\varepsilon_j\|\), \(x_{ij}= x_{ji}\) are elements of a Banach space \(W\), \(x_{ii}= 0\), \(U= \sup \sum_{i,j}\alpha_i \gamma_jx^*(x_{ij})\) over all \(\alpha\), \(\gamma\) with \(\|\cdot\|_2\leq 1\) and \(x^*\in W^*\) with \(\| x^*\|\leq 1\), \(V= E(\sup_{x^*} (\sum_j(\sum_i \varepsilon_ix^*(x_{ij}))^2))\).

In Theorem (1.3) we have \(n^2\) instead of \(n\), \(\Omega= \{0,1\}\), \(\mu\) is also uniform and \(x\), \(y\) are considered as \(n\times n\) matrices, \(d(x,y)\) being the operator norm of \(x-y\) when acting on \(\ell^2_n\). It states that \(P(d(A)\geq K_1 n^{1/4}(\log n)^{5/4})\leq 1/n^2\), for \(P(A)\geq 1/2\). In its proof (3.1) is used, establishing that for some \(L\), for every \(A\subset\{0,1\}^n\), \(x\in\{0, 1\}^n\) there exists a probability \(\nu\) on \(A\) and a \(p(x)\) such that for all \(\alpha_i\leq 1\), we have \[ \int\exp\Biggl(\Biggl(\sum_i \alpha_i h_i(x, y)\Biggr) \Biggl/ L\Biggr)d\nu(y)\leq \exp(\|\alpha\|_2(p(x)+ \log^{1/2}(en))) \] and also \(\int\exp(p(x)^2)dP(x)\leq 1/P(A)\).

(1.4) says that \[ P(| Z-EZ|\geq t)\leq K\exp(-t(KU)^{-1}\log(1+ tUV^{-1})), \] where \(Z(x)= \sup_k\sum_i f_k(x_i)\), \(f_k\), \(k=1,2,\dots\), are measurable on \(\Omega\), \(U= \sup_k\| f_k\|_\infty\), \(V= E_\mu(\sup_k \sum_i f(x_i)^2)\). It uses (4.2): \(\int\exp(m(A)/L)dP\leq 1/P(A)\), where \(\Omega\) is finite, \(m(A,x)= \inf_{\nu(A)= 1}m(\nu, x)\), \(m(\nu, x)= \sum_i\psi(d_i)d\mu\), \(\psi(x)= \tau x^2\), \(\tau= (\log 2)/2\) for \(x\leq 2\), \(\psi(x)= x\log x\) for \(x\geq 2\), \(d_i= d\nu_i/d\mu\), \(\nu_i\) is the image by \(y\to y_i\) of the restriction of \(\nu\) to \(\{y; y_i\neq x_i\}\).

(5.1): \(\int G_q(A)dP\leq 1/P(A)^q\), where \(G_q(A, x)= \inf_{\nu(A)= 1}G(\nu,\dots, \nu,x)\), \(\nu\) atomic, \[ G(\nu_1,\dots, \nu_q,x)= \int(a(q)+ 1)^{U(y_1,\dots,y_q;x)}d\nu_1(y_1)\cdots d\nu_q(y_q), \] \(a(q)= (1+ qt_q)^q/(1+ q)^{q-1}\) and \(t_q\) is the largest root of \((1-t)(1+ tq)^q= (1+ t(q+1))^{q-1}\).

(5.4): \(\int\exp(\tau V(A_1,\dots, A_q)/q)dP\leq \prod_r P(A_r)^{-1/q}\),

\(V(A_1,\dots, A_q; x)= \inf_{y_r\in A_r}V(y_1,\dots, y_q;\;x)\), \(V(y_1,\dots, y_q; x)= \text{card}\{i; \sum_r h_i(x,y_r)\geq 2\}\). Also (6.2): in (2.1) one may choose \(\beta\) such that \(\int (e(A)- 1)dP\leq (1- P(A))P(A)^{-1}(\log(e/(1- P(A))))^{-1}\) for all \(n\) and \(A\).

Reviewer: I.Cuculescu (Bucureşti)