# zbMATH — the first resource for mathematics

Online learning with Markov sampling. (English) Zbl 1170.68022

Summary: This paper attempts to extend learning theory to a setting where the i.i.d. assumption on the data is weakened: independence is kept, but the requirement of identical distribution is dropped. We hypothesize that a sequence of examples $(x_t, y_t)$ in $X \times Y$, for $t = 1, 2, 3, \dots$, is drawn from a probability distribution $\rho_t$ on $X \times Y$.

The marginal probabilities on $X$ are supposed to converge to a limit probability on $X$. Two main examples for this time process are discussed. The first is a stochastic one which in the special case of a finite space $X$ is defined by a stochastic matrix and more generally by a stochastic kernel. The second is determined by an underlying discrete dynamical system on the space $X$. Our theoretical treatment requires that this dynamics be hyperbolic (or “Axiom A”) which still permits a class of chaotic systems (with Sinai-Ruelle-Bowen attractors). Even in the case of a limit Dirac point probability, one needs the measure theory to be defined using Hölder spaces.
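As a rough illustration of the finite-space case only, the sketch below iterates a row-stochastic matrix and tracks how the marginal distribution on $X$ approaches a limit. The two-state matrix `P`, the starting distribution `rho0`, and the helper names are hypothetical choices made for this example and are not taken from the paper; the convergence shown is the standard behaviour of an irreducible, aperiodic finite Markov chain.

```python
def step(dist, P):
    """Push a distribution one step forward: (dist P)_j = sum_i dist_i * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def marginals(dist0, P, steps):
    """Return the list of marginals rho_0, rho_1, ..., rho_steps."""
    out = [list(dist0)]
    for _ in range(steps):
        out.append(step(out[-1], P))
    return out


def total_variation(p, q):
    """Total variation distance between two distributions on a finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical two-state stochastic matrix (each row sums to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]

# Start from a point mass on the first state (an assumption for the demo).
rho0 = [1.0, 0.0]

# Stationary distribution of P, found by solving pi P = pi by hand.
limit = [5 / 6, 1 / 6]

history = marginals(rho0, P, 50)
distances = [total_variation(rho, limit) for rho in history]

for t in (0, 1, 5, 10, 50):
    print(t, history[t], distances[t])
```

In this toy chain the distance to the limit shrinks geometrically, at a rate set by the second eigenvalue of `P`. The paper's setting is more general (stochastic kernels and hyperbolic dynamical systems), which this sketch does not attempt to cover.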

Many implications of our work remain unexplored. These include, for example, the relation to Hidden Markov Models, as well as Markov Chain Monte Carlo methods. It seems reasonable that further work should consider the push forward of the process from $X \times Y$ by some kind of observable function to a data space.

##### MSC:
| Code | Description |
|---|---|
| 68Q32 | Computational learning theory |
| 37D15 | Morse-Smale systems |
| 41A25 | Rate of convergence, degree of approximation |
| 60B11 | Probability theory on linear topological spaces |
| 60J10 | Markov chains (discrete-time Markov processes on discrete state spaces) |

##### Keywords:
reproducing kernel Hilbert space