A fast learning algorithm for deep belief nets. (English) Zbl 1106.68094

Summary: We show how to use “complementary priors” to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
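The layer-at-a-time learning described in the summary can be illustrated with a minimal sketch. The summary does not give the update rule, so this sketch assumes that each layer is trained as a binary restricted Boltzmann machine with one-step contrastive divergence, and that each trained layer's hidden probabilities serve as data for the next layer. The names `RBM`, `cd1_update` and `train_greedy` are hypothetical, and so are the learning rate and epoch count. The top-level associative memory with labels and the contrastive wake-sleep fine-tuning stage are not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    # Draw independent binary states from Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


class RBM:
    """Tiny binary restricted Boltzmann machine (hypothetical helper)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases (an assumed initialization).
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden probabilities driven by the data vector.
        h0 = self.hidden_probs(v0)
        # Negative phase: one step of alternating Gibbs sampling.
        # The reconstruction uses probabilities rather than sampled states,
        # which is a simplification assumed here.
        v1 = self.visible_probs(sample(h0, self.rng))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_greedy(data, layer_sizes, epochs=20, lr=0.1, seed=0):
    """Train a stack of RBMs one layer at a time, bottom up."""
    rng = random.Random(seed)
    stack, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        stack.append(rbm)
        # Freeze this layer; its hidden probabilities feed the next layer.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return stack
```

Calling `train_greedy(data, [3, 2])` on a list of binary vectors returns two trained layers, the second fitted to the hidden representation produced by the first. Pure-Python lists keep the sketch dependency-free; a practical version would use array operations.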


68T05 Learning and adaptive systems in artificial intelligence
68W05 Nonnumerical algorithms



