Tuesday, December 29 2009 @ 00:00 +0100
In the previous part we went through a trivial example of a backprop network. I said before that the main focus is on Boltzmann Machines, so let's kill the suspense here and now by cutting straight to the heart of the matter.
Cottrell's Science article provides a clear, easy-to-follow description of the spiral problem that we are going to implement. The executive summary is that we want to train an auto-encoder: a network that reproduces its input as output, with a small encoding layer somewhere in between. By forcing the information through the bottleneck of the encoding layer, the network should pick up a low-dimensional code that represents the input, thus performing dimensionality reduction.
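To make the bottleneck idea concrete, here is one way to write the training objective. The symbols are my own shorthand rather than notation from the article: v is an input vector, enc and dec are the two halves of the network on either side of the encoding layer, and a squared-error loss is assumed.

```latex
% Hypothetical notation: v_i are the N training inputs, each of dimension n.
% The code enc(v) has dimension k < n, which is what forces compression.
\hat{v} = \mathrm{dec}(\mathrm{enc}(v)), \qquad
\mathrm{enc} : \mathbb{R}^n \to \mathbb{R}^k, \quad
\mathrm{dec} : \mathbb{R}^k \to \mathbb{R}^n, \quad k < n

E = \frac{1}{N} \sum_{i=1}^{N} \left\lVert v_i - \mathrm{dec}(\mathrm{enc}(v_i)) \right\rVert^2
```

Training means adjusting the weights inside enc and dec to make E small; the code enc(v) is then the reduced representation.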
The function under consideration is f(x) = [x, sin(x), cos(x)]. It
is surprisingly difficult to learn the mapping from x to f(x). A
network architecture that is able to represent this transformation has
3 inputs, 10 neurons in the next layer, 1 neuron in the encoding
layer, 10 neurons again in the reconstruction part and 3 in the output
layer. However, backpropagation from randomly initialized weights fails
to learn this; a better solution is to first train a Deep Belief
Network, "unroll" it into a backprop network and use backprop to
fine-tune the weights.
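
To get a feel for the shapes involved, here is a minimal sketch in plain Python of the data and of a forward pass through the 3-10-1-10-3 network. It is only an illustration, not the implementation this series builds. Several things in it are my assumptions and not taken from the article: the range of x, tanh on the 10-unit layers with linear code and output units, the small Gaussian initial weights, and all the helper names. No training happens here.

```python
import math
import random

# Layer widths taken from the text: 3 inputs, 10 hidden, 1 code unit,
# 10 hidden, 3 outputs.
LAYER_SIZES = [3, 10, 1, 10, 3]


def f(x):
    """The function from the text: f(x) = [x, sin(x), cos(x)]."""
    return [x, math.sin(x), math.cos(x)]


def identity(z):
    return z


# Assumption: tanh on the two 10-unit layers, linear code and output units.
ACTIVATIONS = [math.tanh, identity, math.tanh, identity]


def init_layers(sizes, rng, scale=0.1):
    """One (weights, biases) pair per pair of adjacent layers.

    Assumption: small Gaussian weights and zero biases.
    """
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, inputs):
    """Return the activations of every layer, the input layer included."""
    activations = [list(inputs)]
    for (weights, biases), act in zip(layers, ACTIVATIONS):
        prev = activations[-1]
        activations.append([
            act(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(weights, biases)
        ])
    return activations


def reconstruction_error(layers, samples):
    """Squared error summed over the 3 outputs, averaged over samples."""
    total = 0.0
    for sample in samples:
        output = forward(layers, sample)[-1]
        total += sum((o - s) ** 2 for o, s in zip(output, sample))
    return total / len(samples)


rng = random.Random(0)
layers = init_layers(LAYER_SIZES, rng)
# Assumption: 100 evenly spaced x values in [0, 4*pi]; the text gives no range.
samples = [f(4 * math.pi * i / 99) for i in range(100)]
print([len(a) for a in forward(layers, samples[0])])  # [3, 10, 1, 10, 3]
print(reconstruction_error(layers, samples))  # error of the untrained network
```

The second number printed is the quantity that training has to drive down. Everything the network knows about a point on the spiral has to pass through the single code unit in the middle.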