Update: This post is out of date with regard to current MGL. Please refer to the documentation instead.

This is going to be the start of an introductory series on the MGL Common Lisp machine learning library. MGL focuses mainly on Boltzmann Machines (BMs). In fact, the few seemingly unrelated things it currently offers (gradient descent, conjugate gradient, backprop) are directly needed to implement the learning and fine-tuning methods for different kinds of BMs. But before venturing too far into specifics, here is a quick glimpse at the bigger picture and the motivations.

Most current learning algorithms are based on shallow architectures: they are fundamentally incapable of basing higher-level concepts on other, learned concepts. The most prominent example of a successful shallow learner is the Support Vector Machine, for which there is a simple CL wrapper around libsvm, but that's a story for another day.

On the other hand, deep learners are theoretically capable of building abstraction on top of abstraction, the main hurdle in front of their acceptance being that they don't exist or - more precisely - we don't know how to train them.

A good example of a deep learner is the multi-layer perceptron. With only three layers it is a universal approximator, which is not a particularly difficult achievement, and the practical implications of this result are not earth-shattering: the number of required training examples and hidden units can be very high, and generalization can be bad.

Deep architectures mimic the layered organization of the brain and, in theory, have better abstraction and generalization capability and higher encoding efficiency. Of course, these qualities are strongly related. While this has been known or suspected for a long time, it was only recently that training deep architectures started to become feasible.

Among deep learners, Boltzmann machines deserve special attention, as they have demonstrated very good performance on a number of problems and have a biologically plausible, local, Hebbian learning rule.
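
To make "local" and "Hebbian" a bit more concrete, here is the textbook form of the Boltzmann machine weight update. This is a sketch in notation introduced just for this note, not something taken from MGL: $w_{ij}$ is the weight between units $i$ and $j$, $s_i$ and $s_j$ are their states, and $\varepsilon$ is a learning rate.

```latex
\Delta w_{ij} = \varepsilon \left( \langle s_i s_j \rangle_{\text{data}} - \langle s_i s_j \rangle_{\text{model}} \right)
```

The first average is taken with the visible units clamped to training examples, the second with the network running freely. The update for a weight depends only on how often the two units it connects are active together, which is what makes the rule local and Hebbian.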

Now that you are sufficiently motivated, stick around and in the later parts of this series we are going to see real examples.