On the Design of Matrix Libraries

Tags: AI, Lisp, Blog

UPDATE 2020-05-03 -- Things have been moving fast. This is a non-issue in TensorFlow and possibly in other frameworks as well.

2015-02-26 -- I believe there is one design decision in MGL-MAT with far-reaching consequences: a single matrix object can store multiple representations of the same data, and operations decide which representation to use based on what is most convenient or efficient, without having to know about all the possible representations.

This allows existing code to keep functioning if support for diagonal matrices (represented as 1d arrays) lands, and one can pick and choose the operations performance-critical enough to implement with diagonals.
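The idea can be sketched in a few lines (hypothetical names, not MGL-MAT's actual API): a matrix object caches several representations of the same data, and an operation asks for whichever one it can work with, triggering conversion only on demand.

```lisp
;; A minimal sketch of the design, not MGL-MAT's actual API: a matrix
;; caches multiple representations keyed by a tag such as :DENSE or
;; :DIAGONAL, and operations request the one they prefer.
(defstruct (mat (:constructor %make-mat))
  ;; Maps a representation tag to the data in that representation.
  (representations (make-hash-table)))

(defun make-dense-mat (array)
  (let ((mat (%make-mat)))
    (setf (gethash :dense (mat-representations mat)) array)
    mat))

(defun representation (mat tag)
  "Return MAT's data in representation TAG, converting and caching it
if it's not already available."
  (or (gethash tag (mat-representations mat))
      (setf (gethash tag (mat-representations mat))
            (convert-representation mat tag))))

(defun convert-representation (mat tag)
  ;; Only dense -> diagonal is sketched here.
  (ecase tag
    (:diagonal
     (let* ((dense (gethash :dense (mat-representations mat)))
            (n (array-dimension dense 0))
            (diag (make-array n)))
       (dotimes (i n diag)
         (setf (aref diag i) (aref dense i i)))))))

;; An operation picks the representation that is cheapest for it,
;; without the caller having to know which ones exist:
(defun trace-of (mat)
  (reduce #'+ (representation mat :diagonal)))
```

The point is that `trace-of` works unchanged whether the matrix was created dense or diagonal; only the conversion table needs to know about new representations.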

... read the rest of On the Design of Matrix Libraries.

Recurrent Nets

Tags: AI, Lisp, Blog

2015-01-19 -- I've been cleaning up and documenting MGL for quite some time now, and while it's nowhere near done, a good portion of the code has been overhauled in the process. There are new additions such as the Adam optimizer and Recurrent Neural Nets. My efforts went mainly into the backprop stuff, and I think the definition of feed-forward nets:

(build-fnn (:class 'digit-fnn)
  (input (->input :size *n-inputs*))
  (hidden-activation (->activation input :size n-hiddens))
  (hidden (->relu hidden-activation))
  (output-activation (->activation hidden :size *n-outputs*))
  (output (->softmax-xe-loss :x output-activation)))

and recurrent nets:

(build-rnn ()
  (build-fnn (:class 'sum-sign-fnn)
    (input (->input :size 1))
    (h (->lstm input :size n-hiddens))
    (prediction (->softmax-xe-loss
                 (->activation h :name 'prediction :size *n-outputs*)))))

is fairly straightforward already. There is still much code that needs to accompany such a network definition, mostly having to do with how to give inputs and prediction targets to the network, and also with monitoring training. See the full examples for feed-forward and recurrent nets in the documentation.

Higgs Boson Machine Learning Challenge Bits and Pieces

Tags: AI, Lisp, Blog

2014-09-23 -- The Higgs Boson contest on Kaggle has ended. Sticking to my word at ELS 2014, I released some code that came about during these long four months.

MGL-GPR is no longer a Genetic-Programming-only library, because it got another evolutionary algorithm implementation: Differential Evolution. My original plan for this contest was to breed input features that the physicists, in their insistence on comprehensibility, overlooked, but it didn't work as well as I had hoped, for reasons specific to this contest and also because evolutionary algorithms simply do not scale to larger problem sizes.

In other news, MGL got cross-validation, bagging and stratification support in the brand new MGL-RESAMPLE package, documented with MGL-PAX, which all of you will most definitely want to use. My winning submission used bagged cross-validated dropout neural networks with stratified splits, so this is where it's coming from.
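To illustrate what stratification buys you, here is a toy stratified splitter (a hypothetical helper, not MGL-RESAMPLE's actual API): examples are grouped by class and then dealt round-robin across the folds, so each fold ends up with roughly the same class proportions.

```lisp
;; A sketch of stratified splitting, not MGL-RESAMPLE's actual API:
;; partition EXAMPLES into N-FOLDS lists such that each fold has
;; roughly the same class distribution.  KEY extracts the class.
(defun stratified-folds (examples n-folds &key (key #'identity))
  (let ((strata (make-hash-table :test #'equal))
        (folds (make-array n-folds :initial-element ())))
    ;; Group the examples by class.
    (dolist (example examples)
      (push example (gethash (funcall key example) strata)))
    ;; Deal each stratum round-robin across the folds.
    (let ((i 0))
      (maphash (lambda (class stratum)
                 (declare (ignore class))
                 (dolist (example stratum)
                   (push example (aref folds (mod i n-folds)))
                   (incf i)))
               strata))
    (coerce folds 'list)))
```

With heavily unbalanced classes, a plain random split can leave a fold with almost no positives; dealing per stratum like this makes the out-of-fold estimates much less noisy.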

MGL itself and MGL-MAT were updated to work with the latest CL-CUDA. The neural network code also saw some additions such as ->MAX-CHANNEL activation (which originated as LWTA) and also gaussian multiplicative noise. The next steps here are further cleanups to MGL, writing documentation and moving it to github. Also, there is some hope that one day CL-CUDA can be included in quicklisp allowing my stuff there to be updated to their latest versions.

The code for this contest is available and from now on doubles as my skeleton for lisp projects that need to be delivered as source and as binary. It sucks in all dependencies from quicklisp available at a certain date, clones the necessary repositories not available in quicklisp, builds an executable, and has a simple 'make dist' rule as well.

There is also a fairly generic ensembling algorithm that I will factor out of the code later.

Higgs Boson Machine Learning Challenge Post-Mortem

Tags: AI, Lisp, Blog

2014-09-23 -- Actually, I'll only link to the post-mortem I wrote in the forum. There is also a model description included in the git repo. A stand-alone distribution with all library dependencies and an x86-64 linux precompiled binary is also available.

This has been the Kaggle competition that attracted the most contestants so it feels really good to come out on top even though there was an element of luck involved due to the choice of evaluation metric and the amount of data available. The organizers did a great job explaining the physics, why there is no more data, motivating the choice of evaluation metric, and being prompt in communication in general.

I hope that the HEP guys will find this useful in their search for more evidence of tau tau decay of the Higgs boson. Note that I didn't go for the 'HEP meets ML Award' so training time is unnecessarily high (one day with a GTX Titan GPU). By switching to single precision floating point and a single neural network, training time could be reduced to about 15 minutes with an expected drop in accuracy from 3.805 to about 3.750. Even with the bagging approach the code logs out-of-bag estimates of the evaluation metric after training each constituent model and the training process can be C-c'ed early. Furthermore, the model can be run on a CPU with BLAS about 10 times slower than on a Titan.

Liblinear Support Added to cl-libsvm

Tags: AI, Lisp, Blog

2013-04-09 -- In addition to the cl-libsvm asdf system, there is now another asdf system in the cl-libsvm library: cl-liblinear that, predictably enough, is a wrapper for liblinear. The API is similar to that of cl-libsvm.

Stackoverflow Post-Mortem

Tags: AI, Lisp, Blog

2013-04-09 -- After almost two years without a single competition, last September I decided to enter the Stackoverflow contest on Kaggle. It was a straightforward text classification problem with extremely unbalanced classes.

Just as Bocsimackó did the last time around, his lazier sidekick Malacka (on the right) brought success. I would have loved to be lazy and still win, but the leaderboard was too close for comfort.

... read the rest of Stackoverflow Post-Mortem.


Alpha-beta
Tags: AI, Lisp, Blog

2010-12-27 -- It hasn't been a year yet since I first promised that alpha-beta snippet, and it is already added to micmac in all its 35-line glory. The good thing about not rushing it out the door is that it saw a bit more use. For a tutorialish tic-tac-toe example, see test/test-game-theory.lisp.

The logging code in the example produces output suitable for cutting and pasting into an org-mode buffer and exploring by TABbing into subtrees to answer the perpetual 'What the hell was it thinking?!' question.

Nash equilibrium finder

Tags: AI, Lisp, Blog

2010-12-26 -- While I seem to be unable to make up my mind on a good interface to alpha-beta with a few bells and whistles, I added a Nash equilibrium finder to Micmac, which is thus becoming less statistics-oriented. This was one of the many things in Planet Wars that never really made it.

Let's consider the Matching pennies game. The row player wins iff the two pennies show the same side. The payoff matrix is:

|       | Heads | Tails |
| Heads |     1 |    -1 |
| Tails |    -1 |     1 |

Find the mixed strategy equilibrium:

(find-nash-equilibrium '((-1 1) (1 -1)))
#(49 51)
#(50 50)

That is, both players should choose heads 50% of the time, and the expected payoff (for the row player) is zero, of which -0.01 is an approximation:

(find-nash-equilibrium '((-1 1) (1 -1)) :n-iterations 1000)
#(499 501)
#(500 500)
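Count vectors like the ones above are characteristic of fictitious play, a classic way to build such a finder (whether Micmac's FIND-NASH-EQUILIBRIUM works exactly this way is for the source to say; the sketch below takes the row player's payoff matrix directly, so matching pennies is ((1 -1) (-1 1))). In each round both players best-respond to the opponent's empirical mixture of past plays; for zero-sum games the accumulated counts converge to an equilibrium.

```lisp
;; A sketch of a Nash equilibrium finder via fictitious play for
;; two-player zero-sum games.  Micmac's actual implementation may
;; differ.
(defun best-response (payoffs opponent-counts)
  "Index of the strategy in PAYOFFS (a list of payoff lists)
maximizing expected payoff against the OPPONENT-COUNTS mixture."
  (let ((best nil) (best-value nil))
    (loop for i from 0
          for row in payoffs
          for value = (loop for p in row
                            for c across opponent-counts
                            sum (* p c))
          when (or (null best) (> value best-value))
            do (setf best i best-value value))
    best))

(defun find-equilibrium (row-payoffs &key (n-iterations 100))
  "Approximate a mixed equilibrium of a zero-sum game given the row
player's payoff matrix as a list of lists.  Returns two vectors of
play counts, one per player."
  (let* ((n-cols (length (first row-payoffs)))
         ;; The column player's payoffs are the negated transpose.
         (col-payoffs (loop for j below n-cols
                            collect (loop for row in row-payoffs
                                          collect (- (elt row j)))))
         (row-counts (make-array (length row-payoffs)
                                 :initial-element 0))
         (col-counts (make-array n-cols :initial-element 0)))
    (dotimes (i n-iterations (values row-counts col-counts))
      ;; Each player best-responds to the other's empirical mixture.
      (incf (aref row-counts (best-response row-payoffs col-counts)))
      (incf (aref col-counts (best-response col-payoffs row-counts))))))
```

Running it on matching pennies yields counts near 50/50 for both players, matching the output above up to tie-breaking noise.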

Planet Wars Post-Mortem

Tags: AI, Lisp, Blog

2010-12-01 -- I can't believe I won.

I can't believe I won decisively at all.

The lead in the last month or so was an indicator of having good chances, but there was a huge shuffling of ranks in the last week and some last minute casualties.


Note that the git repository is available at (gitweb).


I had promised myself not to enter this one and resisted for about two weeks when my defenses were worn away and I was drawn into the fray.

The game didn't look very exciting at first. I thought that the bots would soon reach a point of near perfect tactics and the rock-paper-scissors scenarios would dominate (more on this later).

That's enough of a tribute; let's steer off the trodden path.

... read the rest of Planet Wars Post-Mortem.

Important Update to the Planet Wars Starter Package

Tags: AI, Lisp, Blog

2010-10-25 -- First, is it possible to get something as simple as RESOLVE-BATTLE wrong? Apparently, yes. That's what one gets for trying to port Python code that's pretty foreign in the sense of being far from the way I'd write it.
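For reference, the battle rule itself is simple, which is what makes getting it wrong embarrassing. Here is a sketch of the rule as I understand it from the contest's specification (names and representation are mine, not the starter package's): the largest force wins and keeps the difference between the largest and second-largest forces; on a tie the planet keeps its previous owner with zero ships.

```lisp
;; Sketch of Planet Wars battle resolution.  FORCES is a list of
;; (OWNER . N-SHIPS) conses for all ships meeting at a planet,
;; including the garrison.  Returns (OWNER . N-SHIPS) after the
;; battle.  Hypothetical representation, not the starter package's.
(defun resolve-battle-sketch (forces previous-owner)
  (let* ((sorted (sort (copy-list forces) #'> :key #'cdr))
         (largest (first sorted))
         (second-largest (second sorted)))
    (cond ;; Only one force present: nothing to resolve.
          ((null second-largest) largest)
          ;; The two largest forces tie: the planet keeps its
          ;; previous owner, with zero ships.
          ((= (cdr largest) (cdr second-largest))
           (cons previous-owner 0))
          ;; Otherwise the largest force wins with the difference.
          (t (cons (car largest)
                   (- (cdr largest) (cdr second-largest)))))))
```

Even at this size, the tie case and the "winner keeps the difference" rule are easy to flip when transliterating from someone else's Python.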

More importantly, I found out the hard way that sbcl 1.0.11 that's still on the official servers has a number of bugs in its timer implementation making WITH-TIMEOUT unreliable. Also, it can trigger timeouts recursively eventually exceeding the maximum interrupt nesting depth. Well, "found out" is not the right way to put it as we did fix most of these bugs ages ago.

In the new starter package (v0.8 in git, latest tarball), you'll find timer.lisp that's simply backported almost verbatim from sbcl 1.0.41 to sbcl 1.0.11. Seems to work for me, but I also had to lower the timeout to 0.8 from 0.98 because the main server is extremely slow.

The rate at which games are played on the servers is so low that it takes several days to ascend the leaderboard. Nevertheless, an old buggy version is sitting at the top right now. Mind you, introducing bugs is a great way to explore the solution space, and it's quite worrisome just how adept I am at this poor man's evolutionary programming. Most of the bugs have since been fixed while the ideas they brought to light remain, making the current version much stronger.

Planet Wars Common Lisp Starter Package Actually Works

Tags: AI, Lisp, Blog

2010-09-21 -- Released v0.6 (git, latest tarball). The way the server compiles lisp submissions was fixed and this revealed a problem where MyBot.lisp redirected *STANDARD-OUTPUT* to *ERROR-OUTPUT* causing the server to think compilation failed.

Planet Wars Common Lisp Starter Package

Tags: AI, Lisp, Blog

2010-09-19 -- The Google AI Challenge is back with a new game that's supposed to be much harder than Tron was this spring. The branching factor of the game tree is enormous, which means that straight minimax is out of the question this time around. Whether some cleverness can bring the game within reach of conventional algorithms remains to be seen.

Anyway, I'm adding yet another starter package (latest tarball) to the lot. It is based heavily on aerique's.

... read the rest of Planet Wars Common Lisp Starter Package.

Google AI Challenge 2010 Results

Tags: AI, Lisp, Blog

2010-03-01 -- For what has been a fun ride, the official results are now available. In the end, 11th out of 700 is not too bad and it's the highest ranking non-C++ entry by some margin.

I entered the contest a bit late with a rather specific approach in mind: UCT, an algorithm from the Monte Carlo tree search family. It has been rather successful in Go (and in Hex too, taking the crown from Six). So with UCT in mind, to serve as a baseline I implemented a quick minimax with a simple territory based evaluation function ... that everyone else in the competition seems to have invented independently. Trouble was looming because it was doing too well: with looking ahead only one move (not even considering moves of the opponent) it played a very nice positional game. That was the first sign that constructing a good evaluation function may not be as hard for Tron as it is for Go.

... read the rest of Google AI Challenge 2010 Results.

Google AI Challenge 2010

Tags: AI, Lisp, Blog

2010-02-11 -- Tron is a fun little game of boxing out the opponent and avoiding crashing into a wall first. The rules are simple, so the barrier to entry into this contest is low. Thanks to aerique, who made the Common Lisp starter pack, it took only a few hours to get a very bare-bones algorithm going. It's doing surprisingly well: it is number 23 on the leaderboard at the moment, with 43 wins, 2 losses and 9 draws.

Micmac Initial Release

Tags: AI, Lisp, Blog

2010-02-06 -- From a failed experiment today I salvaged Micmac, a statistical library wannabe, that for now only has Metropolis-Hastings MCMC and Metropolis Coupled MCMC implemented. The code doesn't weigh much but I think it gets the API right. In other news MGL v0.0.6 was released.

Deep Boltzmann Machine on MNIST

Tags: AI, Lisp, Blog

2010-01-18 -- Let me interrupt the flow of the MGL introduction series with a short report on what I learnt playing with Deep Boltzmann Machines. First, lots of thanks to Ruslan Salakhutdinov, then at the University of Toronto, now at MIT, for making the Matlab source code for the MNIST digit classification problem available.

The linked paper claims a record 99.05% classification accuracy on the permutation-invariant task (no prior knowledge of geometry). A previous approach trained a DBN in an unsupervised manner and fine-tuned it with backpropagation. Now there is one more step: turning the DBN into a DBM (Deep Boltzmann Machine) and tuning it further before handing the baton over to backprop. While in a DBN the constituent RBMs are trained one by one, the DBM is trained as a whole, which, in theory, allows it to reconcile bottom-up and top-down signals, i.e. what you see and what you think.

... read the rest of Deep Boltzmann Machine on MNIST.

Introduction to MGL (part 3)

Tags: AI, Lisp, Blog

UPDATE: This post is out of date with regard to current MGL. Please refer to the documentation instead.

2009-12-29 -- In Introduction to MGL (part 2), we went through a trivial example of a backprop network. I said before that the main focus is on Boltzmann Machines so let's kill the suspense here and now by cutting straight to the heart of the matter.

Cottrell's Science article provides a clear and easy to follow description of the spiral problem that we are going to implement. The executive summary is that we want to train an auto-encoder: a network that reproduces its input as output with a small encoding layer somewhere in between. By forcing the information through the bottleneck of the encoding layer the network should pick up a low dimensional code that represents the input, thus performing dimensionality reduction.

... read the rest of Introduction to MGL (part 3).

Introduction to MGL (part 2)

Tags: AI, Lisp, Blog

UPDATE: This post is out of date with regard to current MGL. Please refer to the documentation instead.

2009-12-17 -- After Introduction to MGL (part 1), today we are going to walk through a small example and touch on the main concepts related to learning within this library.

At the top of the food chain is the generic function TRAIN:

(defgeneric train (sampler trainer learner)
  (:documentation "Train LEARNER with TRAINER on the examples from
SAMPLER. Before that TRAINER is initialized for LEARNER with
INITIALIZE-TRAINER. Training continues until SAMPLER is finished."))
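To make the shape of this protocol concrete, here is a toy implementation of the same contract (hypothetical names, not MGL's actual classes): a sampler yields examples until it is finished, and a TRAIN-like driver feeds them to the trainer/learner pair.

```lisp
;; A toy rendition of the sampler/trainer/learner protocol, not
;; MGL's actual classes.
(defgeneric finishedp (sampler)
  (:documentation "True when SAMPLER has no more examples."))
(defgeneric sample (sampler)
  (:documentation "Return the next example from SAMPLER."))
(defgeneric train-one (trainer learner example)
  (:documentation "Update LEARNER on EXAMPLE using TRAINER."))

;; The simplest possible sampler: walk a list of examples once.
(defclass list-sampler ()
  ((examples :initarg :examples :accessor examples)))

(defmethod finishedp ((sampler list-sampler))
  (null (examples sampler)))

(defmethod sample ((sampler list-sampler))
  (pop (examples sampler)))

(defun toy-train (sampler trainer learner)
  "Drive TRAINER and LEARNER over the examples of SAMPLER until the
sampler is finished, mirroring the contract of TRAIN above."
  (loop until (finishedp sampler)
        do (train-one trainer learner (sample sampler))))
```

The division of labor is the point: the sampler decides what and how many examples to present, the trainer owns the optimization state, and the learner is the model being fit.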

... read the rest of Introduction to MGL (part 2).

Introduction to MGL (part 1)

Tags: AI, Lisp, Blog

UPDATE: This post is out of date with regard to current MGL. Please refer to the documentation instead.

2009-12-02 -- This is going to be the start of an introduction series on the MGL Common Lisp machine learning library. MGL focuses mainly on Boltzmann Machines (BMs). In fact, the few seemingly unrelated things it currently offers (gradient descent, conjugate gradient, backprop) are directly needed to implement the learning and fine tuning methods for different kinds of BMs. But before venturing too far into specifics, here is a quick glimpse at the bigger picture and the motivations.

... read the rest of Introduction to MGL (part 1).

Active Learning for cl-libsvm

Tags: AI, Lisp, Blog

2009-06-22 -- Along the lines of active learning with python & libsvm, I added support for calculating the distance of a point from the separating hyperplane to cl-libsvm. In binary classification there is only one SVM involved and one hyperplane. However, with N-class problems there is a binary SVM for each of the N*(N-1)/2 pairs of classes, and there are as many separating hyperplanes, something the linked python code fails to take into account. As per the libsvm FAQ, the distance is the absolute value of the decision value (see PREDICT-VALUES, a wrapper of svm_predict_values) divided by the norm of the normal vector of the separating hyperplane. PREDICT-VALUES and MODEL-W2S are sufficient to calculate it. Note that among the distributed binaries only the linux-x86 version has been recompiled with the necessary changes, but patched sources are also included for your recompiling pleasure.
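The arithmetic itself is one line. As a self-contained sketch, assuming the decision value and the squared norm of the pair's normal vector have already been obtained (e.g. from PREDICT-VALUES and MODEL-W2S):

```lisp
;; Distance of a point from a separating hyperplane, per the libsvm
;; FAQ: |decision value| / ||w||.  W2 is the squared euclidean norm
;; of the hyperplane's normal vector.
(defun distance-from-hyperplane (decision-value w2)
  (/ (abs decision-value) (sqrt w2)))
```

In the N-class case this has to be computed per pair of classes, with each pair's own decision value and norm.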

2008 Computer Games Olympiad

Tags: AI, Blog

2008-12-11 -- It seems that the competition has not been standing still (as opposed to Six) and this year marks the end of the golden era. Congratulations to both Wolve and MoHex who beat Six! Thanks to Ryan Hayward who, again, kindly registered Six for the Olympiad.

About the future, I don't really plan on resuming work on Hex in general (and Six in particular), although losing does irk me a bit.