Gábor Melis' blog

Global Compiler Policy

A quick note to library implementors: the effects of DECLAIM are permitted to persist after the containing file is compiled, and it is unkind to mutate your users' settings. Personally, I find DECLAIM too blunt and prefer to add declarations within functions, even going as far as introducing LOCALLY subforms just to have a place on which to hang declarations. But if you are really set on using DECLAIM, please wrap it like this:

 (eval-when (:compile-toplevel)
  (declaim (optimize speed)))

to ensure that the body is evaluated in the dynamic execution context of the compiler, which makes a practical difference on Allegro. It goes without saying that you don't want (PROCLAIM '(OPTIMIZE ...)) in a library, either.
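A minimal sketch of the LOCALLY approach mentioned above. SUM-FLOATS is a hypothetical function made up for illustration; the point is only that the OPTIMIZE declaration is scoped to one form instead of leaking into the user's global policy:

```lisp
;; Hypothetical example function, not taken from any library.
(defun sum-floats (vector)
  (declare (type (simple-array double-float (*)) vector))
  (let ((sum 0d0))
    (declare (type double-float sum))
    ;; LOCALLY is here only to give the OPTIMIZE declaration a
    ;; place to hang; it affects the loop and nothing else.
    (locally
        (declare (optimize speed))
      (dotimes (i (length vector))
        (incf sum (aref vector i))))
    sum))
```

Code outside the LOCALLY form, including the rest of the file and anything compiled afterwards, keeps whatever policy the user had set.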

Active Learning for cl-libsvm

Along the lines of active learning with Python & libsvm, I added support for calculating the distance of a point from the separating hyperplane to cl-libsvm. In binary classification there is only one SVM involved and one hyperplane. However, with N-class problems there is a binary SVM for each of the N*(N-1)/2 pairs of classes and there are as many separating hyperplanes, something the linked Python code fails to take into account. As per the libsvm FAQ, the distance is the absolute value of the decision value (see PREDICT-VALUES, a wrapper of svm_predict_values) divided by the norm of the normal vector of the separating hyperplane. PREDICT-VALUES and MODEL-W2S are sufficient to calculate it. Note that among the distributed binaries only the linux-x86 version has been recompiled with the necessary changes, but patched sources are also included for your recompiling pleasure.
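The arithmetic can be sketched as follows. This does not call cl-libsvm; it takes plain sequences as input. DISTANCES-FROM-HYPERPLANES is a hypothetical helper, and it assumes that the second argument holds the squared norms of the normal vectors (which is what the name MODEL-W2S suggests, but check the cl-libsvm sources), in the same order as the decision values:

```lisp
;; Hypothetical helper, not part of cl-libsvm.
;; DECISION-VALUES: one decision value per binary SVM, that is,
;;   N*(N-1)/2 values for an N-class problem.
;; W2S: ASSUMED to be the squared norms |w|^2 of the corresponding
;;   normal vectors, in the same order.
(defun distances-from-hyperplanes (decision-values w2s)
  "Return a vector of |decision value| / |w|, one per hyperplane."
  (map 'vector
       (lambda (value w2)
         (/ (abs value) (sqrt w2)))
       decision-values
       w2s))
```

For a binary problem both sequences have a single element. For active learning one would typically look at the smallest of the returned distances, since that hyperplane is the one the point is least certain about.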

Calling Convention Hacks

SBCL's calling convention is rather peculiar. Frames are allocated and mostly set up by the caller. The new frame starts with a pointer to the old frame, then comes the return address, an empty slot and the stack arguments (the first three are passed in registers on x86).

Software archeology aside, the only reason I can see for this scheme is that stack arguments are easier to manipulate when they come after the return address and old frame pointer part. In particular, tail calls with any number of arguments can be made without re[al]locating the frame.

The first step towards callee-allocated frames is swapping the return address and old fp slots. Asking an innocent question on #lisp accomplished most of the work, as Alastair Bridgewater had a patch for x86 against a 0.9ish version that does exactly this.

Read more

X86oid Pseudo Atomic

The relatively recent chit-chat about allocation and interrupts has had me looking at ways to speed up pseudo atomic in SBCL.

(defmacro pseudo-atomic (&rest forms)
  (with-unique-names (label)
    `(let ((,label (gen-label)))
       (inst or (make-ea :byte :disp (* 4 thread-pseudo-atomic-bits-slot))
             (fixnumize 1) :fs)
       ,@forms
       (inst xor (make-ea :byte :disp (* 4 thread-pseudo-atomic-bits-slot))
             (fixnumize 1) :fs)
       (inst jmp :z ,label)
       ;; if PAI was set, interrupts were disabled at the same
       ;; time using the process signal mask.
       (inst break pending-interrupt-trap)
       (emit-label ,label))))

EBP

Read more

Code alignment on x86

There has always been a lot of wiggling of SBCL boinkmarks results. It's easy to chalk this up to system load, but the same can be observed running the cl-bench benchmarks under more ideal circumstances. Part of the reason is the insufficient number of iterations of some tests: measurement accuracy is really bad when the run time is below 0.2s, and it is abysmal when there is other activity on the system, which is easy to tell even in retrospect by comparing the real and user time columns.

But that's not the end of the story. Take, for instance, FPRINT/PRETTY: it takes more than two seconds but often experiences changes of up to 7% caused by seemingly unrelated changes. People have fingered alignment as a likely cause.

Recently this issue has become more pressing as I've been trying to reduce the overhead of x86's pseudo atomic. Unfortunately, the effect is smallish, which makes measurement difficult, so I tried aligning loops on 16 byte boundaries. This being on x86, that meant aligning code similarly first (it's 8 byte aligned currently).

Read more

Space Cadet

Emacs users often report problems caused by strain on the pinky finger that's used to press the Control key. The standard answer to that is to map Caps Lock to Control. I believe that there is a better way.

Note the placement of modifiers: Control, Meta, Super, Hyper on both sides of Space in this order, with Control being the closest to it. Touch typists especially find having two of each key absolutely essential, and the symmetric placement appeals to me.

Also note the Rubout key, next to A where Caps Lock resides on modern keyboards. Rubout is like Backspace and is better to have on the home row than the most useless and annoying key in history.

Read more

2008 Computer Games Olympiad

It seems that the competition has not been standing still (as opposed to Six) and this year marks the end of the golden era. Congratulations to both Wolve and MoHex who beat Six! Thanks to Ryan Hayward who, again, kindly registered Six for the Olympiad.

About the future, I don't really plan on resuming work on Hex in general (and Six in particular), although losing does irk me a bit.

Backup

My carefully updated list of files to back up had grown so long that it made me worry about losing something important, and the backup didn't fit on a single DVD, so I invested in a WD Passport and created an encrypted file system on it:

 modprobe cryptoloop
 modprobe aes
 losetup -e aes /dev/loop0 /dev/sdb
 mke2fs /dev/loop0
 tune2fs -i 0 -c 0 -j /dev/loop0

Then taking a backup is an rsync and some set-up/tear-down code away:

Read more

Important reminder

An example may speak a hundred words, but sometimes not even that is enough and you want to be very explicit about the dangers of hand grenades on board.

Finally, some trash talk carefully designed to intimidate. Never die without having made the necessary arrangements.

All pictures were taken at Málaga airport.

First post

After a long time of waiting to write my own blog software like true hackers with infinite time do (and those irritated by WordPress), I bit the bullet and installed blorg - a very low overhead Emacs blog engine - on top of org-mode that I happen to use as an organizer. Blorg basically converts an org-mode buffer to HTML files so it is completely static: send me email if you have comments, I have no desire to maintain a more complex solution with comment filtering.

Small fixes had to be made for blorg to be able to deal with org-mode 5.17a and I only had time to bring it to some basic level of functionality. That said, here is the blorg.el file I'm using right now.