Hacking and Other Thoughts

Tue, 30 May 2006

Ingo's SMP lock validator...

Just hitting linux-kernel today, and mostly merged into -mm is an SMP lock validator written by Ingo Molnar. I've known about this for some time, because as he was writing it Ingo passed along some networking locking bugs that this sucker has found.

It's very clever, and really will revolutionalize our ability to make sure that locking schemes in the kernel are not deadlock prone. First, here is what Ingo's patches do not do, they do not verify that data structures are protected with proper locking. That is not what his tool aims to do. Rather it uses feedback during lock acquisition and release to determine if deadlocks are possible.

When a lock is taken or released, it looks at the context (are we in a hard IRQ, a soft IRQ, etc.) and the other locks currently held. It uses this state to see if there is a conflict with any previously seen usage.

It can thus notice things like classic AB-BA deadlocks. Also, it can see that a lock is taken in interrupt context yet not protected with IRQ disabling when taken in non-interrupt context.

Naturally these verifications are expensive, as the whole set of locking posibilities have to be investigated when we get a new case. But, it only happens when a new locking scenerio is encountered. Each lock state snapshot computes a hash, and Ingo's code keeps in the hash table all the locking scenerios seen so far. If there is a hit, no computations or verifications need to be performed, we know it's good already.

The big deal is that previously all of this kind of stuff either had to be audited by a human by hand, or determined based upon a bug report (which is usually in the form of the user's machine being totally wedged when the deadlock occurs).

There will be a bunch of false positives at first, for sure, and some extra smarts will be added to the validator to handle some strange corner-case things which look invalid but which are in fact fine. But once that is sorted out this will be a very useful tool. You will be able to find SMP deadlocks requiring lots of cpus on a uniprocessor machine.

Make the computer do the work.