Hacking and Other Thoughts

Sat, 29 Nov 2008

NMI profiling on sparc64...

One thing that has always been suboptimal on sparc64 is profiling with oprofile.

Yes, it worked, but we only had the dumb timer interrupt profiling available. The biggest loss with this is that IRQ-disabled code sequences do not get profiled. This leads to "clumps" in the profile, where all the code in an IRQ-disabled sequence can show up as one big hit at the point where IRQs get re-enabled.
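To make the clumping concrete, here is a small toy model in C. It is only a sketch and not kernel code: the trace layout, `struct step`, and the functions `timer_profile()` and `nmi_profile()` are all made up for illustration, and it assumes a single pending bit, so several ticks that land in one masked stretch collapse into one hit.

```c
#include <assert.h>
#include <stddef.h>

/* One step of a hypothetical execution trace (illustrative only). */
struct step {
    int pc;            /* index of the code location being executed */
    int irqs_disabled; /* nonzero while the timer interrupt is masked */
};

/*
 * Timer-style profiling: a tick fires every `period` steps. If it fires
 * while interrupts are masked it stays pending, and the sample is charged
 * to the first later step that runs with interrupts enabled.
 * hits[] must have a slot for every pc value used in the trace.
 */
void timer_profile(const struct step *trace, size_t n, size_t period,
                   unsigned *hits)
{
    int pending = 0;

    assert(period > 0);
    for (size_t i = 0; i < n; i++) {
        if (i % period == 0)
            pending = 1;
        if (pending && !trace[i].irqs_disabled) {
            hits[trace[i].pc]++;
            pending = 0;
        }
    }
}

/*
 * NMI-style profiling: the tick cannot be masked, so every sample is
 * charged to the step that was actually running when it fired.
 */
void nmi_profile(const struct step *trace, size_t n, size_t period,
                 unsigned *hits)
{
    assert(period > 0);
    for (size_t i = 0; i < n; i++) {
        if (i % period == 0)
            hits[trace[i].pc]++;
    }
}
```

Feed both functions a trace where steps 1 and 2 run with IRQs disabled and a tick lands on step 2: the timer version charges that sample to step 3, the re-enable point, while the NMI version charges it to step 2 where the time was really spent.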

This makes the profile often non-representative and, at worst, completely unusable.

Say what you want about levelled interrupts, they do provide a level of flexibility that can be useful. Sparc64 chips have a PIL register: you indicate which of the 15 interrupt levels you want to block by writing the highest level to block into the register. So writing zero enables all interrupt levels, and writing 15 blocks all of them out.
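The PIL rule described above boils down to a single comparison. The following is a minimal sketch of it in plain C, not anything taken from the kernel; `pil_delivers()` and `pil_open_levels()` are hypothetical helper names used only here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the PIL rule: the register holds the highest level to block,
 * so an interrupt at `level` (1..15) gets through only if level > pil.
 */
bool pil_delivers(unsigned pil, unsigned level)
{
    assert(pil <= 15);
    assert(level >= 1 && level <= 15);
    return level > pil;
}

/* Count how many of the 15 interrupt levels are open for a given PIL. */
unsigned pil_open_levels(unsigned pil)
{
    unsigned open = 0;

    for (unsigned level = 1; level <= 15; level++) {
        if (pil_delivers(pil, level))
            open++;
    }
    return open;
}
```

With this model, `pil_open_levels(0)` counts all 15 levels and `pil_open_levels(15)` counts none, matching the two extremes mentioned above.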

Device interrupts work by interrupt vectors, which are not blocked by the PIL mechanism. These are processed quickly in a trap handler and revectored into a PIL levelled interrupt by software.

Under Linux we use one PIL level for all of the device interrupts. A few of the other PIL levels we use for specific SMP cross-call types.

On UltraSPARC-III (cheetah) and later we (finally) have a profiling counter overflow interrupt. This arrives at PIL level 15. So naturally I had the idea to run the majority of the kernel with interrupts disabled only up to PIL level 14. The result is that we can use these profile counter overflow interrupts to provide a pseudo-NMI oprofile implementation.
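Here is a rough simulation of that idea in C. It is not the actual sparc64 code: `struct sim_cpu`, the `SIM_PIL_*` constants, and the `sim_*` functions are invented for this sketch. The only things taken from the description above are that "interrupts disabled" means PIL 14 and that the counter overflow interrupt arrives at level 15.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SIM_PIL_IRQS_OFF = 14, /* what "IRQs disabled" means for most code */
    SIM_PIL_PROFILE  = 15  /* level of the counter overflow interrupt */
};

/* Hypothetical per-cpu state for the simulation. */
struct sim_cpu {
    unsigned pil;             /* current value of the PIL register */
    unsigned profile_samples; /* samples taken by the overflow handler */
};

/* An interrupt is delivered only if its level is above the current PIL. */
static bool sim_delivers(const struct sim_cpu *cpu, unsigned level)
{
    assert(cpu->pil <= 15);
    return level > cpu->pil;
}

/* "Disable interrupts" the pseudo-NMI way: block levels 1..14 only. */
void sim_irq_disable(struct sim_cpu *cpu)
{
    cpu->pil = SIM_PIL_IRQS_OFF;
}

void sim_irq_enable(struct sim_cpu *cpu)
{
    cpu->pil = 0;
}

/* Counter overflow arrives; returns true if a sample was recorded. */
bool sim_counter_overflow(struct sim_cpu *cpu)
{
    if (!sim_delivers(cpu, SIM_PIL_PROFILE))
        return false;
    cpu->profile_samples++;
    return true;
}
```

After `sim_irq_disable()` the overflow interrupt still gets through and records a sample, which is what makes it behave like an NMI for profiling purposes. Only code that explicitly sets the PIL to 15 would still be invisible to the profiler.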

This works quite well and is checked into the sparc-next-2.6 GIT tree, so it will show up in 2.6.29.