Profiling would easily show which section(s) of the WU client consume the most CPU time.
I propose the developers post that section of the code on one of these boards (or you could look it up yourself on the source pages).
Then hold a contest for people to write code that computes the function(s) in the least amount of time, while achieving identical mathematical results.
This code will eventually be running on hundreds of thousands of CPUs, so one would think every CPU cycle would count.
It couldn't just be generic code for all CPUs. Coders could use CPU identification code to select different versions of the computation for different CPU models.
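Here is a rough sketch of the dispatch idea. It is in Python only to keep it short; a real client would presumably do this in C or assembly using the CPUID instruction. Everything named here (`compute_generic`, `compute_unrolled`, `DISPATCH`, `select_kernel`) is made up for illustration, the sum-of-squares routine is a stand-in for whatever the real hot spot is, and the "tuned" variant is just a placeholder, not an actual optimization.

```python
import platform


def compute_generic(values):
    """Portable fallback: sum of squares, one element at a time."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def compute_unrolled(values):
    """Hypothetical CPU-specific variant (loop unrolled by two).

    It adds the terms in the same order as compute_generic, so the
    floating-point result is bit-for-bit identical.
    """
    total = 0.0
    n = len(values)
    i = 0
    while i + 1 < n:
        total += values[i] * values[i]
        total += values[i + 1] * values[i + 1]
        i += 2
    if i < n:  # odd element left over
        total += values[i] * values[i]
    return total


# Hypothetical table: CPU identifier -> routine tuned for that CPU.
DISPATCH = {
    "x86_64": compute_unrolled,
    "AMD64": compute_unrolled,
}


def select_kernel(cpu_id=None):
    """Pick the routine for this CPU, falling back to the generic one."""
    if cpu_id is None:
        cpu_id = platform.machine()
    return DISPATCH.get(cpu_id, compute_generic)


if __name__ == "__main__":
    kernel = select_kernel()
    print(kernel.__name__, kernel([1.5, 2.0, 3.25]))
```

The point of the fallback is that an unrecognized CPU still gets a valid (if slower) result.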
To make it work, someone would have to develop "workbench" code that runs the computation subroutine and displays how long it took, along with enough sample data to crunch to verify the computation is working correctly (both sample source data and a sample set of valid results, for various current platforms: P4/Linux, P3/Win2K, Mac/PPC 970FX).
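Something like the following is what I mean by a workbench. This is only a minimal sketch: `reference_kernel`, `SAMPLE_INPUT`, `SAMPLE_EXPECTED` and `run_workbench` are hypothetical names, and the sum-of-squares routine stands in for the real computation, which I have not looked at.

```python
import time


def reference_kernel(values):
    """Stand-in for the real computation: sum of squares."""
    total = 0.0
    for v in values:
        total += v * v
    return total


# Sample source data and the known-valid result for it (made up here).
SAMPLE_INPUT = [1.0, 2.0, 3.0, 4.0]
SAMPLE_EXPECTED = 30.0


def run_workbench(kernel, sample_input, expected, repeats=5):
    """Run `kernel` several times; report validity and the best time seen."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = kernel(sample_input)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Exact comparison on purpose: entries must match the reference result.
    return {"valid": result == expected, "best_seconds": best}


if __name__ == "__main__":
    report = run_workbench(reference_kernel, SAMPLE_INPUT, SAMPLE_EXPECTED)
    print(report)
```

Taking the best of several runs is just one way to cut down on noise from other processes; a contest would have to agree on the timing rule.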
People could use such things as MMX, MMX+, Streaming SIMD Extensions [Floating Point], 3DNow!, or whatever other advantages the CPU offered, as long as it produced valid results (and was faster, of course).
Results would have to be submitted in a standardized fashion: code + times + which CPUs it was optimized for.
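As a strawman for what a standardized submission could look like, here is one possible record layout. The field names and the JSON format are my own assumptions, not anything the project has defined, and the values in the example are placeholders rather than real measurements.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Submission:
    """One contest entry: code + times + which CPUs it was optimized for."""
    author: str
    source_file: str      # path to the submitted code
    target_cpus: list     # CPU models the code was tuned for
    best_seconds: float   # time reported by the workbench
    results_valid: bool   # did the output match the sample results?


def to_report(submission):
    """Serialize an entry to JSON so all entries share one layout."""
    return json.dumps(asdict(submission), sort_keys=True, indent=2)


if __name__ == "__main__":
    entry = Submission(
        author="example_user",
        source_file="kernel_sse.c",
        target_cpus=["P4"],
        best_seconds=0.0,  # placeholder, not a real timing
        results_valid=True,
    )
    print(to_report(entry))
```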