Tuesday, June 29, 2010

On CPU profiling

CPUs are different ... no really, they are. And when working with task scheduling over heterogeneous peers, which is the metier of Scavenger, this heterogeneity of the peers makes it very hard to come up with precise running time estimates. When estimating a task's running time on a peer one needs two things: 1) the running time of the task (with the given input) on a known peer, and 2) a "strength" rating on each peer that says something about how strong the peer is w.r.t. processing work. Having this information an estimated running time can be derived simply by assuming that a peer of strength x can perform a given task twice as fast as a peer of strength x/2.

This is all fairly straight forward, but the tricky part is finding that magical "strength" value for each peer. In related systems, such as Spectra and Chroma, the CPUs clock speed and its Linux BogoMIPS rating has been used, respectively. I expected that these values were pretty useless at estimating peer strength, and my experiments up until now seem to confirm that suspicion. Because of that I in Scavenger chose to use a real CPU benchmark for the strength value, thinking that a CPU benchmark would be written specifically to exercise all the various parts of a CPU and therefore would be more apt at describing its relative strength. I have used the nbench benchmarking suite - primarily because it is available as source code, so that I can compile and run it on any platform that I would like to test. Nbench returns two scores, an integer and a floating point score, so I chose to use the average of these two scores as peer strength. My experiments show that this value does indeed model CPU strength much better than clock speed or BogoMIPS - but it is far from perfect!

Because of that I have started work on a new profiling approach - one that will either 1) increase the accuracy when estimating running time on hitherto unknown peers, or 2) show that the performance of different CPU architectures can not be scored on a linear scale... Of course I am hoping for 1 but I somehow expect to conclude 2 :-)