Thank you, somebody, for teaching me about Super Low Frequency Mode (SLFM) on mobile chips like your P8400.
My best guess would be that if one core is loaded, the other core isn't allowed to use SLFM. When you are testing with a single threaded app, one core is going to be averaging near 9.0 and the other core might be cycling between 6.0 and 8.5 depending on background activity. If you average those two out, an average multiplier somewhere around 8.0X would make sense, but don't quote me on that.
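To put rough numbers on that guess, here's a quick Python sketch of the averaging. The 9.0, 6.0 and 8.5 figures are just the ones from my guess above, and the function name is made up for illustration, so treat it as back-of-the-envelope arithmetic rather than anything measured.

```python
# Rough arithmetic only: the multipliers below are the guessed values
# from the post, not measurements.
LOADED_CORE = 9.0            # core running the single threaded app
OTHER_CORE_LOW = 6.0         # other core with no background activity
OTHER_CORE_HIGH = 8.5        # other core with heavier background activity


def average_multiplier(core_a, core_b):
    """Simple mean of the two per-core average multipliers."""
    return (core_a + core_b) / 2


# Range the two-core average can fall in under these assumptions.
lowest = average_multiplier(LOADED_CORE, OTHER_CORE_LOW)
highest = average_multiplier(LOADED_CORE, OTHER_CORE_HIGH)

# What the second core would have to average for an overall 8.0X.
needed_other_core = 2 * 8.0 - LOADED_CORE

print(f"two-core average between {lowest}X and {highest}X")
print(f"other core must average {needed_other_core}X for an 8.0X result")
```

So an 8.0X reading would mean the second core is averaging about 7.0X, which sits inside the 6.0 to 8.5 range I'd expect from background activity.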
Interesting that you were able to come up with an approximation of 15ms for a SpeedStep transition. I've been telling people the multiplier can change hundreds of times a second. 15ms is about 67 times a second so if true, I wasn't too far wrong. I think the transition from 3X to 6X probably happens in one step as SLFM is either enabled or disabled but the transitions between 6X and 8.5X might be a little more gradual.
I really like your graphing of the multiplier. I've been thinking of adding a graphing feature to i7 Turbo. I already trust your graph results way more than I'll ever trust TMonitor.
I brought up a few issues I have with TMonitor to a writer at Tom's Hardware and I also introduced him to i7 Turbo. He was very interested and contacted Intel for testing and clarification. Supposedly an engineer at Intel is going to contact me someday but that was a month ago and I haven't heard from either of them since.
I came to the conclusion that maybe I need to add a fancy colored graph to my program so they can have something pretty to put in their reviews. XBit Labs wasn't quite as shy so at least they mentioned i7 Turbo in a recent review.
http://www.xbitlabs.com/articles/mainboards/display/asus-p7p55d-deluxe_8.html
I think whenever new software comes along and says something different than CPUID / CPU-Z, everyone assumes you must be doing something wrong. I don't think I've done anything wrong, though, since I've closely followed the Intel Turbo White Paper. It's nice to see your apps reporting the exact same thing as i7 Turbo.
Why not put some polish into your graphing program? I think something a little bigger would make it easier to see the slight multiplier variations. An option to adjust the time frame or samples per second would also be great so it's easy to see the SpeedStep transitions and their duration.
Here's some interesting testing I just did.
With my load tester program, I can run the equivalent of a 50% load on one core, which on a dual core works out to an average load of ~25%. With the load spaced out in bursts of 100ms on and 100ms off, the multiplier on both cores never goes above the SpeedStep default of 6.0X. I've always heard arguments in forums that SpeedStep is so fast that any load on a CPU will cause the multi to instantly jump up to the maximum of 9.0X. Well, in this example, that's simply not true.
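For anyone who wants to try something similar, here's a minimal Python sketch of an on/off load pattern. This is not my actual load tester, just an illustration of the idea. The function names are made up, it doesn't pin itself to a single core (you'd have to set the affinity yourself, for example in Task Manager), and Python's timing is only approximate at these intervals.

```python
import time


def duty_cycle(on_ms, off_ms):
    """Fraction of each on/off period spent under load."""
    return on_ms / (on_ms + off_ms)


def pulsed_load(on_ms, off_ms, total_ms):
    """Busy-spin for on_ms, sleep for off_ms, repeat for about total_ms.

    Returns the number of on/off cycles completed. Runs in the calling
    thread, so it loads at most one core at a time.
    """
    cycles = 0
    end = time.perf_counter() + total_ms / 1000.0
    while time.perf_counter() < end:
        busy_until = time.perf_counter() + on_ms / 1000.0
        while time.perf_counter() < busy_until:
            pass  # spin to keep the core busy
        time.sleep(off_ms / 1000.0)  # let the core go idle
        cycles += 1
    return cycles


# 100ms on / 100ms off is a 50% load on one core,
# or half of that averaged over a dual core.
one_core_load = duty_cycle(100, 100)
dual_core_average = one_core_load / 2

# Example: pulsed_load(100, 100, 10_000) runs the pattern for ~10 seconds.
```

Watch the multiplier in your monitoring tool of choice while it runs and try different on/off lengths to see where the CPU starts stepping up.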
If the load is more consistent like in this next example where there is full load for 200ms consecutively:
then the CPU is able to start using its maximum multiplier much sooner even with a much smaller average load. Interesting.
Edit: The math is interesting in this case. In theory it should be using the full 9.0X multiplier 20% of the time and the SpeedStep 6.0X multiplier the other 80% of the time.
( 9.0 x 20% ) + ( 6.0 x 80% ) = 6.6
That agrees very closely with the average reported multiplier of ~6.7.
A little bit of background activity kicking in likely keeps it slightly above the theoretical 6.6 value.
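Here's the same math as a small Python sketch, plus the reverse calculation to see what share of time at 9.0X the reported ~6.7 would imply. The function names are just for illustration, and it assumes the CPU only ever sits at 6.0X or 9.0X, which ignores any in-between steps.

```python
def expected_multiplier(high, low, high_fraction):
    """Time-weighted average of two multipliers."""
    return high * high_fraction + low * (1.0 - high_fraction)


def implied_high_fraction(observed, high, low):
    """Share of time at the high multiplier implied by an observed average."""
    return (observed - low) / (high - low)


# 9.0X for 20% of the time, 6.0X for the other 80%.
theoretical = expected_multiplier(9.0, 6.0, 0.20)

# Working backwards from the reported average of ~6.7.
implied = implied_high_fraction(6.7, 9.0, 6.0)

print(f"theoretical average: {theoretical:.2f}X")
print(f"time at 9.0X implied by 6.7X: {implied:.1%}")
```

Working backwards, 6.7 comes out to roughly 23% of the time at 9.0X instead of 20%, so only a few percent of extra time at the top multiplier is enough to explain the difference.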