Now you are using the broad definition of instruction. Great...
K15 has 4 IPC per core: it has 4 execution units to process those instructions, so 4 IPC x 3.9 GHz = 15.6 billion IPS
K10 has 6 IPC per core, but it has 3 execution units per clock to process those instructions, so 3 IPC x 3.7 GHz = 11.1 billion IPS
Well good riddance retard
You can only have high IPS when you have a good clock rate and good IPC
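For anyone who wants to check the arithmetic, here is a minimal sketch of the IPS = IPC x clock relation being used. The IPC and clock figures are simply the ones claimed in this post (they are disputed further down the thread), and `peak_ips` is a made-up helper name.

```python
def peak_ips(ipc, clock_ghz):
    """Peak instructions per second in billions: IPC multiplied by clock in GHz."""
    return ipc * clock_ghz

# Figures as claimed in the post, not measured values
k15 = peak_ips(4, 3.9)  # 4 IPC at 3.9 GHz
k10 = peak_ips(3, 3.7)  # 3 IPC at 3.7 GHz

print(f"K15: {k15:.1f} billion IPS")
print(f"K10: {k10:.1f} billion IPS")
```

This is only a theoretical peak; it says nothing about how many instructions actually retire per cycle in a real workload.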
Which is because of ICC
That is MaxxMem not SiSoft
We've been discussing Instructions Per Cycle, not Instructions Per Second. Furthermore, my "idiot wrongliness" chart on XS used real-world performance to work out the differences in Instructions Per Cycle.
As for the number of "execution units" you are talking about (FPU units): 15h has four 256-bit FPUs and 10h has six 128-bit ones. BD essentially has eight 128-bit FPU units through Flex-FP; however, they service fewer Instructions Per Cycle than 10h. The FX-6100 never really beats the 1100T in the real world, does it?
"Good riddance retard"? Nice one. Your calculations are wrong and you don't know what you are talking about.
By the way, most people measure bandwidth with AIDA64 Extreme Edition (formerly Lavalys Everest).
SiSoftware Sandra reads up to 20GB/s on Phenom II depending on overclocks, while Everest will only read 12GB/s.
Again:
K15 has 4 IPC per core: it has 4 execution units to process those instructions, so 4 IPC x 3.9 GHz = 15.6 billion IPS
K10 has 6 IPC per core, but it has 3 execution units per clock to process those instructions, so 3 IPC x 3.7 GHz = 11.1 billion IPS
Well good riddance retard
FPU decode width:
Phenom II - 3 wide in single thread
Bulldozer - 4 wide 256-bit, 2 wide 128-bit
4 256-bit IPC (select programs like 256-bit AVX) / clock on BD, 2 128-bit IPC / clock
3 128-bit IPC / clock on Phenom II
8 "cores" threads * 2 FMAC = 16
6 cores * 3 FMAC = 18
BD can only execute 4 256-bit FMAC at once, so 4 "cores" / threads * 4 FMAC = 16. For 256-bit FMAC, BD is essentially a quad-core CPU. For integer and normal 128-bit FMAC, however, BD acts as an 8-core and is marketed as such.
So you need 12.5% extra clockspeed over Phenom II to make its 8 128-bit FMAC threads = 6 Phenom II FPU threads, and 50% extra clockspeed over Phenom II to make its 6 = 6. I'm not sure if you noticed, but Phenom II's turbo never works, for one, and applies to only 1 thread. Turbo on BD is 3.9 on up to all cores and 4.2 on one. In benchmarks with Turbo enabled, BD spends 80% of its time at 3.9 while X6 is stuck at 3.3. That's an 18% difference, so BD's performance often wins when all 8 cores are used. Congrats! ...right?
On BD, IPC decreases. This translates to real-world performance, which only increases in applications that use 256-bit FMAC (i.e. 256-bit AVX).
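The percentages above can be reproduced with a rough sketch like the one below. It assumes the per-clock FMAC counts from this post, and it assumes "its 6" means a 6-thread BD at 2 FMAC per thread; the variable and function names are just for illustration.

```python
from fractions import Fraction

# Per-clock 128-bit FMAC counts as given in the post (assumed, not measured)
phenom_x6 = 6 * 3    # 6 cores * 3 FMAC = 18
bd_8_thread = 8 * 2  # 8 threads * 2 FMAC = 16
bd_6_thread = 6 * 2  # assumption: 6 threads * 2 FMAC = 12

def extra_clock_needed(target, own):
    """Fractional clock advantage needed for `own` to match `target` per second."""
    return Fraction(target, own) - 1

need_8 = extra_clock_needed(phenom_x6, bd_8_thread)  # 18/16 - 1
need_6 = extra_clock_needed(phenom_x6, bd_6_thread)  # 18/12 - 1

# Clock gap when BD sits at 3.9 GHz turbo and X6 stays at 3.3 GHz
turbo_gap = 3.9 / 3.3 - 1

print(f"8-thread BD needs {float(need_8):.1%} more clock")
print(f"6-thread BD needs {float(need_6):.1%} more clock")
print(f"3.9 vs 3.3 GHz gap: {turbo_gap:.0%}")
```

By this count, the 18% turbo gap covers the 12.5% the 8-thread part needs but falls well short of the 50% the 6-thread part needs.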
Then you have integer calculations...which also play into real world performance.