
Bulldozer Core-Count Debate Comes Back to Haunt AMD

Joined
May 3, 2014
Messages
965 (0.25/day)
System Name Sham Pc
Processor i5-2500k @ 4.33
Motherboard INTEL DZ77SL 50K
Cooling 2 bay res. "2L of fluid in loop" 1x480 2x360
Memory 16gb 4x4 kingstone 1600 hyper x fury black
Video Card(s) hfa2 gtx 780 @ 1306/1768 (xspc bloc)
Storage 1tb wd red 120gb kingston on the way os, 1.5Tb wd black, 3tb random WD rebrand
Display(s) cibox something or other 23" 1080p " 23 inch downstairs. 52 inch plasma downstairs 15" tft kitchen
Case 900D
Audio Device(s) on board
Power Supply xion gaming seriese 1000W (non modular) 80+ bronze
Software windows 10 pro x64
I don't disagree...
But there is a "TRADITIONAL CORE", which is what people are used to. Even AMD admits it in the paper above.
Then there are modules, and you can call what's in those anything you want as long as you make a distinction, which AMD briefly did.

The issue is they decided to just go with the "look at us, 8 real cores" marketing thing. I was always against it and spoke out at the time.
Would I try and sue them for it?? No, because I knew what to expect..
Would my family have known better?? Hell no.
Should people who were duped be allowed to try and get some justice?? Yes..

Should this lawsuit continue?? Yes..
If for no other reason than for manufacturers to accept that they can't just trick the uneducated.
You had "Vista Ready" and "Vista Capable" years ago, which ended up in the same situation as this, and that's even less clear cut.

I have no doubt in my mind that if modules had been super effective, everyone would have moved on to calling them cores. But they weren't, so we didn't and we won't.
 

cdawall

where the hell are my stars
Joined
Jul 23, 2006
Messages
27,680 (4.13/day)
Location
Houston
System Name All the cores
Processor 2990WX
Motherboard Asrock X399M
Cooling CPU-XSPC RayStorm Neo, 2x240mm+360mm, D5PWM+140mL, GPU-2x360mm, 2xbyski, D4+D5+100mL
Memory 4x16GB G.Skill 3600
Video Card(s) (2) EVGA SC BLACK 1080Ti's
Storage 2x Samsung SM951 512GB, Samsung PM961 512GB
Display(s) Dell UP2414Q 3840X2160@60hz
Case Caselabs Mercury S5+pedestal
Audio Device(s) Fischer HA-02->Fischer FA-002W High edition/FA-003/Jubilate/FA-011 depending on my mood
Power Supply Seasonic Prime 1200w
Mouse Thermaltake Theron, Steam controller
Keyboard Keychron K8
Software W10P
and less than 2 cores would have, point is moot.

No one complained about them being modules.. the issue is they call them 8 cores when they are demonstrably not.

Except for the time they showed identical if not better scaling than any other 8 core product on the market when placed into a multithreaded environment.

You're the only person I quoted, bub. I'm not @cdawall. Also IEEE has members from just about every major hardware vendor.

With 420,000 members they have more than one :roll:

1st line of the quote
"Just adding traditional cores isn’t going to be enough, says AMD’s Moore. "

Which also says right there that they aren't cores. AMD said they aren't cores right there in that stupid thing you just quoted.

oh and the paper is for a "module" not a "core"

All the evidence you bring just contradicts what you say.. and yet you still say it!

I take it you actually read neither of the articles I linked. The second article speaks about the natural evolution of multicore designs as it was seen in 2010. It brings up GPU cores, which apparently aren't cores, it brings up CELL cores, which apparently aren't all real cores, and mentions that this is how the evolution of things is going. The article did falsely say Bulldozer would be great (paraphrasing), but a lot of what they said is absolutely holding true. We have mixed cores in so many different devices, and Bulldozer's design absolutely is a part of that.

Here is another quote, since you didn't bother to read the first article linked which was written by no less than 8 PhD holding people from various companies including AMD, HP, HAL, etc.

The module includes two independent integer cores but shares the fetch, decode, floating-point, and L2 cache units to maximize single-threaded performance and multi-threaded throughput while significantly improving power and area efficiency compared to fully replicated CPU cores.

I want you to read that out loud to yourself. This is merely from the abstract. I would link to the actual article sections, but seeing how the conversation is going I could link a McDonald's menu and you would argue about it not being real fast food or some nonsense.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
Should this lawsuit continue?? Yes..
If for no other reason than for manufacturers to accept that they can't just trick the uneducated.
Except it's not a trick. You literally got 8 shitty cores instead of 4 or 6 decent ones.
 
Joined
May 3, 2014
Messages
965 (0.25/day)
Except it's not a trick. You literally got 8 shitty cores instead of 4 or 6 decent ones.
you got 4 decent cores and some other bits that could usually but not always do tasks in conjunction

@cdawall read that out loud and it says "integer cores" and "compared to fully replicated CPU cores".
Like I said, they can call them "integer, imaginary, lite", whatever they want, as long as they define it..

But lo and behold, it didn't say "8 integer cores, not traditional" on the box.
You'd think they'd be yelling that from the rooftops if they were as good or better. Or that they wouldn't bother if they were trying to trick consumers.
 

cdawall

where the hell are my stars
you got 4 decent cores and some other bits that could usually but not always do tasks in conjunction

Those same bits don't even exist in plenty of standalone CPUs throughout history. Per the actual IEEE tech publication for Bulldozer, each module consists of 2 integer cores with some shared resources.
 

Aquinus

Resident Wat-man
you got 4 decent cores and some other bits that could usually but not always do tasks in conjunction
The only SMT-esque part about this is the FPU... and now we're going full circle.
 

cdawall

where the hell are my stars
you got 4 decent cores and some other bits that could usually but not always do tasks in conjunction

@cdawall read that out loud and it says "integer cores". Like I said, they can call them "integer, imaginary, lite", whatever they want, as long as they define it..

But lo and behold, it didn't say "8 integer cores, not traditional" on the box.
You'd think they'd be yelling that from the rooftops if they were as good or better. Or that they wouldn't bother if they were trying to trick consumers.

Traditional cores do not have an FPU. I honestly don't know why that is difficult to understand, but since you cannot get that through your thick skull I guess you win.
 

Aquinus

Resident Wat-man
Or that they wouldn't bother if they were trying to trick consumers.
A typical consumer doesn't even know what an FPU is or what it does, man. Marketing has to be simple.
 

cdawall

where the hell are my stars
Actually. I am fixing this for myself. Guy can't figure out that an integer calculation and floating point calculation are not the same thing. This is not worth my time. Hope you folks had a good read.

 

Aquinus

Resident Wat-man
Actually. I am fixing this for myself. Guy can't figure out that an integer calculation and floating point calculation are not the same thing. This is not worth my time. Hope you folks had a good read.

You are a wise man. :respect:
 
Joined
May 3, 2014
Messages
965 (0.25/day)
Traditional cores do not have an FPU. I honestly don't know why that is difficult to understand, but since you cannot get that through your thick skull I guess you win.

OK, tell me what part of this you don't understand..
Bulldozers were slow because each module did not act like 2 cores in Windows.
MS had to change the scheduler to alleviate the issue.
AMD call the modules integer cores and define categorically that they are not actual cores.
AMD later abandoned the modules thing because it's just worse than using real cores.

BUT AMD advertised Bulldozer as having 8 real cores.
People were upset and so started a lawsuit.

All the evidence you have presented shows categorically that AMD didn't think they were traditional cores, didn't call them traditional cores, defined them as different to traditional cores, and cut parts out to reduce power.
And yet advertised them as cores.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.46/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
I could've very easily disabled 1 core from each "module" on my FX 6300 and guess what, Windows would boot just fine and software ran as it should, floating point functionality still intact. How could that have worked if Piledriver didn't have independent cores? What am I missing? Can you still not see that your assertion that something like the 8350 didn't have 8 independent cores is plain and simple wrong?
Pretty sure you can't. Bulldozer and sons power control scope is limited to modules. A module can soft-shutdown an idle integer core to conserve power but that's not something software has any control over. An independent core can be completely powered off.

How many clock cycles would it take for a BD module to process two single-cycle-cost integer math tasks?

How much is the multicore speedup for integer math tasks on a 4 module AMD: is it around 3.xx or 7.xx? Now compare that to a traditional 4 core setup.

In these specific scenarios it is quite easy to see the argument that the AMD design acts as any standard 8 core unit would. Single threaded performance being dreadful doesn't have anything to do with it scaling linearly across all 8 cores available, assuming integer calculations.
It seems ghazi brushed on the point here:
Core count is a more accurate predictor of performance than thread count, and also is a term that common people are familiar with. Even in the case of Bulldozer, the 8-core chips scaled around ~6.7x in multithreaded workloads -- closer to 8 than 4, unlike quad-cores with SMT that don't do much better than 5x.
16% slower than an independent 8-core because of shared components. If I had a Bulldozer, I'd want my 16% back that I was promised. On the flipside, SMT processors promise you 100% but you're getting 125%. That's a bargain, not theft. AMD could have marketed 4-module processors as having 34% better SMT performance than Intel's 4-core processors but, no, they didn't do the smart and honest thing. Sad.
 
Joined
Mar 16, 2017
Messages
2,112 (0.75/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Personally, I think this is going to be hard to prove. AMD can argue that each integer core was fully independent of the other, despite being part of the same module. Each had its own integer scheduler, register file and 16KB L1 data cache. Yes, they shared an FPU core, but that FPU was capable of handling 2 threads, and both integer cores could access both threads. It was certainly a unique design, but I think the only thing they could prove is that it was a bad design, but we don’t exactly need the court of law to prove that one.

Heck, Atom was launched as an in-order execution CPU, something we hadn’t seen from Intel since before Pentium Pro. For best performance on a Silvermont, you needed to target an x86 architecture from before 1995.
 

cdawall

where the hell are my stars
You are a wise man. :respect:

Like you said the discussion became a circular argument. That is not worth the time of day. The documentation was provided and approved by IEEE in 2012. The processor can complete 8 simultaneous integer core problems per clock cycle and the design existed to try and reduce the foot print of a core.

Pretty sure you can't. Bulldozer and sons power control scope is limited to modules. A module can soft-shutdown an idle integer core to conserve power but that's not something software has any control over. An independent core can be completely powered off.

I still have an FX9370 and CHV. You absolutely can power off 1 core in a module. I could boot the chip 4 modules and 4 cores right now.

It seems ghazi brushed on the point here:

16% slower than an independent 8-core because of shared components. If I had a Bulldozer, I'd want my 16% back that I was promised.

This is a review for the 9700K and 9900K. I will use cinebench as a basis for this. Now mind you this is the absolutely latest Intel product.

https://www.techspot.com/review/1730-intel-core-i9-9900k-core-i7-9700k/

The 9700K is an 8 core 8 thread CPU. It scores 214 points for the single threaded CB test, it scores 1513 points for the multithreaded test. That is a 7.07x speed up. In 2012 AMD was able to pull off a 6.7x speed up in that same benchmark and you are going to sit there and tell me it only had 4 cores?
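
The speed-up figure quoted above is just the multithreaded score divided by the single-threaded score. A minimal sketch of that arithmetic follows; the function name `speedup` is made up for illustration, and the two scores are the ones cited in this post from the linked review.

```python
def speedup(single_score: float, multi_score: float) -> float:
    """Multithreaded score divided by single-threaded score."""
    if single_score <= 0:
        raise ValueError("single-threaded score must be positive")
    return multi_score / single_score


# Cinebench scores for the 9700K as quoted in the post (assumed accurate).
i7_9700k_speedup = speedup(214, 1513)
print(f"9700K speed-up: {i7_9700k_speedup:.2f}x")
```

The 6.7x figure for the FX chip is taken as given from earlier in the thread; no raw scores for it appear here, so it is not recomputed.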
 

FordGT90Concept

"I go fast!1!11!1!"
I still have an FX9370 and CHV. You absolutely can power off 1 core in a module. I could boot the chip 4 modules and 4 cores right now.
Fantastic! Still waiting on numbers on this thread:
https://www.techpowerup.com/forums/...ount-on-bulldozer.217327/page-21#post-3535907

Modules use less power fully powered than two independent cores fully powered; however, one independent core will use less power than a semi-powered down module.

The 9700K is an 8 core 8 thread CPU. It scores 214 points for the single threaded CB test, it scores 1513 points for the multithreaded test. That is a 7.07x speed up. In 2012 AMD was able to pull off a 6.7x speed up in that same benchmark and you are going to sit there and tell me it only had 4 cores?
It's ironic you mention the 9700K...a deliberately nerfed processor compared to one that isn't (other than having an unconventional design). 9900K is 9.48x using 8 independent cores versus allegedly "6.7x" using 8 shared cores. That's on the order of 41% improvement instead of 16% loss. There's a reason why AMD dropped modules like it's hot and went SMT too.
 
Joined
Dec 5, 2017
Messages
157 (0.06/day)
It's ironic you mention the 9700K...a deliberately nerfed processor compared to one that isn't (other than having an unconventional design). 9900K is 9.48x using 8 independent cores versus allegedly "6.7x" using 8 shared cores. That's on the order of 41% improvement instead of 16% loss. There's a reason why AMD dropped modules like it's hot and went SMT too.

To the contrary, the 9900K gets a ~19% improvement from its 8 virtual threads. If the FX were a 4-core, 8-thread CPU, its "virtual" (hardware) threads would give it a 68% improvement. That's more in-line with the performance uplift from fully independent cores than that of SMT. Let's also remember that fully independent cores don't scale totally perfectly either.
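
The two percentages above come from treating each chip as N physical cores and asking how far the measured speed-up exceeds N. A rough sketch, assuming the 9.48x and 6.7x figures quoted in this thread; `uplift_over_cores` is a hypothetical helper name, not anything from a benchmark tool.

```python
def uplift_over_cores(speedup_x: float, physical_cores: int) -> float:
    """Extra throughput beyond ideal scaling on `physical_cores`, as a fraction."""
    return speedup_x / physical_cores - 1.0


# 9900K: 8 cores, 16 threads, 9.48x speed-up (figure quoted in the thread).
smt_uplift = uplift_over_cores(9.48, 8)

# FX counted as 4 cores / 8 threads, 6.7x speed-up (figure quoted in the thread).
cmt_uplift = uplift_over_cores(6.7, 4)

print(f"9900K uplift from second threads: {smt_uplift:.1%}")
print(f"FX uplift if counted as 4 cores:  {cmt_uplift:.1%}")
```

This only restates the post's arithmetic; it says nothing about which counting convention is the right one.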
 

cdawall

where the hell are my stars
Fantastic! Still waiting on numbers on this thread:
https://www.techpowerup.com/forums/...ount-on-bulldozer.217327/page-21#post-3535907

More like 2 modules and 4 integer cores. The modules are going to use more power in a semi-powered down state than independent cores in a full power down state simply because a lot more transistors aren't being used.

I am curious how it will do. If I have time this weekend I will see if I can get it up and running again. I actually have been wanting to turn it into an XP box for some older games for a while. I have a pair of 7950s that are going to get stuffed into it.


I'd rather trust my own program in the link above than Cinebench. I know my program is extremely asynchronous, relies on ALU performance over FPU, and performance patterns fall exactly inline with expectations.

It's ironic you mention the 9700K...a deliberately nerfed processor compared to one that isn't (other than having an unconventional design). 9900K is 9.48x using 8 independent cores versus allegedly "6.7x" using 8 shared cores. That's on the order of 41% improvement instead of 16% loss. There's a reason why AMD dropped modules like it's hot and went SMT too.

I specifically picked the 9700K, because I thought we were comparing apples to apples. That is an 8 core 8 thread CPU compared to an 8 core 8 thread CPU. If the argument is that they aren't cores, then that is absolutely A-OK, we can compare a 7700K for the 4 core 8 thread scaling vs 4 module 8 threads.

https://techreport.com/review/31179/intel-core-i7-7700k-kaby-lake-cpu-reviewed/13

7700K pulls off 197 single threaded and 998 multithreaded for a 5.06x speed up. Again, those 8 "shared" cores, as you called them, did 6.7x. I would say that if we were to purely compare HT vs CMT, this particular application is showing substantial gains for CMT, almost at the same level as traditional cores.

This trend actually got better with more cores added. The quad core dual module FX based stuff did not do as well per core, still heftily beat the Intel HT offerings, but was not nearly as good as cores.

So my personal A10-7800 ran 91 single and 308 multi, 3.38x speed up (4/4)
The G4560 just a couple notches up ran 142 single and 352 multi, 2.48x speed up (2/4)
Another random i3 6100, 193 single and 491 multi, 2.54x speed up (2/4)
and the 4/4 4690K, 171 single and 646 multi, 3.78x speed up (4/4)
another 6600K, 193 single and 729 multi, 3.78x speed up (4/4)

these are just yanked off of the CB thread

https://www.techpowerup.com/forums/threads/post-your-cinebench-score.213237/
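
For anyone who wants to recheck the list above, here is a small sketch that recomputes each speed-up from the quoted single and multi scores. The scores are copied from this post; the table layout and the name `speedups` are illustrative assumptions only.

```python
# (CPU, single-thread score, multi-thread score, cores/threads) as quoted in the post.
scores = [
    ("A10-7800", 91, 308, "4/4"),
    ("G4560", 142, 352, "2/4"),
    ("i3 6100", 193, 491, "2/4"),
    ("4690K", 171, 646, "4/4"),
    ("6600K", 193, 729, "4/4"),
]

# Speed-up is the multi-thread score divided by the single-thread score.
speedups = {name: multi / single for name, single, multi, _ in scores}

for name, single, multi, layout in scores:
    print(f"{name:<9} {layout}  {single:>4} single  {multi:>4} multi  {speedups[name]:.2f}x")
```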
 

FordGT90Concept

"I go fast!1!11!1!"
To the contrary, the 9900K gets a ~19% improvement from its 8 virtual threads. If the FX were a 4-core, 8-thread CPU, its "virtual" (hardware) threads would give it a 68% improvement. That's more in-line with the performance uplift from fully independent cores than that of SMT. Let's also remember that fully independent cores don't scale totally perfectly either.
Hyperthreading and Zen: 19% improvement from juggling two threads per core across 8 cores. When one thread hits a blocking state, it switches context to the other thread to maximize the usage of hardware resources.

Bulldozer, Piledriver, and Steamroller: 67% improvement from integrating two integer cores. They should only theoretically block when either is faced with a major FPU instruction; however, there's no switching available to keep the integer cores fully tasked. It gets more performance improvement because there's more transistors behind it but it cannot exceed 100% because it lacks integer core SMT.

I specifically picked the 9700K, because I thought we were comparing apples to apples. That is an 8 core 8 thread CPU compared to an 8 core 8 thread CPU. If the argument is that they aren't cores, then that is absolutely A-OK, we can compare a 7700K for the 4 core 8 thread scaling vs 4 module 8 threads.

https://techreport.com/review/31179/intel-core-i7-7700k-kaby-lake-cpu-reviewed/13

7700K pulls off 197 single threaded and 998 multithreaded for a 5.06x speed up. Again, those 8 "shared" cores, as you called them, did 6.7x. I would say that if we were to purely compare HT vs CMT, this particular application is showing substantial gains for CMT, almost at the same level as traditional cores.

This trend actually got better with more cores added. The quad core dual module FX based stuff did not do as well per core, still heftily beat the Intel HT offerings, but was not nearly as good as cores.

So my personal A10-7800 ran 91 single and 308 multi, 3.38x speed up (4/4)
The G4560 just a couple notches up ran 142 single and 352 multi, 2.48x speed up (2/4)
Another random i3 6100, 193 single and 491 multi, 2.54x speed up (2/4)
and the 4/4 4690K, 171 single and 646 multi, 3.78x speed up (4/4)
another 6600K, 193 single and 729 multi, 3.78x speed up (4/4)

these are just yanked off of the CB thread

https://www.techpowerup.com/forums/threads/post-your-cinebench-score.213237/
My problem with all of this is I'm not sure how Cinebench even works. Is it ALU heavy, FPU heavy, or a mixture of both? Is it synchronous multithreading or asynchronous? From what I gather, it's a rendering benchmark, which is FPU heavy. Assuming that, it's good to see that Bulldozer's FPU can manage 83.75%, but from your own numbers you can clearly see that there's a significant difference between where Bulldozer performs compared to independent cores (e.g. 9700K at 88.375%), especially when considering that Bulldozer is getting 100% of possible threads, architecturally, compared to the 9700K's 50% of possible threads architecturally. Your figure for the 7700K demonstrates that: 126.5% performance out of four independent cores versus 83.75% out of eight integer cores, or 167.5% out of four modules.

That's what I don't get: AMD could have owned the module argument. 167.5% per module is more attractive than 83.75% per "core." They stabbed themselves in the back by calling them "cores" because it just doesn't stand up to the 120%+ that Hyper-Threading can do. This is looking at it from the perspective of a customer comparing an "8-core" Intel/Zen to an "8-core" Bulldozer/Piledriver/Steamroller.
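
The percentages in this post all come from dividing a speed-up by the number of units it is spread over, so the same 6.7x reads very differently per "core" and per module. A minimal sketch of that normalization, using only the speed-ups quoted in the thread; `scaling_percent` is a made-up name for illustration.

```python
def scaling_percent(speedup_x: float, units: int) -> float:
    """Speed-up expressed as a percentage of one unit (core or module)."""
    return 100.0 * speedup_x / units


fx_per_core = scaling_percent(6.7, 8)        # FX counted as 8 integer cores
fx_per_module = scaling_percent(6.7, 4)      # FX counted as 4 modules
i7_9700k_per_core = scaling_percent(7.07, 8)  # 8 cores, no Hyper-Threading
i7_7700k_per_core = scaling_percent(5.06, 4)  # 4 cores with Hyper-Threading

for label, value in [
    ("FX per core", fx_per_core),
    ("FX per module", fx_per_module),
    ("9700K per core", i7_9700k_per_core),
    ("7700K per core", i7_7700k_per_core),
]:
    print(f"{label:<15} {value:.3f}%")
```

Which denominator is the fair one is exactly what the thread is arguing about; the sketch only shows that both sides are working from the same numbers.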
 
Joined
Oct 30, 2008
Messages
1,768 (0.30/day)
System Name Lailalo
Processor Ryzen 9 5900X Boosts to 4.95Ghz
Motherboard Asus TUF Gaming X570-Plus (WIFI
Cooling Noctua
Memory 32GB DDR4 3200 Corsair Vengeance
Video Card(s) XFX 7900XT 20GB
Storage Samsung 970 Pro Plus 1TB, Crucial 1TB MX500 SSD, Segate 3TB
Display(s) LG Ultrawide 29in @ 2560x1080
Case Coolermaster Storm Sniper
Power Supply XPG 1000W
Mouse G602
Keyboard G510s
Software Windows 10 Pro / Windows 10 Home
Easy case to win. Just look at benchmarks. It beat i7s back then in multithreading. Pretty clear when you got into the heavy workloads that it wasn't a quad core. Physical cores always are better than SMT. Sure it sucked big time in single thread and everything else, but it definitely was an *8 core.
 

cdawall

where the hell are my stars
My problem with all of this is I'm not sure how Cinebench even works. Is it ALU heavy, FPU heavy, or a mixture of both? Is it synchronous multithreading or asynchronous? From what I gather, it's a rendering benchmark, which is FPU heavy. Assuming that, it's good to see that Bulldozer's FPU can manage 83.75%, but from your own numbers you can clearly see that there's a significant difference between where Bulldozer performs compared to independent cores (e.g. 9700K at 88.375%), especially when considering that Bulldozer is getting 100% of possible threads, architecturally, compared to the 9700K's 50% of possible threads architecturally. Your figure for the 7700K demonstrates that: 126.5% performance out of four independent cores versus 83.75% out of eight integer cores, or 167.5% out of four modules.

That's what I don't get: AMD could have owned the module argument. 167.5% per module is more attractive than 83.75% per "core." They stabbed themselves in the back by calling them "cores" because it just doesn't stand up to the 120%+ that Hyper-Threading can do. This is looking at it from the perspective of a customer comparing an "8-core" Intel/Zen to an "8-core" Bulldozer/Piledriver/Steamroller.

So why are you calling the 5% difference between Bulldozer's 83% core for core and Coffee Lake's 88% so much more important?

In the time frame from 2012 to 2019, Intel was able to offer 5% better multithreading efficiency comparing core for core in what is considered a heavy workload. You are correct, I don't know if it is ALU or FPU heavy, but it performs very well for efficiency on both sides of the map.

However you cut this up, either in 2012 they had nearly equaled Intel's 2019 multithreading ability, or in 2012 CMT so vastly outperformed both AMD's replacement SMT and Intel's HT that it isn't even funny. Either way, you are saying the chip performed admirably in this specific scenario. Now mind you, I do get what you are saying with the 7700K holding a 126% per-core efficiency, but its per-thread efficiency would be worse than Bulldozer's. That is what that speedup shows. You can mix those numbers however you want, but the root of it doesn't change. Intel's own 7700K compared to a 9700K shows the same thing: 126% vs 88% when compared the same way. So why is it OK for Intel's efficiency but not for AMD's? Again, you are also comparing a 2012 product to 2018/2019 ones right now.
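To make the "mix those numbers however you want" point concrete, here is a rough sketch of the arithmetic. It assumes each percentage quoted in this thread is the multithreaded speedup over a single thread divided by the core count; the dictionary layout and function names are made up for illustration, and the core, thread and module counts are the publicly listed ones for each chip.

```python
# Per-core scaling figures as quoted in this thread (1.0 = perfect scaling per core).
# Core, thread and module counts are the publicly listed ones for each chip.
CHIPS = {
    "Bulldozer (8 integer cores)": {"per_core": 0.8375, "cores": 8, "threads": 8, "modules": 4},
    "7700K": {"per_core": 1.265, "cores": 4, "threads": 8},
    "9700K": {"per_core": 0.88375, "cores": 8, "threads": 8},
}


def total_speedup(chip):
    """Multithreaded speedup over one thread implied by the per-core figure."""
    return chip["per_core"] * chip["cores"]


def efficiency_per(chip, unit):
    """Scaling efficiency when the same speedup is divided by another unit count."""
    return total_speedup(chip) / chip[unit]


if __name__ == "__main__":
    for name, chip in CHIPS.items():
        line = (f"{name}: {total_speedup(chip):.2f}x total, "
                f"{efficiency_per(chip, 'threads'):.1%} per thread")
        if "modules" in chip:
            line += f", {efficiency_per(chip, 'modules'):.1%} per module"
        print(line)
```

Divided per thread, the 7700K lands at roughly 63% against Bulldozer's 84%, while per core it is 126.5% against 83.75%. Same speedup, very different headline depending on which unit you divide by.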
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.46/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
Easy case to win. Just look at benchmarks. It beat i7s back then in multithreading.
Because they were going against 4 cores. AMD is still doing the same thing today but with 8 cores instead of 4 modules.

Pretty clear when you got into the heavy workloads that it wasn't a quad core.
But it's also clear it isn't an 8 core either. AMD made a mistake not marketing them as 4 modules or 8 integer cores and they're liable to pay for it now.

Physical cores always are better than SMT.
No doubt but SMT increases efficiency of physical cores. That's why almost all modern CISC architectures do it.

So why are you calling the 5% difference between Bulldozer's 83% core for core and Coffee Lake's 88% so much more important?
Because you're undertasking the Coffee Lake architecture. Load both to 100% and you're looking at 125% versus 83%. That's the reason why Bulldozer/Piledriver/Steamroller didn't take mainframe marketshare by storm but Zen is.

However you cut this up, either in 2012 they had nearly equaled Intel's 2019 multithreading ability, or in 2012 CMT so vastly outperformed both AMD's replacement SMT and Intel's HT that it isn't even funny. Either way, you are saying the chip performed admirably in this specific scenario. Now mind you, I do get what you are saying with the 7700K holding a 126% per-core efficiency, but its per-thread efficiency would be worse than Bulldozer's. That is what that speedup shows. You can mix those numbers however you want, but the root of it doesn't change. Intel's own 7700K compared to a 9700K shows the same thing: 126% vs 88% when compared the same way. So why is it OK for Intel's efficiency but not for AMD's? Again, you are also comparing a 2012 product to 2018/2019 ones right now.
I'm not saying and never did say that CMT was a bad architecture. AMD just went about describing it poorly to the public. Think the Seagate lawsuit about the definition of "GB." That's what this is fundamentally about but it's "core" instead.


Edit: Circling back to Cinebench, given that the 9700K shows a 12% loss in scaling, I'd say the multithreading code is either synchronous or has a lot of blocking scenarios. Async code with little cross talk between threads should get damn close to 100%. The fact that SMT in the same test gives what is effectively a 37% uplift in performance proves it is not a good multithreading benchmark.
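One way to put a number on that scaling loss is Amdahl's law. That is a model swapped in here for illustration, not something Cinebench documents. A rough sketch, assuming the 88.375% per-core figure quoted above for 8 cores and assuming all of the loss comes from a serial (blocking) portion of the work; the function names are hypothetical:

```python
def amdahl_speedup(serial_fraction, n_cores):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)


def implied_serial_fraction(speedup, n_cores):
    """Invert Amdahl's law: serial fraction that would explain an observed speedup."""
    return (n_cores / speedup - 1.0) / (n_cores - 1)


if __name__ == "__main__":
    n = 8
    observed = 0.88375 * n  # speedup implied by 88.375% per core on 8 cores
    s = implied_serial_fraction(observed, n)
    print(f"observed speedup: {observed:.2f}x on {n} cores")
    print(f"implied serial fraction: {s:.2%}")
    print(f"check, Amdahl speedup with that fraction: {amdahl_speedup(s, n):.2f}x")
```

Under those assumptions the serial share comes out just under 2% of the work, which shows how little blocking it takes to lose that much scaling at 8 cores.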
 
Joined
Jan 31, 2010
Messages
5,543 (1.02/day)
Location
Gougeland (NZ)
System Name Cumquat 2021
Processor AMD RyZen R7 7800X3D
Motherboard Asus Strix X670E - E Gaming WIFI
Cooling Deep Cool LT720 + CM MasterGel Pro TP + Lian Li Uni Fan V2
Memory 32GB GSkill Trident Z5 Neo 6000
Video Card(s) Sapphire Nitro+ OC RX6800 16GB DDR6 2270Cclk / 2010Mclk
Storage 1x Adata SX8200PRO NVMe 1TB gen3 x4 1X Samsung 980 Pro NVMe Gen 4 x4 1TB, 12TB of HDD Storage
Display(s) AOC 24G2 IPS 144Hz FreeSync Premium 1920x1080p
Case Lian Li O11D XL ROG edition
Audio Device(s) RX6800 via HDMI + Pioneer VSX-531 amp Technics 100W 5.1 Speaker set
Power Supply EVGA 1000W G5 Gold
Mouse Logitech G502 Proteus Core Wired
Keyboard Logitech G915 Wireless
Software Windows 11 X64 PRO (build 23H2)
Benchmark Scores it sucks even more less now ;)
I can remember years ago, when first getting into PCs, some motherboards for the 286 CPU had 2 sockets on them: 1 for the x86 integer CPU and 1 for the x87 FPU. Did it work without the x87 FPU? Yes. Was it slower without it? Yes, but only in FPU-intensive tasks. So as far as I'm concerned, any CPU that has 2 x86 compute units is a dual-core CPU.

1x Module = 2x Integer CPU cores + 1 FPU core
4x Modules = 8 Integer CPU cores + 4 FPU cores

so technically an 8 core CPU if all you want to do is x86 integer operations
 
Joined
Oct 27, 2009
Messages
1,184 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
8 ALUs from memory. For integer ops it's an 8 core, for floating point it's a 4 core, simples. I personally would have called it an 8 thread CPU, not an 8 core. At the same time it's up to the customer to do some research. As a quad core it's decent perf, but for an 8 core it's kind of pathetic.
Negative, for floating point it is an 8 core... it had a double-wide floating point unit (AVX) that could operate 2x 128-bit FMAC at a time or 1 double wide. The problem was that the design just didn't work as intended and the shared scheduler hamstrung it. With scheduler changes in Windows it greatly improved and did fine on many multithreaded applications. Just because it had a poor design and an architectural bottleneck doesn't mean it isn't an 8 core.

" C-Ray, a simple raytracer designed to test the floating-point CPU performance "



i7 990X 6c/12t got a 6x improvement.
FX-8150 8c/8t also got a 6x improvement. No one disputes that it's a failed architecture, but the lawsuit is meritless... there are in fact 8 cores, both int and FP.

Bulldozer lost significant IPC compared to Magny-Cours or Thuban. On the server side, replacing a 12c Magny-Cours with a 16c Bulldozer yielded the same performance at the same clock.
Bulldozer was on a newer node, used less power and scaled to higher clocks. I kept the Cinebench crown with 48 Magny-Cours cores till bricktown (Intel 4P Ivy) came out (60c/120t). 48c Magny-Cours at 3.8GHz beat 64c Interlagos (Bulldozer take 2) at 4.2GHz... then it was gobstomped by bricktown lol.

That said, FP was not half but more like 75% efficient; it was painfully bottlenecked. You can be mad that it was a shit architecture, but you cannot claim the cores weren't there... they clearly were.
Don't mean to be rude, it is just a greatly misunderstood architecture. It went backwards from Magny-Cours... and then made 5-10% gains per refresh while Intel was making 20% IPC uplifts.
 
Joined
Feb 3, 2017
Messages
3,756 (1.32/day)
Processor Ryzen 7800X3D
Motherboard ROG STRIX B650E-F GAMING WIFI
Memory 2x16GB G.Skill Flare X5 DDR5-6000 CL36 (F5-6000J3636F16GX2-FX5)
Video Card(s) INNO3D GeForce RTX™ 4070 Ti SUPER TWIN X2
Storage 2TB Samsung 980 PRO, 4TB WD Black SN850X
Display(s) 42" LG C2 OLED, 27" ASUS PG279Q
Case Thermaltake Core P5
Power Supply Fractal Design Ion+ Platinum 760W
Mouse Corsair Dark Core RGB Pro SE
Keyboard Corsair K100 RGB
VR HMD HTC Vive Cosmos
The public and industry understands a "core" as these components;
1. A Control Bus/Control Logic.
2. An Instruction Bus.
3. An Address/Data Bus which is usually connected to a Load/Store Unit.
4. A datapath, this the ALU/AGU.

The Bulldozer module has;
2 Retire Queues -> Instruction Bus.
2 Schedulers(etc componentry) -> Control Logic.
2 clusters of 2 ALU/2 AGLUs -> Superscalar datapath
2 Address/Data buses which interconnect to a Load/Store unit. -> Address/Data Bus
Thus,
2 Cores.

The Bulldozer module by industrial and educational definition is two real/processing/physical cores.

Front-end of the module <-- not part of the core.
FPU in the module <-- not part of the core.
Shared L2 cache unit in the module <-- not part of the core.

The cores in the Bulldozer module are as independent as any fully replicated microprocessor.
The public and industry understand a core as a unit that can take instructions from a set (in this case x86), execute them and get the compute results out. The front end is definitely part of the core. The gist of it is - you cannot take that integer core out and use it as a functional x86 CPU.

By the way, this lines up with how software treats a core as well as how a core logically should be treated. Blaming Microsoft here is shortsighted; they clearly went by AMD's suggestions in showing Bulldozer as 8 cores, a decision which had to be changed later. Linux changed the OS-level scheduling far quicker and with fewer arguments.
Split Bulldozer Module;
2x 2K L2 BTB
2x 256K L1 BTB
2x Branch Predictor
2x 32 KB L1i
2x 16B Fetcher/Prefetcher
2x IBB/Pick
2x 2-wide decode
2x 2 ALU/2AGU
2x 1 FMAC+1 FMMX(FMISC/FSTORE)
2x LSU
2x 16 KB L1d
2x 1 MB L2

You can easily split a Bulldozer module and get two functional cores. However, those two cores would utilize more space than the two-core module. Which in turn would provide less performance than the Bulldozer module.
No, you can't split a Bulldozer module into two functional cores. In a Bulldozer module there is one branch predictor, one fetcher, one decoder, one L2 cache, etc.
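The disagreement comes down to which blocks are counted per integer cluster and which per module. A rough sketch of that bookkeeping, using only the unit counts claimed in this thread; the class, field and function names are made up for illustration, and the counts are the posters' claims, not a verified die breakdown:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    per_module: int  # copies inside one Bulldozer module, as claimed in the posts above


# Counts taken from the posts in this thread, not from AMD documentation.
MODULE = [
    Block("branch predictor", 1),
    Block("fetch", 1),
    Block("decode", 1),
    Block("FPU", 1),
    Block("L2 cache", 1),
    Block("retire queue", 2),
    Block("integer scheduler", 2),
    Block("ALU/AGU cluster", 2),
]


def shared_blocks(blocks):
    """Blocks that exist once per module, so both integer clusters depend on them."""
    return [b.name for b in blocks if b.per_module == 1]


def duplicated_blocks(blocks):
    """Blocks that exist once per integer cluster."""
    return [b.name for b in blocks if b.per_module == 2]


def chip_totals(blocks, modules):
    """Total count of each block on a chip with the given number of modules."""
    return {b.name: b.per_module * modules for b in blocks}


if __name__ == "__main__":
    print("shared per module:", ", ".join(shared_blocks(MODULE)))
    print("one per integer cluster:", ", ".join(duplicated_blocks(MODULE)))
    print("4-module chip:", chip_totals(MODULE, 4))
```

Whether you call the result 8 cores or 4 depends on whether fetch and decode have to belong to a core for it to count, which is exactly the point being argued.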
 

qubit

Overclocked quantum bit
Joined
Dec 6, 2007
Messages
17,865 (2.88/day)
Location
Quantum Well UK
System Name Quantumville™
Processor Intel Core i7-2700K @ 4GHz
Motherboard Asus P8Z68-V PRO/GEN3
Cooling Noctua NH-D14
Memory 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz)
Video Card(s) MSI RTX 2080 SUPER Gaming X Trio
Storage Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB
Display(s) ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible)
Case Cooler Master HAF 922
Audio Device(s) Creative Sound Blaster X-Fi Fatal1ty PCIe
Power Supply Corsair AX1600i
Mouse Microsoft Intellimouse Pro - Black Shadow
Keyboard Yes
Software Windows 10 Pro 64-bit