Friday, November 6th 2015
AMD Dragged to Court over Core Count on "Bulldozer"
This had to happen eventually. AMD has been dragged to court over misrepresentation of its CPU core count in its "Bulldozer" architecture. Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.
The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would. Dickey is suing for damages, including statutory and punitive damages, litigation expenses, pre- and post-judgment interest, as well as other injunctive and declaratory relief as is deemed reasonable.
Source:
LegalNewsOnline
511 Comments on AMD Dragged to Court over Core Count on "Bulldozer"
An FPU cannot function as a processor of any kind by itself. Integer math is a requirement for any modern machine, whether used personally or in servers. Even GPUs, which are designed to do massively parallel floating-point computations, must be able to do integer math because floating point means nothing without it. Is it really so hard to comprehend that a CPU can exist without an FPU but can't exist without integer logic?
Also, IBM's POWER7 has four DP FPUs per core and can do SMT with up to 4 threads per core. The dedicated FPUs didn't make it a core, but the singular pairs of ALUs and AGUs did. How is that any different from the reverse case? If I recall correctly, multi-core POWER CPUs have shared instruction decode logic that gets put on to queues for each core. So not only does it have dedicated FPUs contained within a single "core", it has shared logic for all of the cores to dispatch instructions. By your logic, the POWER7 is a one-core CPU because it shares resources between all of the cores, but could be 4 times as many cores because of the number of FPUs.
Either way, even if BD had more FPUs or a beefier FPU, I think people would still have called foul on the terrible integer performance, which begins with single-threaded applications running alone. AMD hoped that more cores would offset the degradation of IPC, but they were wrong. Haswell's integer core has twice as many ALUs as BD and one more AGU. That alone should tell you something.
Simple fact is that AMD told the public that Bulldozer was going to have a 256-bit FMA FPU per module. There was no deception. The problem is that most people don't know what the hell that means. People probably also don't know that their Intel CPU likely has dual-dispatch 256-bit FPUs per integer core. Different CPUs with different goals. That's it.
The FPU is like x87 where it is connected to a system bus (crossbar in UltraSPARC T1). It's a discrete processor that handles its own instructions with its own caches. It shares nothing with any core. In Bulldozer, one instruction decoder handles three components (FPU + two integer clusters). No processor exists before or since with that kind of layout. I never said it couldn't, but in recent history, every time it was done, it was considered an error in hindsight. Examples: UltraSPARC T1 had one FPU to 8 cores; UltraSPARC T2 moved the FPU into the 8 cores so there's a total of 8. Bulldozer and sons had one FPU per two integer clusters; Zen is moving to one FPU per core. Gimping the FPU is a great way to lose processor sales to the competition. So technically it can be done, but in application, it's foolish. Oh look, it's all packed into each core like expected:
Seriously, stop thinking so hard. It is very simple.
Long pipelines: Pentium 4 --USA -> Core I#
Short pipelines: Pentium M --Israel-> Core/Core 2 (I think it lives on today as Atom)
HTT was never technically gone--they just weren't launching new processors of its design because Netburst was a clusterfuck that took years to clean up. That said, I really don't get your line of thought with this comment.
It almost appears that it has at least two ALUs and two FPUs. And why not? With 8 threads in the core, it can certainly keep them busy. I got no problem with multiple integer clusters and floating point clusters inside a core. The point is, each one does not constitute a core--the whole of it does. Instruction to result, it never leaves the core. The same should be said of Bulldozer's "module."
Looks to me that the instruction dispatcher is shared between 4 fixed point units, and it's all inside the core boundary. Since it's already shared, isn't what really matters how wide it is, i.e. how many instructions per clock it can dispatch? How is this different from having a single double-wide dispatcher outside the core boundaries shared between two cores?
The answer is, it doesn't matter. This POWER7 core could be split into 2 weaker cores that would be less superscalar on their own; each would need more cycles for wider instructions, but they would be truly two independent, if weaker, cores.
I do see the similarities between that and Bulldozer, yet IBM calls it what it is: a core. AMD does not. Like I said, all data points to AMD lying to make the processors look better next to Intel.
To be very clear: I have no issue with Bulldozer's design. I have an issue with AMD doubling the "core" count.
Single-threaded performance is peripheral to the lawsuit. Yeah, it isn't the best but there's really nothing misleading about that part. AMD struggled in that department since Intel has prioritized it. Because the whole of it is one core--not a component inside. If IBM called those two "Fixed Point Units" "cores," I'd be as up in arms over that as I am over Bulldozer. But they didn't because sense. If only AMD had sense.
Zen is going to have 4 ALUs and 2 AGUs. Does that redefine what a core is? Nope, it just increases the amount of parallelism the processor is capable of. Adding a second integer cluster does the same damn thing (not a "core").
Remember that Bulldozer was AMD's first attempt at simultaneous multithreading. First try was pretty bad (Bulldozer) and they improved it with each iteration, but they couldn't fundamentally fix the blocking problems and poor single-threaded performance. Zen throws out Bulldozer's ideas and replaces them with HTT-like simultaneous multithreading. I'm not expecting AMD's Zen SMT performance to match HTT because Intel has a lot of practice. At least it is a step in the right direction.
8 Intel cores are going to beat 8 Bulldozer "cores." Intel is going to charge you a lot more for the privilege though.
Diagrams above showed 75% gain at best, 25% at worst, not "near 100%" (that would be a real dual core, not a hybrid like Bulldozer is). AMD sacrificed single-threaded performance for that though where Intel did not for 0-50% gain.
www.bit-tech.net/hardware/cpus/2011/10/12/amd-fx-8150-review/2
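To put the scaling figures quoted above into "core-equivalent" terms, here is a rough back-of-the-envelope sketch. It assumes throughput scales linearly across modules (or cores) and that one thread running alone on a unit counts as 1.0; the function name and the 4-unit chip are illustrative assumptions, not anything AMD or Intel published.

```python
def chip_throughput(units, second_thread_gain):
    """Throughput of a chip in single-thread units.

    units: number of modules (Bulldozer) or physical cores (HTT).
    second_thread_gain: extra throughput from the second thread on a
        unit, as a fraction (0.75 means +75%).
    Assumes perfectly linear scaling across units (a simplification).
    """
    return units * (1.0 + second_thread_gain)


# Gains quoted in the comment above: 25% to 75% per Bulldozer module,
# 0% to 50% for an HTT core. A 4-unit chip is assumed for both.
for label, gain in [
    ("Bulldozer module, worst case", 0.25),
    ("Bulldozer module, best case", 0.75),
    ("HTT core, worst case", 0.0),
    ("HTT core, best case", 0.5),
]:
    print(f"{label}: {chip_throughput(4, gain):.1f} single-thread units")
```

Under these assumptions a 4-module chip lands between 5 and 7 single-thread units rather than 8, which is the gap the "near 100%" argument is about. It says nothing about how fast one single-thread unit is on either vendor's chip.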