Friday, November 6th 2015

AMD Dragged to Court over Core Count on "Bulldozer"

This had to happen eventually. AMD has been dragged to court over misrepresentation of its CPU core count in its "Bulldozer" architecture. Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.

The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would. Dickey is suing for damages, including statutory and punitive damages, litigation expenses, pre- and post-judgment interest, as well as other injunctive and declaratory relief as is deemed reasonable.
Source: LegalNewsOnline

511 Comments on AMD Dragged to Court over Core Count on "Bulldozer"

#376
FordGT90Concept
"I go fast!1!11!1!"
Compare two of Intel's "cores" to two of AMD's "cores"...
The only time two threads within a Bulldozer Module could clash...
...is an impossibility in the former. Hyper-Threading Technology does not add "cores." The comparison is therefore invalid.

AMD should have called:
-core -> integer cluster
-module -> core
...lawsuit never would have happened because the description matches the product.
Posted on Reply
#377
Aquinus
Resident Wat-man
FordGT90Concept: Compare two of Intel's "cores" to two of AMD's "cores"...

...is an impossibility in the former. Hyper-Threading Technology does not add "cores." The comparison is therefore invalid.

AMD should have called:
-core -> integer cluster
-module -> core
...lawsuit never would have happened because the description matches the product.
So your entire argument is based upon not being able to dispatch two 256-bit (quad-precision) FP ops at the same time? I'm sure you're running full precision AVX all the time. :laugh:

Once again, you're not addressing the real problem which is the width of the FPU, not the fact that it's shared. When I said re-read my post, I did mean the entire thing, including the second half of it.
Aquinus: The argument falls apart when you consider what would happen if AMD had doubled the width of the single FPU (not added a second one) per module and the impact that would have had on floating point performance. I'm willing to bet that you would instantly make up the difference, but that still doesn't fix the integer cores, which is where a lot of performance is lost. Once again, the class action makes it sound like Bulldozer sucks because it has a shared FPU when it's really because it has gimped FPUs. Sharing it was smart; slimming it down was not. A similarly clocked Intel quad core will have double the floating point performance of an "8 core" BD chip. It also happens to be the case (as I said before) that the FPU per module is half the width of the FPU on K12 and SB through at least Haswell. If BD had FPUs that were twice as wide, they would still be shared, but if you consider the clocks that BD runs at, you make up some of that difference, and floating point performance would line up more with a 6c Intel CPU instead of somewhere between a dual-core and quad-core Intel chip at the same clock.

Simply put, you could still have an FPU on every core, but if each FPU is half as wide as the current per-module one, you're still stuck with the same crappy performance because your ability to dispatch hasn't improved. In any streaming SIMD task with floating point data, the wider FPU at a given clock speed will always be faster than a narrower one, because half the width means twice as many cycles to do the same thing, and fewer cycles to complete a task means better IPC. So despite having twice as many FPUs, the reduced width of each unit harms overall throughput.

tl;dr: Doubling the width of the already shared FPU would have the same performance characteristics as doubling the number of FPUs at the current width, which is reason alone to reject the "it's not 8 cores" claim based strictly on the FPU itself. Simply put, caveat emptor.
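
The width-versus-count point above can be sketched with a toy throughput model. This is only an illustration under stated assumptions: the function name `peak_elements_per_cycle`, the 128-bit and 256-bit widths, and the "one full-width operation per FPU per cycle" rule are all hypothetical simplifications, not measured Bulldozer or Intel figures, and the model ignores dispatch, scheduling, and memory effects.

```python
def peak_elements_per_cycle(fpu_count: int, width_bits: int, element_bits: int = 64) -> int:
    """Peak floating point elements processed per clock in a toy model.

    Assumption: every FPU retires one full-width vector operation per cycle.
    """
    lanes_per_fpu = width_bits // element_bits  # elements that fit in one vector
    return fpu_count * lanes_per_fpu


# Hypothetical configurations, chosen only to compare shapes, not real chips.
baseline = peak_elements_per_cycle(fpu_count=1, width_bits=128)
one_wide_fpu = peak_elements_per_cycle(fpu_count=1, width_bits=256)
two_narrow_fpus = peak_elements_per_cycle(fpu_count=2, width_bits=128)

print("baseline:", baseline)
print("one FPU, double width:", one_wide_fpu)
print("two FPUs, original width:", two_narrow_fpus)
```

In this simplified model the two upgrade paths give the same peak figure, which is the claim the tl;dr makes; real workloads may diverge from it depending on how well they vectorize.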
Posted on Reply
#378
FordGT90Concept
"I go fast!1!11!1!"
And "you're not addressing the real problem" of AMD overstating the capabilities of their processors.
Posted on Reply
#379
Aquinus
Resident Wat-man
FordGT90Concept: And "you're not addressing the real problem" of AMD overstating the capabilities of their processors.
Aquinus: Simply put, you could still have an FPU on every core, but if each FPU is half as wide as the current per-module one, you're still stuck with the same crappy performance because your ability to dispatch hasn't improved.
I'm at least making an effort unlike your witty remarks which aren't proving anything. The only point they're making is that you disagree with me.
Posted on Reply
#380
FordGT90Concept
"I go fast!1!11!1!"
There's already been enough proof provided in this thread that to do so again would simply be an exercise in repetition. AMD redefined "core" so it stands out on the shelf next to Intel (8 "cores" for $800 versus 4 cores for $1000). Everything about it stinks of misleading the public. The details really don't matter in the eyes of consumer protection law.
Posted on Reply
#381
eidairaman1
The Exiled Airman
No point in beating a dead horse. The CPUs have 8 physical cores in 4 modules; they share resources between 2 cores, not much different than any CPU sharing the cache.
Posted on Reply
#382
64K
Eventually someone is going to tire in this debate and be rolled away in a wheelchair.
Posted on Reply
#383
Aquinus
Resident Wat-man
eidairaman1: No point in beating a dead horse. The CPUs have 8 physical cores in 4 modules; they share resources between 2 cores, not much different than any CPU sharing the cache.
...and people don't seem to get that the problem is that the width of the FPU is 100% responsible for the poor performance, not the fact that there is only one of them. Bulldozer and Haswell both have the same number of FPUs and on a purely floating point benchmark, Haswell will be twice as fast because the width of the FPU is twice as big. So at any given clock speed, you'll see half the performance. Half the width, half the performance. That's not too hard to understand.
Posted on Reply
#384
FordGT90Concept
"I go fast!1!11!1!"
eidairaman1: No point in beating a dead horse. The CPUs have 8 physical cores in 4 modules; they share resources between 2 cores, not much different than any CPU sharing the cache.
Except that sharing caches is normal. L3, for example, is often accessible by all of the cores in the CPU. The only cache that usually isn't shared is L1. Each Bulldozer "module" has one L1 for instructions and two L1s for data. A single "core" has one L1 for instructions and one L1 for data. The lack of a second L1 instruction cache is another indicator the "module" is an extended "core," not a dual core.
Aquinus: ...and people don't seem to get that the problem is that the width of the FPU is 100% responsible for the poor performance, not the fact that there is only one of them. Bulldozer and Haswell both have the same number of FPUs and on a purely floating point benchmark, Haswell will be twice as fast because the width of the FPU is twice as big. So at any given clock speed, you'll see half the performance. Half the width, half the performance. That's not too hard to understand.
...evidence that AMD cheated consumers...
Posted on Reply
#385
Aquinus
Resident Wat-man
FordGT90Concept: ...evidence that AMD cheated consumers...
...but not evidence that it lacks 8 cores, which is what the lawsuit is about. I'm not disagreeing that AMD screwed consumers. I'm disagreeing with the statement that it doesn't have 8 cores.
Posted on Reply
#386
cdawall
where the hell are my stars
FordGT90Concept: Compare two of Intel's "cores" to two of AMD's "cores"...

...is an impossibility in the former. Hyper-Threading Technology does not add "cores." The comparison is therefore invalid.

AMD should have called:
-core -> integer cluster
-module -> core
...lawsuit never would have happened because the description matches the product.
A module isn't a core; it has two integer clusters. Mind finding me a core with two?

Seriously, your argument is that AMD used an undefined word differently. A module isn't a core; the only thing you can argue is the lack of an FPU per core, but guess what, it's still a core at that point.
Posted on Reply
#387
Aquinus
Resident Wat-man
cdawall: Seriously, your argument is that AMD used an undefined word differently. A module isn't a core; the only thing you can argue is the lack of an FPU per core, but guess what, it's still a core at that point.
...but Ford has a definition of a core and this doesn't agree with it. Are you telling me that doesn't make him right?! What a shocker. :laugh:

Edit: I don't want to make this a rant but, as a software engineer, there are a lot of cases where I opt for integers (fixed point,) over floating point values for reasons of performance, accuracy and precision. When you write software that translates something into your earnings or projected earnings, you don't want round-off error or any "lost" data, you want every penny. You want everything to add up to exactly what it's supposed to be, not just for earnings but, so you can confirm your data against an audit if people think you're full of shit. We can't be like Wells Fargo and screw people out of their fractions of a cent like on Office Space because people are scummy. In reality, you need control of that if you're going to be an ethical institution that isn't willing to lie about progress.
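
The round-off concern described above can be shown with a minimal sketch. The variable names and the `format_cents` helper are hypothetical, chosen only for illustration; the idea is simply to keep money as whole cents in integers instead of binary floating point.

```python
# Binary floating point cannot represent 0.10 exactly, so sums drift.
float_total = sum([0.10] * 3)

# The same amounts stored as integer cents add up exactly.
cents_total = sum([10] * 3)


def format_cents(cents: int) -> str:
    """Render a non-negative amount of cents as a dollars.cents string."""
    return f"{cents // 100}.{cents % 100:02d}"


print(float_total == 0.30)        # False: the float sum is slightly off
print(format_cents(cents_total))  # 0.30
```

Integer cents only cover quantities that divide evenly into the chosen unit; a real system would still need an explicit rounding rule for things like interest or tax.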

I just wanted to get that out there because from my perspective, floating point is a different animal than integer math altogether and I treat it completely differently. For that reason, I can't consider the FPU directly part of the core. It's important but, it's a special case to me.
Posted on Reply
#388
FordGT90Concept
"I go fast!1!11!1!"
That's what decimal128 is for. ;) It still incurs a performance penalty though.
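
As a rough illustration of the decimal approach, the sketch below uses Python's standard `decimal` module with precision set to 34 significant digits, which matches the digit count of the decimal128 format. This is an assumption for demonstration: Python's `decimal` is a software arbitrary-precision type, not a hardware decimal128 implementation, and the `add_dimes` helper is hypothetical.

```python
from decimal import Decimal, localcontext


def add_dimes(count: int) -> Decimal:
    """Add 0.10 to a running total `count` times using decimal arithmetic."""
    with localcontext() as ctx:
        ctx.prec = 34  # significant digits, mirroring decimal128
        total = Decimal("0")
        for _ in range(count):
            total += Decimal("0.10")
        return total


print(add_dimes(3))                     # 0.30
print(add_dimes(3) == Decimal("0.30"))  # True
```

Decimal arithmetic in software is generally slower than native integer or binary floating point math, which is in line with the performance penalty mentioned above.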


When AMD said "dual core" with FX-62, they didn't mean two integer clusters, they meant two complete processors in one package, each having its own instruction decoder, instruction cache, floating point cluster, integer cluster, and data cache. Intel followed suit with Pentium D. If AMD meant "core" was only the integer cluster, then why did it have two of everything relevant to processing all tasks relevant to x86 (especially instruction decoder and floating point)? That hardware represents standard features in x86 and has been since the 90s.

Logic would suggest having two FPUs per core would be better than two integer clusters per core because of the performance penalties FPUs incur. AMD instead did the opposite. They took the worst performing part of a processor and shared it with two threads without bolstering it. It was gimped from the day of conception.


"Core" is very well defined and certainly no court would accept your argument that it is "undefined."
Posted on Reply
#389
cdawall
where the hell are my stars
It's so defined you yourself had to find basically an urban dictionary of computers to do so.
Posted on Reply
#390
Prima.Vera
I see a lot of bla-bla-bla going on on the last 16 pages. However, everybody seems to miss the point of the article. Let's review it:
"Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.
The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would."

So he is suing AMD because of this statement. That's all. Personally I don't have CPU design experience so I will not take any side. However, if he wins the case, then he was right, and end of discussion. If not, the same.
Posted on Reply
#391
FordGT90Concept
"I go fast!1!11!1!"
cdawall: It's so defined you yourself had to find basically an urban dictionary of computers to do so.
Does the equivalent literature exist to say otherwise? Has anyone, other than AMD, published a document that defined what a "core" is?
Posted on Reply
#392
Prima.Vera
OK, one simple question and I'm gone. If AMD has indeed 8 cores, can you or can you not clock each individual core with a different frequency?
Posted on Reply
#393
BiggieShady
FordGT90Concept: Does the equivalent literature exist to say otherwise?
So the equivalent literature doesn't say otherwise which makes urban dictionary the best source for the definition which makes the term well defined.
Not getting your logic.
Posted on Reply
#394
Aquinus
Resident Wat-man
Now, it’s absolutely true that the Bulldozer family of products has had much lower single-thread performance than either previous AMD CPUs (in many cases) or Intel chips (in virtually all cases). But this lawsuit doesn’t appear to argue that AMD mismarketed its CPUs because single-threaded performance was weaker than expected, but because multi-threaded scaling was critically harmed by the decision to share various aspects of the underlying architecture. Weak single-threaded performance and high power consumption created a situation in which BD could neither hit its target clock frequencies nor its IPC targets. Critically, these issues do not disappear when the CPU is run in one-thread per module mode.

Dickey’s lawsuit is wrong on other areas of fact as well. Bulldozer does share a single FPU block per work unit, but consumer workloads are rarely FPU-heavy. Each CPU module does contain the eight integer pipelines you’d expect in a typical dual-core conventional chip (4 ALU + 4 AGU per module). Dickey refers to Bulldozer as being unable to “perform eight calculations simultaneously,” but this is imprecise, inexact language that does not reflect the complexity of how a CPU executes code. Bulldozer is absolutely capable of executing eight threads simultaneously, and executing eight threads on an eight-core FX-8150 is faster than running that same chip in a four-thread, four-module mode. Bulldozer can decode 16 instructions per clock (not eight) and it can keep far more than eight instructions in flight simultaneously.

This lawsuit essentially asks a court to define what a core is and how companies should count them. As annoying as it is to see vendors occasionally abuse core counts in the name of dubious marketing strategies, asking a courtroom to make declarations about relative performance between companies is a cure far worse than the disease. From big iron enterprise markets to mobile devices, companies deploy vastly different architectures to solve different types of problems. An eight-core, Cortex-A7-based, mobile SoC is a very different beast from an eight-core big.LITTLE Cortex-A57 / Cortex-A53 configuration. That chip is very different from an Oracle M7 or the SPARC T5. The T5 doesn’t pack the per-core performance of Intel’s 18-core Xeons, or IBM’s Power8.
www.extremetech.com/extreme/217672-analysis-amd-lawsuit-over-false-bulldozer-chip-marketing-is-without-merit
Posted on Reply
#395
cdawall
where the hell are my stars
Prima.Vera: OK, one simple question and I'm gone. If AMD has indeed 8 cores, can you or can you not clock each individual core with a different frequency?
You can; each core is independent for clocks.
Posted on Reply
#396
Aquinus
Resident Wat-man
cdawall: You can; each core is independent for clocks.
Can you really control the multiplier on each integer core independently of the other core in the module in the BIOS? If true, I would actually find that really interesting.
Posted on Reply
#397
FordGT90Concept
"I go fast!1!11!1!"
@MalakiLab claims it is possible to change the clockspeeds on the integer clusters, which raises the question: what speed are the FPU, instruction decoder, and so on running at? Also note in the picture how Linux calls the FX-6350 a tri-core.
BiggieShady: So the equivalent literature doesn't say otherwise which makes urban dictionary the best source for the definition which makes the term well defined.
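
For anyone wanting to check what Linux itself reports, a minimal sketch follows. It parses text in the `/proc/cpuinfo` layout and pulls out the `cpu cores` and `siblings` fields of the first processor entry. The `summarize_cpuinfo` name and the sample text are hypothetical; the sample is made up for illustration and is not output captured from an FX-6350.

```python
def summarize_cpuinfo(text: str) -> dict:
    """Return the first 'cpu cores' and 'siblings' values found in cpuinfo text."""
    summary = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("cpu cores", "siblings") and key not in summary:
            summary[key] = int(value.strip())
    return summary


# Made-up sample in the /proc/cpuinfo layout (assumption, not real output).
SAMPLE = """\
processor   : 0
model name  : Example six-thread CPU
siblings    : 6
cpu cores   : 3
"""

print(summarize_cpuinfo(SAMPLE))
```

On a real Linux machine the same function could be fed the contents of `/proc/cpuinfo`; how a given kernel version counts Bulldozer modules is not something this sketch asserts.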
Not getting your logic.
There's a plethora of samples that have everything packed into one unit collectively called a "core." When an entire industry does something the same across a wide variety of hardware, it becomes the definition of what that something is. Case in point: Oxford English Dictionary now recognizes the words 'Merica and YOLO. The definition of "core" was established with K8 X2, K10, Conroe, Penryn, Nehalem, Westmere, and Sandy Bridge (I'm leaving a lot out) before AMD redefined it with Bulldozer in late 2011. Its launch goes against the five years of established precedent. None of those processors shared anything required to process any type of data.
Posted on Reply
#398
BiggieShady
FordGT90Concept: When an entire industry does something the same across a wide variety of hardware
But the whole point here is that the entire industry has many different examples of cpu cores ... and you claim that it is the same, not different. That's pure fantasy.
If we look only at the x86 instruction set with extensions and the corresponding market shares today, then truly Intel's core is the de facto definition of what a core is :roll: but let's not do that, for science's sake.
Posted on Reply
#399
FordGT90Concept
"I go fast!1!11!1!"
No, the industry doesn't. What constitutes a core is fairly consistent in the last decade excluding Bulldozer. Even ARM tends to have the same components as x86 in each core. The only hardware that is remotely similar to Bulldozer is SPARC which is designed explicitly for databases. In SPARC, it has 8 instruction decoders for 8 cores. Even though the FPU is separate in SPARC, it behaves like an internal coprocessor. None of the cores claim ownership of it nor share hardware directly with it. AMD shares way too much hardware for an FX-8350 to be considered an octo-core.

I'm not talking about sales, just processor designs that exist.
Posted on Reply
#400
cdawall
where the hell are my stars
Aquinus: Can you really control the multiplier on each integer core independently of the other core in the module in the BIOS? If true, I would actually find that really interesting.
Depending on the BIOS and board, you can clock per core; inside the OS you can set clocks and voltages per core independently. I do it with my pair of 12 core optys.
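
One way to observe per-core clocks from inside Linux is to read the cpufreq files under sysfs. The sketch below only reads the current frequency each logical CPU reports; it does not set anything. It assumes a Linux system that exposes `cpuN/cpufreq/scaling_cur_freq` (values in kHz); the `read_core_frequencies` helper is hypothetical and returns an empty result where those files are absent.

```python
from pathlib import Path


def read_core_frequencies(root: Path = Path("/sys/devices/system/cpu")) -> dict:
    """Map each logical CPU name (e.g. 'cpu0') to its reported frequency in kHz."""
    freqs = {}
    for freq_file in root.glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        cpu_name = freq_file.parent.parent.name
        try:
            freqs[cpu_name] = int(freq_file.read_text().strip())
        except (OSError, ValueError):
            continue  # skip CPUs whose file is unreadable or malformed
    return freqs


for cpu, khz in sorted(read_core_frequencies().items()):
    print(f"{cpu}: {khz / 1000:.0f} MHz")
```

If two logical CPUs in the same module report different values, that is at least consistent with per-core clocking, though it says nothing about what the shared FPU or decoder is running at.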
FordGT90Concept: There's a plethora of samples that have everything packed into one unit collectively called a "core." When an entire industry does something the same across a wide variety of hardware, it becomes the definition of what that something is. Case in point: Oxford English Dictionary now recognizes the words 'Merica and YOLO (http://www.cnn.com/2016/09/12/health/oxford-new-words-trnd/index.html). The definition of "core" was established with K8 X2, K10, Conroe, Penryn, Nehalem, Westmere, and Sandy Bridge (I'm leaving a lot out) before AMD redefined it with Bulldozer in late 2011. Its launch goes against the five years of established precedent. None of those processors shared anything required to process any type of data.
If it was established why can't you find a definition?
Posted on Reply