Friday, November 6th 2015

AMD Dragged to Court over Core Count on "Bulldozer"

This had to happen eventually. AMD has been dragged to court over misrepresentation of its CPU core count in its "Bulldozer" architecture. Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.

The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would. Dickey is suing for damages, including statutory and punitive damages, litigation expenses, pre- and post-judgment interest, as well as other injunctive and declaratory relief as is deemed reasonable.
Source: LegalNewsOnline

511 Comments on AMD Dragged to Court over Core Count on "Bulldozer"

#126
FordGT90Concept
"I go fast!1!11!1!"
Pill Monster: Intel simply state multiple threads can run on one core, and not simultaneously.
You underlined simultaneously on the screenshot.


@lilhasselhoffer: Most of that I covered already but I want to be very clear about something. In AMD's slides, they always say "8 integer cores" (which accurately describes the product), and everywhere that isn't an engineering slide, they omit that important word "integer." FX-8### prominently displays "8-core" on the box, FX-6### prominently displays "6-core" on the box, and FX-4### prominently displays "4-core" on the box. That's an outright lie. It doesn't have 8 cores; it has 4 cores with "8 integer cores." AMD is going to get nailed for false advertising. The plaintiff can easily make the argument that if everyone who bought it received half the processor they thought they were going to get, the rest of the plaintiff's charges fall into place:
Consumer Legal Remedies Act:
(a) The following unfair methods of competition and unfair or deceptive acts or practices undertaken by any person in a transaction intended to result or which results in the sale or lease of goods or services to any consumer are unlawful:
◦(8) Disparaging the goods, services, or business of another by false or misleading representation of fact.
California’s Unfair Competition Law
17200. As used in this chapter, unfair competition shall mean and
include any unlawful, unfair or fraudulent business act or practice
and unfair, deceptive, untrue or misleading advertising and any act
prohibited by Chapter 1 (commencing with Section 17500) of Part 3 of
Division 7 of the Business and Professions Code.
Fraud:
1903. Negligent Misrepresentation

Tony Dickey claims he was harmed because AMD negligently misrepresented an important fact. To establish this claim, Tony Dickey must prove all of the following:

1. That AMD represented to Tony Dickey that an important fact was true;

2. That AMD’s representation was not true;

3. That although Tony Dickey may have honestly believed that the representation was true, AMD had no reasonable grounds for believing the representation was true when AMD made it;

4. That AMD intended that Tony Dickey rely on this representation;

5. That Tony Dickey reasonably relied on AMD’s representation;

6. That Tony Dickey was harmed; and

7. That Tony Dickey’s reliance on AMD’s representation was a substantial factor in causing his harm.
breach of express warranty:
(1)The warranty of fitness for a particular purpose
negligent misrepresentation:
A judgment that may be rendered in a contract misrepresentation case involving false statements that induced one party to enter into a contract. In negligent representation, the defendant is judged not to have known that the statements made were false, but not to have had reasonable grounds for believing they were true.
unjust enrichment:
The retention of a benefit conferred by another, that is not intended as a gift and is not legally justifiable, without offering compensation, in circumstances where compensation is reasonably expected.
As I said before, I can't see AMD winning. Their exclusion of the word "integer" is misleading to the point of being fraud and they did so knowing it wasn't an accurate statement.
BiggieShady: Behold

from the xbitlabs article www.xbitlabs.com/articles/cpu/display/sandy-bridge-microarchitecture_3.html
Also important for intel architectures since nehalem is ring interconnect bus for l3 cache

from the same article www.xbitlabs.com/articles/cpu/display/sandy-bridge-microarchitecture_4.html
I posted that earlier. I was talking more about Haswell, Devil's Canyon, and Skylake. I can only find pictures of old architectures.
behrouz: www.cpu-monkey.com/en/cpu-intel_core_i7_2600k-6
www.cpu-monkey.com/en/cpu-amd_fx_8350-7
www.cpu-monkey.com/en/cpu-intel_core_i7_5820k-440

Cinebench R11.5, 64bit (Multi-Core)
Intel core i7 2600k = 6.83
AMD FX-8350 = 6.94

-------------------------------------
Cinebench R11.5, 64bit (Single-Core)
Intel core i7 2600k = 1.66
AMD FX-8350 = 1.11

Multithreading doesn't mean linear scaling, but my calculation shows that the AMD FX-8350 acts as an 8-core with very poor IPC. If AMD's single-core score were 1.66, its Cinebench R11.5, 64bit (Multi-Core) score would be 10.378, almost 52% faster than the Core i7 2600K.
2600K (95W) is a quad core. FX-8350 (125W) barely edges out (1.6%! hardly noteworthy) Intel's competitive quad core in multithreading (it should be 70-90% faster if it were really an 8-core). AMD's quad core falls way behind in single-threaded performance. Even going off your theoretical 52%, that's a lot closer to Intel's Hyper-Threading Technology (boosts 30% in some benchmarks) than an actual 8-core processor (70-90%). AMD FX-8### is not an 8-core. There's no empirical data to prove that it is. It is a quad-core with a more advanced version of SMT.

And that link repeatedly proves my point: At no point does FX-8### look like an actual 8-core processor in benchmarks. It looks like a quad-core with SMT.
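
For reference, the arithmetic behind both sides of this exchange can be reproduced with a short Python sketch. It uses only the four Cinebench R11.5 scores quoted above; the variable and function names are illustrative, and treating "multi-core score divided by single-core score" as a scaling factor is a simplification that ignores clock and turbo differences between the two runs.

```python
# Cinebench R11.5 scores as quoted in the post above.
scores = {
    "i7-2600K": {"single": 1.66, "multi": 6.83},
    "FX-8350": {"single": 1.11, "multi": 6.94},
}


def scaling(cpu):
    """Multi-core score divided by single-core score for one CPU."""
    entry = scores[cpu]
    return entry["multi"] / entry["single"]


fx_scaling = scaling("FX-8350")
intel_scaling = scaling("i7-2600K")

# behrouz's hypothetical: FX-8350 scaling applied to the 2600K's single-core score.
hypothetical_multi = scores["i7-2600K"]["single"] * fx_scaling

# Actual and hypothetical multi-core lead of the FX-8350 over the 2600K.
fx_lead = scores["FX-8350"]["multi"] / scores["i7-2600K"]["multi"] - 1
hypothetical_lead = hypothetical_multi / scores["i7-2600K"]["multi"] - 1

print(f"FX-8350 scaling:  {fx_scaling:.2f}x over one thread")
print(f"i7-2600K scaling: {intel_scaling:.2f}x over one thread")
print(f"Actual FX-8350 lead: {fx_lead:.1%}")
print(f"Hypothetical score:  {hypothetical_multi:.3f} ({hypothetical_lead:.0%} lead)")
```

The same numbers support both readings: the FX-8350 scales further from one thread to eight than the 2600K does, yet its actual multi-core lead is under 2%.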
NC37: Well, either way, at the end of the day... AMD will still have the first consumer 8 core once Zen comes out. Unless Intel decides to jump on it too. They did finally bring in a 6 core. Can't expect Intel to sit quietly as AMD unleashes an 8 core 16 thread monster on them.

Actually, I hope Intel does because right now I suspect AMD wouldn't price it competitively enough unless Intel had something for them to undercut.
What is 5960X? And if you believe AMD's BS (which clearly you don't), AMD put out the first 8-core consumer CPU in 2011 (FX-8150).
#127
Xuper
FordGT90Concept: 2600K (95W) is a quad core. FX-8350 (125W) barely edges out (1.6%! hardly noteworthy) Intel's competitive quad core in multithreading (it should be 70-90% faster if it were really an 8-core). AMD's quad core falls way behind in single-threaded performance. Even going off your theoretical 52%, that's a lot closer to Intel's Hyper-Threading Technology (boosts 30% in some benchmarks) than an actual 8-core processor (70-90%). AMD FX-8### is not an 8-core. There's no empirical data to prove that it is. It is a quad-core with a more advanced version of SMT.

And that link repeatedly proves my point: At no point does FX-8### look like an actual 8-core processor in benchmarks. It looks like a quad-core with SMT.
Nope, it's not a 4-core with SMT. I can't call it advanced SMT. Whether you try or not, you cannot label it as SMT; SMT is a different story. There is no such term as "advanced version of SMT" in the CPU world. You are the one defining it that way, based on a definition of a core taken from Intel's processors.
#128
FordGT90Concept
"I go fast!1!11!1!"
I prefer the term "hybridized simultaneous multithreading." Instead of the two threads being funneled into one pipeline, they generally stay in their own pipelines. The pipelines are inseparable, however, which makes the entire package a core.

A core, in the context of CPUs and GPUs, usually refers to a complete computation unit that exists more than once in multiprocessor designs, each individually programmable with discrete outputs. A Bulldozer "module" fits that definition; an "integer core" does not.
#129
Xuper
You say that because you compare it to the Intel Core i7-5960X Haswell-E 8-core; you think that if it were an 8-core, it should at least match the Intel Core i7-5960X or come close. In other words, your reference is Intel. Even the Phenom II 1070 is worse than the Core i3-4370.
Imagine that Intel is dead! Imagine that Intel went to God, lives alongside our great-grandfathers, and watches us. All your definitions are based on Intel. I bet if Bulldozer's performance was near the Intel 5960X you wouldn't bring this flame war into this thread.
FordGT90Concept: It doesn't have 8 cores; it has 4 cores with "8 integer cores."
You can have 8 Int + 4 FPU (256-bit) or 8 Int + 8 FPU (128-bit).
#130
VulkanBros


So that means that what the OS is reporting is wrong? It clearly says Cores: 4, Logical processors: 8
#131
rtwjunkie
PC Gaming Enthusiast
moproblems99: Were you this mad when nvidia put 4GB of memory on the 970 but only 3.5GB were useful?
Why would he be? Look at his GPU in System Specs. But yes, as a general principle he took issue with that too.

I just do NOT understand why people assume and imply fanboyism just because they see someone making an argument against a product from a particular brand. It's crazy!
#132
FordGT90Concept
"I go fast!1!11!1!"
behrouz: I bet if Bulldozer's performance was near the Intel 5960X you wouldn't bring this flame war into this thread.
If an 8-core Bulldozer behaved like an 8-core processor, I'd be laughing at Dickey.
VulkanBros: So that means that what the OS is reporting is wrong? It clearly says Cores: 4, Logical processors: 8
Windows says what it sees and that is absolutely correct. It's only AMD that is pulling everyone's leg.


Took some digging but finally found a Phenom (K10) block diagram and die shot:
www.tomshardware.com/reviews/spider-weaves-web,1728-2.html
#133
Pill Monster
FordGT90Concept: You underlined simultaneously on the screenshot.

Are you taking the piss or what? It says runs applications simultaneously. Not related to SMT in any way shape or form.


#134
Xuper
I have a CPU that doesn't have an FPU. How many cores does it have? You made the FPU your reference.
VulkanBros: So that means that what the OS is reporting is wrong? It clearly says Cores: 4, Logical processors: 8
Why does Windows say "AMD FX(tm)-9590 Eight-Core Processor"? Windows 10 should say 4 modules, not 4 cores.
#135
FordGT90Concept
"I go fast!1!11!1!"
Pill Monster: Are you taking the piss or what? It says runs applications simultaneously. Not related to SMT in any way shape or form.
*sigh* Virtually every article written about SMT mentions Intel Hyper-Threading Technology by name.

Here are three scholarly articles:
meseec.ce.rit.edu/eecc722-fall2012/722-9-3-2012.pdf page 2
www.d.umn.edu/~salu0005/smt.pdf page 21 (lower right corner)
www.cs.washington.edu/research/smt/index.html

Arstechnica (cached): webcache.googleusercontent.com/search?q=cache:DVYvpVnXe9sJ:arstechnica.com/features/2002/10/hyperthreading/+&cd=1&hl=en&ct=clnk&gl=us
behrouz: I have a CPU that doesn't have an FPU. How many cores does it have? You made the FPU your reference.
What processor do you have? If it was made after 1995, it most likely does have a dedicated FPU in each core (excluding Bulldozer's definition of "core," of course).
behrouz: Why does Windows say "AMD FX(tm)-9590 Eight-Core Processor"? Windows 10 should say 4 modules, not 4 cores.
What's the difference besides AMD's marketing? It is disingenuous on three fronts: calling integer clusters "cores," calling two integer clusters and an FPU a "module" when it is really a core, and calling the core a "module" when it is not modular (certainly no more modular than every other core out there).
#136
Uplink10
VulkanBros: So that means that what the OS is reporting is wrong? It clearly says Cores: 4, Logical processors: 8
I am a bit skeptical about Windows. I have a 1 TB drive and it shows 931.51 gigabytes when it is actually 931.51 gibibytes.
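
The gigabyte/gibibyte mismatch described here is easy to check numerically. The Python sketch below converts a byte count into decimal gigabytes (10^9 bytes) and binary gibibytes (2^30 bytes). The exact capacity of the poster's drive is not stated in the thread; the figure of 1,000,204,886,016 bytes is an assumption (a capacity commonly reported for drives sold as 1 TB), chosen because it yields the 931.51 figure quoted.

```python
GB = 10**9    # decimal gigabyte, the unit on the drive's box
GIB = 2**30   # binary gibibyte, the unit Windows computes but labels "GB"


def to_gb_and_gib(size_bytes):
    """Return the same byte count expressed in GB and in GiB."""
    return size_bytes / GB, size_bytes / GIB


# Exactly one decimal terabyte.
nominal_gb, nominal_gib = to_gb_and_gib(10**12)

# Assumed capacity of a typical "1 TB" drive (not stated in the thread).
assumed_bytes = 1_000_204_886_016
drive_gb, drive_gib = to_gb_and_gib(assumed_bytes)

print(f"Nominal 1 TB:  {nominal_gb:.2f} GB = {nominal_gib:.2f} GiB")
print(f"Assumed drive: {drive_gb:.2f} GB = {drive_gib:.2f} GiB")
```

Under that assumption the drive really does hold about 1000.20 GB, and the 931.51 shown by Windows is the same capacity measured in GiB.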
#137
Xuper
LOL! He believes module = core. Good luck.

You don't get what I'm saying. What I want to tell you is this: you made the FPU your reference, and that's why you said 4 cores with 8 integer cores. A core can differ based on the architecture. There is no defined standard, not even anything close to a commonly accepted standard. Based on my CPU's architecture, I can define a CPU as 4 cores containing 4 integer units with just one FPU that is capable of running 4 FPU threads.
You're trying hard.
#138
Pill Monster
FordGT90Concept: *sigh* Virtually every article written about SMT mentions Intel Hyper-Threading Technology by name.

Here are three scholarly articles:
A scholarly article on SMT mentioned Intel. So what? Could you be any more vague?

And how is this related to you associating SMT with multitasking in Windows, as in your last post?


Have you got something specific to point out? Because it seems like a strawman escape from the battle.
Tbh I thought better of you.......
#139
FordGT90Concept
"I go fast!1!11!1!"
behrouz: I can define a CPU as 4 cores containing 4 integer units with just one FPU that is capable of running 4 FPU threads.
Negative. Cores are complete compute units. It would be classified as a single core with 4 threads per core by anyone that isn't AMD.

@Pill Monster: Since you clearly don't like scholarly articles, try Wikipedia on for size: en.wikipedia.org/wiki/Simultaneous_multithreading
All of the above explain SMT in detail. Some describe Hyper-Threading in detail.
#141
FordGT90Concept
"I go fast!1!11!1!"
The design of UltraSPARC T1 is completely different from Bulldozer. Namely, the FPU isn't directly attached/associated with any core. It's more akin to the FPU co-processors from circa-1990. Everything that isn't an integer, it outsources to the FPU via the processor crossbar. There is no sharing of resources inside each core besides cache.

Block diagram of core (note each core accepts 4 threads; not SMT, it only works on one thread at once but rapidly switches between them):


Processor layout:


Do realize that SPARC processors are specifically engineered for databases. It was discussed previously in this thread.

UltraSPARC T1 is a true 8 core, 32 thread processor.

Edit: JBUS...HA!
#142
Xuper
Whether you like it or not, we are talking about the definition of a core. Like I said, I can define a core myself based on my architecture. I can say that the design of Bulldozer is completely different from Intel Haswell. There is no rule that a core should have a dedicated FPU, or a shared FPU, or execute at least one FP instruction per cycle. AMD never said Bulldozer has 8 FP cores!
8 cores != 8 FPUs
Period.
#143
Aquinus
Resident Wat-man
behrouz: Whether you like it or not, we are talking about the definition of a core. Like I said, I can define a core myself based on my architecture. I can say that the design of Bulldozer is completely different from Intel Haswell. There is no rule that a core should have a dedicated FPU, or a shared FPU, or execute at least one FP instruction per cycle. AMD never said Bulldozer has 8 FP cores!
8 cores != 8 FPUs
Period.
An x86 CPU isn't really an x86 CPU without integer cores. If there is dedicated hardware for driving an integer core, then I would call that a core. The fact that floating point math can be written in software to be done on an integer core (and is, in embedded applications that lack FPUs) is reason enough (for me) to say that the shared FPU is not a significant enough factor to exclude a "core" designation. Add to that how well going from 4 threads to 8 threads scales on an FX CPU compared with an i7, and it shows very clearly that they're real cores. HT will never give you near-linear scaling where FX cores do (for purely parallel workloads).

If 4 FPUs aren't enough for you, then GPGPU could probably be your friend.
#144
FordGT90Concept
"I go fast!1!11!1!"
FPU isn't the only component shared. The entire instruction decoder and associated L1 cache covers FPU and both integer clusters. The only thing that makes Bulldozer unique is the fact it has two integer clusters instead of one big one. The whole cohesive unit is still a core.


HT better utilizes existing hardware. It doesn't add much hardware to accomplish that. Bulldozer, on the other hand, added a lot of hardware to accelerate SMT. This is why Bulldozer benefits more from heavy multithreaded load but you're still better off having an actual 8 core (or even a 6 core, as Phenom II X6 demonstrates).
#145
bobjr94
By his logic, he is also going to need to sue companies like Microsoft: Windows says it has 8 cores, falsely reporting the core count. So does CPU-Z.
#146
FordGT90Concept
"I go fast!1!11!1!"
Windows 8 and newer say FX-8### and FX-9### have 4 cores and 8 logical processors. Microsoft is not falsely reporting the core count; AMD is, on its retail packaging.
#147
Aquinus
Resident Wat-man
FordGT90Concept: FPU isn't the only component shared. The entire instruction decoder and associated L1 cache covers FPU and both integer clusters. The only thing that makes Bulldozer unique is the fact it has two integer clusters instead of one big one. The whole cohesive unit is still a core.
The only shared components are the fetch/decode units, L1 instruction cache, L2 cache, and FPU.

Come Steamroller, AMD went from a 4-way decoder to two 2-way decoders, each of which serves either one of the integer cores or the floating point unit, which leaves the fetch unit, L1i, L2, and the FPU.

The Core 2 had a shared L2 cache and it is considered to have two cores, so I consider the L2 argument moot, which leaves the fetch unit, the L1i, and the FPU.

As for the fetch unit, testing seems to indicate that it is not a bottleneck and that improving it won't yield much tangible benefit:
Agner’s tests, however, may shed some light on the problem. According to his work, the fetch units on Bulldozer, Piledriver, and Steamroller, despite being theoretically capable of handling up to 32 bytes (16 bytes per core) tops out in real-world tests at 21 bytes per clock. This implies that doubling the decode units couldn’t help much — not if the problem is farther up the line. Steamroller does implement some features, like a very small loop buffer, that help take pressure off the decode stages by storing very small previously decoded loops (up to 40 micro-instructions), but the fact that doubling up on decoder stages only modestly improved overall performance implies that significant bottlenecks still exist.
source

So that would leave the L1i and the FPU. The FPU is undoubtedly shared, I'm not denying that, and the L1i cache is shared because that makes sense when the fetch units are also shared. So that leaves just the L1i and FPU as shared components that may make a difference.

What blows my mind is that people forget that AMD went from the Phenom II being able to execute 3 integer operations per clock cycle to two on the current architecture, which could have some serious implications for purely integer code. However, I think the source I provided earlier seems to sum it up best:
According to Agner, “Two of the pipes have all the integer execution units while the other two pipes are used only for memory read instructions and address generation (not LEA), and on some models for simple register moves. This means that the processor can execute only two integer ALU instructions per clock cycle, where previous models can execute three. This is a serious bottleneck for pure integer code. The single-core throughput for integer code can actually be doubled by doing half of the instructions in vector registers, even if only one element of each vector is used.”

This has been the case since Bulldozer debuted, but issues here could explain why integer performance on Steamroller is so low compared to other cores. This is where things become frustratingly opaque: each of the areas we’ve identified could be the principal bottleneck, or it’s possible that the bottleneck is a combination of multiple factors (long pipelines, low fetch, cache collisions and low integer performance).
I'm not disagreeing that Bulldozer's performance sucks (that's why I got my 3820), but I'm not convinced that it's the shared components rather than skimpy dedicated components that are impacting performance. Zen, having a beefier integer core, very well might make up for the shortcomings of the dedicated components in these CPUs.

That's my only point. There is nothing to stop the dedicated hardware from being the bottleneck, even more so if they chopped it down to fit two of any given component in.

With that all said, I still think the really long pipeline is probably the main issue.
#148
FordGT90Concept
"I go fast!1!11!1!"
The lawsuit is about false advertising...
LegalNewsLine: In claiming that its new Bulldozer CPU had “8-cores,” which means it can perform eight calculations simultaneously, AMD allegedly tricked consumers into buying its Bulldozer processors by overstating the number of cores contained in the chips. Dickey alleges the Bulldozer chips functionally have only four cores, not eight, as advertised.
...with lacking performance used as evidence of being damaged...
LegalNewsLine: The suit alleges AMD built the Bulldozer processors by stripping away components from two cores and combining what was left to make a single “module.” In doing so, however, the cores no longer work independently. As a result, Dickey argues that AMD’s Bulldozer CPUs suffer from material performance degradation, and cannot perform eight instructions simultaneously and independently as claimed. He alleges that average consumers in the market for computer CPUs lack the requisite technical expertise to understand the design of AMD's processors and trust the company to convey accurate specifications regarding its CPUs. Because AMD did not convey accurate specifications, Dickey argues that tens of thousands of consumers have been misled into buying Bulldozer CPUs that cannot perform the way a true eight-core CPU would.
...everyone and their dog knows Bulldozer performance was underwhelming. The lawsuit explicitly targets AMD marketing FX-8### and FX-9### as having twice the number of cores when the guts simply aren't there to make two complete cores.

Note: I would dispute the underlined statements. There are circumstances where it can work on 8 threads simultaneously.
#149
Pill Monster
Aquinus: Yeah but that's not because the L2 is shared, that's because AMD sucks at making fast SRAM cache stores. The Core 2 chips had a shared L2 between two full cores and they didn't suck. :p
Well, admittedly I'm speculating somewhat there. ;) Though I will say the hotfix was released to avoid a performance hit from L2 sharing in lightly threaded workloads.


The OS default is sequential assignment, which is not ideal for PD because with up to 4 threads they run on cores 0, 1, 2, 3, which are the first 2 modules. The scheduling was updated to 0, 2, 4, 6, 1, 3, 5, 7 so that up to 4 threads would each have exclusive access to fetch/decode, L2, etc.
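
The two assignment orders described in this post can be illustrated with a small Python sketch. This is not the Windows scheduler or the hotfix itself; it is a toy model that assumes 4 modules with 2 cores each, numbered so that cores 2m and 2m+1 belong to module m, and the function names are hypothetical.

```python
def sequential_order(n_modules, cores_per_module=2):
    """Default order: fill core 0, 1, 2, ... regardless of module."""
    return list(range(n_modules * cores_per_module))


def module_first_order(n_modules, cores_per_module=2):
    """Module-aware order: first core of every module, then the second cores."""
    return [
        module * cores_per_module + core
        for core in range(cores_per_module)
        for module in range(n_modules)
    ]


def modules_used(order, n_threads, cores_per_module=2):
    """Which modules are occupied when n_threads are placed in this order."""
    return sorted({core // cores_per_module for core in order[:n_threads]})


default = sequential_order(4)
updated = module_first_order(4)

print("default order:", default, "-> modules for 4 threads:", modules_used(default, 4))
print("updated order:", updated, "-> modules for 4 threads:", modules_used(updated, 4))
```

With four threads, the default order packs them into two modules, while the module-first order gives each thread a module to itself, so none of them shares fetch/decode or L2 with another.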


Man this spellcheck is pissing me off... anyway, if there's any increase in performance it's not noticeable to me either way... but I have noticed something in SuperPi. SuperPi on one core is much faster than on 8, like about 100x faster.
So there's some food for thought.

I know AMD can't match Intel for latency or bandwidth, but wtf, BD/PD cache access is 4 times slower than Phenom or Athlon??? I looked at my old Phenom times: 5ms L3 access with a 2400 MHz IMC.
PD is around 30ms at 2800 MHz wtf 0_o lol


Does anyone have a fix for spellcjeck not wotkinng?
#150
HumanSmoke
Pill Monster: Does anyone have a fix for spellcjeck not wotkinng?
Secondary school English classes?

j/k