Friday, November 6th 2015
AMD Dragged to Court over Core Count on "Bulldozer"
This had to happen eventually. AMD has been dragged to court over misrepresentation of its CPU core count in its "Bulldozer" architecture. Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.
The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would. Dickey is suing for damages, including statutory and punitive damages, litigation expenses, pre- and post-judgment interest, as well as other injunctive and declaratory relief as is deemed reasonable.
Source:
LegalNewsOnline
511 Comments on AMD Dragged to Court over Core Count on "Bulldozer"
@lilhasselhoffer: Most of that I covered already but I want to be very clear about something. In AMD's slides, they always say "8 integer cores" (accurately describes the product) and everywhere that isn't an engineering slide, they omit that important word "integer." FX-8### prominently displays "8-core" on the box, FX-6### prominently displays "6-core" on the box, and FX-4### prominently displays "4-core" on the box. That's an outright lie. It doesn't have 8 cores; it has 4 cores with "8 integer cores." AMD is going to get nailed for false advertising. The plaintiff can easily make the argument that if everyone that bought it received half the processor they thought they were going to get, the rest of the plaintiff's charges fall into place:
And that link repeatedly proves my point: At no point does FX-8### look like an actual 8-core processor in benchmarks. It looks like a quad-core with SMT. What is 5960X? And if you believe AMD's BS (which clearly you don't), AMD put out the first 8-core consumer CPU in 2011 (FX-8150).
A core, in the context of CPUs and GPUs, usually refers to a complete computation unit that exists more than once in multiprocessor designs--each individually programmable with discrete outputs. Bulldozer "module" fits that definition, not "integer core."
Think that Intel is dead! Think that Intel went to God and lives along with our great grandfather and watches us. All your definitions are based on Intel. I bet if Bulldozer's performance was near the Intel 5960X you wouldn't bring this flame war into this thread. You can have 8 Int + 4 FPU 256 or 8 Int + 8 FPU 128.
So that means that what the OS is reporting is wrong? It clearly says Cores: 4, Logical processors: 8
I just do NOT understand why people assume and imply fanboyism just because they see someone making an argument against a product in a brand name? It's crazy!
Took some digging but finally found a Phenom (K10) block diagram and die shot:
www.tomshardware.com/reviews/spider-weaves-web,1728-2.html
Here's three scholarly articles:
meseec.ce.rit.edu/eecc722-fall2012/722-9-3-2012.pdf page 2
www.d.umn.edu/~salu0005/smt.pdf page 21 (lower right corner)
www.cs.washington.edu/research/smt/index.html
Arstechnica (cached): webcache.googleusercontent.com/search?q=cache:DVYvpVnXe9sJ:arstechnica.com/features/2002/10/hyperthreading/+&cd=1&hl=en&ct=clnk&gl=us What processor do you have? If it was made after 1995, it most likely does have a dedicated FPU in each core (excluding Bulldozer's definition of "core," of course). What's the difference besides AMD's marketing? It is disingenuous on three fronts: calling integer clusters "cores," calling two integer clusters and an FPU a "module" when it is really a core, and calling the core a "module" when it is not modular (certainly no more modular than every other core out there).
You don't get what I'm saying. I want to tell you: you made the FPU unit your reference, and that's why you said 4 cores with 8 Int cores. A core can be different based on different architectures. There is no defined standard, not even close to a commonly accepted standard. Based on my CPU's architecture, I can define a CPU as 4 cores that contains 4 Int units with just one FPU unit that is capable of running 4 FPU threads.
You're trying hard.
And how is this related to you associating SMT with multitasking in Windows, as in your last post?
Have u got something specific to point out because it seems like a strawman escape from the battle.
Tbh I thought better of you.......
@Pill Monster: Since you clearly don't like scholarly articles, try Wikipedia on for size: en.wikipedia.org/wiki/Simultaneous_multithreading
All of the above explain SMT in detail. Some describe Hyper-Threading in detail.
en.wikipedia.org/wiki/UltraSPARC_T1
Block diagram of core (note each core accepts 4 threads; not SMT, it only works on one thread at once but rapidly switches between them):
Processor layout:
Do realize that SPARC processors are specifically engineered for databases. It was discussed previously in this thread.
UltraSPARC T1 is a true 8 core, 32 thread processor.
Edit: JBUS...HA!
8 Core != 8 FPU
Period.
If 4 FPUs isn't enough for you then, GPGPU probably could be your friend.
HT better utilizes existing hardware. It doesn't add much hardware to accomplish that. Bulldozer, on the other hand, added a lot of hardware to accelerate SMT. This is why Bulldozer benefits more from heavy multithreaded load but you're still better off having an actual 8 core (or even a 6 core, as Phenom II X6 demonstrates).
Come Piledriver, AMD went from a 4-way decoder to two 2-way decoders which both serve up either one of the integer cores or the floating point unit, which leaves the fetch unit, L1i, L2, and the FPU.
The Core 2 had a shared L2 cache and it is considered to have two cores, so I consider the L2 argument moot, which leaves the fetch unit, the L1i, and the FPU.
As for the fetch unit, testing seems to indicate that it is not a bottleneck and that improving it won't yield much tangible benefit: source
So that would leave the L1i and the FPU. The FPU is undoubtedly shared, I'm not denying that, and the L1i cache is shared because it makes sense when the fetch units are also shared. So that leaves just L1i and FPU for shared components that may make a difference.
What blows my mind is that people forget that AMD went from the Phenom II being able to execute 3 integer operations per clock cycle to two on the current architecture, which could have some serious implications for purely integer code. However, I think the source I provided earlier seems to sum it up best: I'm not disagreeing that Bulldozer's performance sucks, which is why I got my 3820, but I'm not convinced that it's the shared components rather than skimpy dedicated components that are impacting performance. Zen, having a beefier integer core, very well might make up for the shortcomings of the dedicated components in these CPUs.
That's my only point. There is nothing to stop the dedicated hardware from being the bottleneck, even more so if they chopped it down to fit two of any given component in.
With that all said, I still think the really long pipeline is probably the main issue.
Note: I would dispute the underlined statements. There are circumstances where it can work on 8 threads simultaneously.
OS default is sequential assignment, which is not ideal for PD because up to 4 threads run on 0,1,2,3, which are the first 2 modules. The scheduling was updated to 0,2,4,6, 1,3,5,7 so up to 4 threads would all have exclusive access to fetch/decode, L2, etc.
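To make the two assignment orders concrete, here is a minimal sketch. It assumes logical CPUs 2m and 2m+1 belong to module m (which matches 0,1,2,3 being the first 2 modules); the function names are made up for illustration and are not any real scheduler API.

```python
def sequential_order(num_modules, cores_per_module=2):
    """Old behaviour: fill logical CPUs 0,1,2,3,... in order."""
    return list(range(num_modules * cores_per_module))


def module_spread_order(num_modules, cores_per_module=2):
    """Updated behaviour: first core of every module, then the second cores."""
    return [
        module * cores_per_module + core
        for core in range(cores_per_module)
        for module in range(num_modules)
    ]


def modules_used(order, num_threads, cores_per_module=2):
    """Which modules end up busy when num_threads are placed in this order."""
    return sorted({cpu // cores_per_module for cpu in order[:num_threads]})


if __name__ == "__main__":
    # 4 modules = what is sold as an "8-core" FX chip
    print(module_spread_order(4))
    print(modules_used(sequential_order(4), 4))
    print(modules_used(module_spread_order(4), 4))
```

With 4 threads, the sequential order packs them into two modules (so they share fetch/decode and L2), while the spread order gives each thread a module to itself.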
Man this spellcheck is pissing me off... anyway if there's any increase in performance it's not noticeable to me either way.... but I have noticed something in SuperPi. SuperPi on one core is much faster than on 8, like about 100x faster.
so there's some food for thought..
I know AMD can't match Intel for latency or bandwidth but wtf, BD/PD cache access is 4 times slower than Phenom or Athlon??? I looked at my old Phenom times, 5ms L3 access with 2400mhz IMC.
PD is around 30ms at 2800mhz wtf 0_o lol
Does anyone have a fix for spellcjeck not wotkinng?
But
j/k