Friday, November 6th 2015
AMD Dragged to Court over Core Count on "Bulldozer"
This had to happen eventually. AMD has been dragged to court over misrepresentation of its CPU core count in its "Bulldozer" architecture. Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely advertising the core count in its latest CPUs, and contended that because of the way they're physically structured, AMD's 8-core "Bulldozer" chips really only have four cores.
The lawsuit alleges that Bulldozer processors were designed by stripping away components from two cores and combining what was left to make a single "module." In doing so, however, the cores no longer work independently. Due to this, AMD Bulldozer cannot perform eight instructions simultaneously and independently as claimed, or the way a true 8-core CPU would. Dickey is suing for damages, including statutory and punitive damages, litigation expenses, pre- and post-judgment interest, as well as other injunctive and declaratory relief as is deemed reasonable.
Source: LegalNewsOnline
511 Comments on AMD Dragged to Court over Core Count on "Bulldozer"
Windows 8.1 and newer see "sockets," "cores," and "logical processors." Is Windows wrong? That's why I asked. Linux may be in agreement with Windows that FX-6350 is a tri-core.
And I apologize; Windows 8.1 and 10 are the only ones that list them as logical, which is merely a function of the task scheduler.
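The "sockets / cores / logical processors" distinction the posts above argue about is something the OS exposes directly. A minimal sketch of how a tool could read it on Linux, parsing the `siblings` (logical processors per package) and `cpu cores` fields of `/proc/cpuinfo`; the sample text below is illustrative, not a dump from a real FX-6350:

```python
# Hypothetical, truncated /proc/cpuinfo-style sample for illustration only.
SAMPLE_CPUINFO = """\
processor : 0
siblings  : 6
cpu cores : 3

processor : 1
siblings  : 6
cpu cores : 3
"""

def parse_topology(cpuinfo_text):
    """Return (logical_processor_entries, cores_per_package) from cpuinfo text."""
    logical = 0
    cores = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            logical += 1          # each "processor" stanza is one logical CPU
        elif key == "cpu cores":
            cores = int(value)    # physical cores the kernel reports per package
    return logical, cores

logical, cores = parse_topology(SAMPLE_CPUINFO)
print(logical, cores)  # 2 3  (two stanzas in this truncated sample, 3 cores)
```

Note that whether the kernel counts a Bulldozer module as one core or two is exactly what it reports here, which is why people on both sides of this thread cite the same tools as evidence.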
News to me :laugh:
It's a frivolous lawsuit.
The GTX970 comparison isn't comparable; nvidia outright lied about things such as ROP/TMU count. The 3.5 + 0.5 memory is up in the air.
AMD sold you 8 cores, and you got 8 cores you can assign work to. When I encode 8 music tracks at once on my "4 core" FX-8320, I get double the encoding performance as I do encoding 4 tracks at once. Shocker.
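The experiment described above, eight independent jobs dispatched at once, can be sketched like this. The `encode` function is a trivial stand-in for a real encoder (a real CPU-bound encoder would use separate processes to occupy eight cores, not threads):

```python
from concurrent.futures import ThreadPoolExecutor

def encode(track_id):
    # Stand-in for encoding one track; returns a fake "result" so the
    # dispatch pattern is visible. Not actual audio code.
    return track_id * 2

# Eight workers, eight independent jobs: each job can be assigned to its
# own hardware thread, which is the behavior the poster is describing.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = sorted(pool.map(encode, range(8)))
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14]
```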
I count four. Top left, top right, bottom left, bottom right.
Here's an actual 8-core Xeon 7500 series:
Like this:
You can read more about where this die shot comes from here: www.theregister.co.uk/2012/11/05/amd_opteron_6300_server_chip/
So, yeah, that die you say you can see 8 cores in. It is an 8-core bulldozer die, not a 16. So you just admitted that it is 8-Cores, not 4. Ooooooops.
I see 8 in your picture, not 16.
As to my mistake, I do see 8 components, two in each white box but there's no clear line separating them; hence, I see four cores.
Here's the post: www.techpowerup.com/forums/threads/amd-dragged-to-court-over-core-count-on-bulldozer.217327/page-3#post-3367260
So YES, modules are a hardware construct as far as the kernel is concerned, but so are the cores. And no, they are not invisible to the kernel; to the kernel it's all the same thing: address 1/2/3, that's it. The kernel does not ask the processor: "Hey, are you a core or a module?"
You remind me of the people when hyperthreading came out, saying: "look, look, I have 8 processors," and we responded: "no, they're not processors, they're logical processors, and two of them map to each physical core, but Windows shows them as processors." And they argued endlessly because they didn't know anything about what they were talking about.
That's what we are trying to explain to you: it works much the same way as hyperthreading, except that at the end, instead of 8 "logical cores" alternating between two entry points on the same physical core, each thread goes to a real physical core. Instead of one physical core filling holes in its pipeline by dispatching two threads at the same time, a module dispatches two threads across its two physical cores at the same time. It behaves like hyperthreading, but it is not hyperthreading.
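The pairing described above can be shown with a toy placement function. The assumption here is a chip the kernel sees as pairs of hardware threads sharing resources (two SMT threads on one core, or two integer cores in one Bulldozer module); either way, a scheduler prefers to spread threads across pairs before doubling up on any one pair:

```python
def assign_threads(n_threads, n_pairs=4):
    """Place n_threads onto 2*n_pairs hardware threads, one per pair first.

    Toy model only: real schedulers weigh cache affinity, load, and power.
    """
    placement = []  # list of (pair_index, slot_index)
    for t in range(n_threads):
        pair = t % n_pairs   # spread across pairs/modules first...
        slot = t // n_pairs  # ...then fill each pair's second slot
        placement.append((pair, slot))
    return placement

# Four threads land on four different pairs, so nothing is shared:
print(assign_threads(4))  # [(0, 0), (1, 0), (2, 0), (3, 0)]
# Eight threads fill both slots of every pair, so resources are shared:
print(assign_threads(8))
```

The scheduling policy is identical whether the second slot is an SMT thread or a second integer core, which is why the two designs look so similar from software, even though the hardware underneath differs.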
If you want it explained by someone in engineering:
Edit: And now I see another problem. The line on the second from the top big box needs to be shifted to the left a little bit. Got links for that? FTFY
I called it "hybridized simultaneous multithreading." It's not Hyper-Threading Technology, but it also isn't a plain dual core; it's something in between. At bare minimum, they should have called it "8-core*" with fine print saying it has 8 integer clusters and explaining what that means in layman's terms. Nothing new in there.
Tell me: by that definition, does the Intel 8080 or 8088 not contain a CPU core because it lacks an FPU?
You come here and pretend you know more than people who have studied far more than you. This is what I do for a living: I study how programs compile on multicore processors, how the cores behave, how the threads behave, how the kernel handles the threads, how GCC handles the threads, and then how the compiled program handles its threads. Then I optimize the sources of the package and usually make recommendations to the kernel and GCC dev teams on how the CFLAGS behave on a precise architecture and microarchitecture.
I can tell that,
CFLAGS="-march=bdver2 -mmmx -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -maes -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mtbm -mavx -msse4.2 -msse4.1 -mlzcnt -mf16c -mprfchw -mfxsr -mxsave --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2 -fstack-protector-strong -O2 -pipe"
will do a good job at handling all cores and producing quality binaries.
But in fact,
CFLAGS="-march=bdver2 -mmmx -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -maes -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mtbm -mavx -msse4.2 -msse4.1 -mlzcnt -mf16c -mprfchw -mfxsr -mxsave --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2 -fstack-protector-strong -O3 -pipe" is too much vectorization for -mavx: the FP scheduler can't keep up, and the FPU becomes a burden that outweighs the CPU cycles you gain by vectorizing with AVX instead of SSE. I program in assembly; I invoke the instructions myself and watch how the processor behaves as I thread it to the max, to find the breaking point.
You don't even know how the scheduler works. Even AMD's engineers took a long time to understand how to handle such a complex task. They got even closer with the Excavator architecture, but it would have taken too much time to perfect; meanwhile Intel keeps improving its IPC to the max, and they can't compete. People want more power.
What they learned with Bulldozer is not wasted, because they are introducing an ARM co-processor into their processors, which will also have to be shared.
Just go check en.wikipedia.org/wiki/ARM_big.LITTLE
The big.LITTLE architecture is marketed as octa-core, even though each Cortex-A7 is paired with a Cortex-A15 in a virtual core module; in that configuration they share the same VFPv4 floating-point unit, and the Cortex-A7 is mainly used for low-power Thumb-2 instructions. It is not an uncommon practice, and no engineer will ever tell you those two cores inside the module are just one. Not a single one.
If you are interested in seeing a real operating system take full advantage of the AMD cores, I have begun making some videos of distributed compilation, after which I study in detail how the processor performed. 25 years of experience in engineering.
For me the debate ends here. If you are really interested in knowing how wrong you are, go study the x86 instruction set, and more precisely the x87 subset and how it is implemented in modern processors. When you are ready in a few years, we can have this conversation again.
The 8088 was also small enough that a system could be wired to support more than one on the same board. Software merely didn't support it.
Edit: Also, die shot:
I see 8 distinct processors. If this goes to court, it will be legally defined. Been over all that already. FPUs have been intrinsic to x86 design since the mid-1990s. A core has been established as a complete processor (as in: take a uniprocessor, take away the memory controller, put it on a die with another uniprocessor, then add back the memory controller and a bus to communicate with the rest of the system) since circa 2006. You're using excuses from technological lifetimes ago to justify what AMD did. AMD should know better: if you're going to redefine things, you'd better make it damn clear on the box. AMD is going to lose this, and lose hard. The case against AMD is stronger than the case against Seagate, which Seagate lost.
en.wikibooks.org/wiki/Microprocessor_Design/Multi-Core_Systems
You can't tell me IBM is wrong. The PPE doesn't have any FPU; it hands floating-point work off to the external SPEs, which do those calculations outside the PPE. Yet the PPE and the SPEs are both called cores, because that's what they are: cores. Even though you can't do anything alone with an SPE, it's still called a core, seen as a core, and behaves like a core. The processor is considered a 9-core part.
You can't even dare to say you know better than IBM what does or does not constitute a core.
Or take Nvidia. They claim they have 2560 CUDA cores in their GTX 1080, when all each "core" contains is an FPU, and there are 128 of those cores in a module with only a scheduler and a dispatcher, and all it can do is floating-point operations in parallel.
No one, no company, can define with precision what constitutes a core. If AMD decides a module is two Piledriver cores fused together with one shared FPU, then it is. Even if they had no FPU at all, it would still be two cores in one module.
I'd argue GPUs don't have cores because they can't function without a CPU to give them orders. They are always incomplete. They're more like SPEs (co-processors) than PPEs (processors). That said, AMD, Intel, NVIDIA, ARM, etc. play fast and loose with what they call most of the components of their GPUs. One can't reasonably draw a box around a broad component and call that something universal. Case in point: compare the above to GCN:
Not much agreement on what to call anything. That said, NVIDIA chose to mimic GCN with Pascal to improve Pascal's virtual reality performance.
The nature of the thread has dictated CPU layout. GPU layout has changed drastically over the years.
So, you can make the same case until you're blue in the face, but the simple fact is that x86 is nothing without integer cores and can operate just fine without an FPU; that all general-purpose CPUs now come with FPUs is beside the point. Do AMD and Nvidia GPUs only have one core because they only have one UVD (or similar) block for video playback? Every GPU nowadays has one, so why not? It's not mandatory for the operation of a GPU, but most people get a lot of use out of video acceleration. The FPU is no different. It makes your life easier, but it is in no way required for the operation of x86. Whether it's connected by an external bus or sits on the same die, it's still a co-processor.
Simply put, there is no such thing as an x86 CPU capable of only doing floating point math while lacking integer ALUs and compute. That alone should tell anyone that the FPU is not a dictator for what should be considered a compute core. Once again x87, SSE, MMX, and all the other floating point math extensions are exactly that, extensions.
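The point above, that floating-point hardware is an extension rather than a requirement, is exactly why "soft float" compilation existed for x86 machines without an x87 chip: float math can be emulated with nothing but integer ALU operations. A minimal fixed-point sketch of the idea (Q16.16 format is an assumption chosen for illustration):

```python
SCALE = 1 << 16  # Q16.16 fixed-point: 16 fractional bits

def to_fixed(x):
    """Convert a Python float to a Q16.16 integer representation."""
    return int(round(x * SCALE))

def fixed_mul(a, b):
    # One integer multiply plus one shift stands in for an FPU multiply.
    return (a * b) >> 16

a = to_fixed(1.5)
b = to_fixed(2.25)
product = fixed_mul(a, b) / SCALE  # convert back only to display the result
print(product)  # 3.375
```

Every operation on `a` and `b` above is plain integer arithmetic, the kind any x86 integer cluster can execute, which is the sense in which the FPU "makes your life easier" without being required.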
The only reason x86 doesn't include the x87 standard is that the two go back to distinct processors decades ago. Most architectures designed after the creation of the IEEE 754 standard include FPU features as standard; one example is IA-64. The only reason x87 remains an extension is backwards compatibility. In practice, it is not separate: every processor built on x86 in the last decade, almost two, has an FPU, even Bulldozer.
This is all very irrelevant anyway, because a core is a core, not an integer cluster. AMD, at best, is going to settle, which means they don't admit guilt to misleading the public. At worst, it will go to court, AMD will lose, and they'll likely have to pay out hundreds of millions or billions for making consumers think they got twice what they actually got.