Wednesday, January 23rd 2019

Bulldozer Core-Count Debate Comes Back to Haunt AMD

Jan 23rd, 2019 02:51 Discuss (369 Comments)

AMD launched the FX-8150 in 2011, the "world's first 8-core desktop processor," or so it says on the literal tin. AMD reached that core count of 8 with an unconventional CPU core design. The 8 cores are arranged in four sets of two cores each, called "modules." Each core has its own independent integer unit and L1 data cache, while the two cores in a module share the majority of their components: the front-end, a branch predictor, a 64 KB L1 code cache, a 2 MB L2 cache and, most importantly, an FPU. There was much debate across tech forums on what constitutes a CPU core.

Multiprocessor-aware operating systems had to be tweaked to address a "Bulldozer" processor properly. Their schedulers initially treated "Bulldozer" cores as fully independent (as conventional logic would dictate), until AMD noticed performance bottlenecks in multi-threaded applications. Eventually, Windows and various *nix kernels received scheduler updates that treat each module as a core, and each core as an SMT unit (a logical processor). The FX-8350 is a 4-core/8-thread processor in the eyes of Windows 10, for example. These updates improved the processors' performance, but not before consumers started noticing that their operating systems weren't reporting the advertised core count. In 2015, a class-action lawsuit was filed against AMD for false marketing of FX-series processors. The wheels of that lawsuit are finally moving: a 12-member jury is to be set up to examine what constitutes a CPU core, and whether an AMD FX-8000 or FX-9000 series processor can qualify as an 8-core chip.
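To make the scheduler change concrete, here is a minimal Python sketch of "treat each module as a core, and each core as a logical processor." It is only an illustration, not how Windows or any *nix kernel actually implements it. The function names `group_by_module` and `spread_threads` are hypothetical, and the sketch assumes logical CPUs are numbered so that the two cores of a module are adjacent (0 and 1 in module 0, 2 and 3 in module 1, and so on).

```python
from collections import defaultdict


def group_by_module(logical_cpus, cpus_per_module=2):
    """Map each module index to the logical CPUs it contains.

    Assumption (hypothetical numbering): sibling cores of one module
    have consecutive logical CPU numbers.
    """
    modules = defaultdict(list)
    for cpu in logical_cpus:
        modules[cpu // cpus_per_module].append(cpu)
    return dict(modules)


def spread_threads(num_threads, logical_cpus, cpus_per_module=2):
    """Pick CPUs for threads: one per module first, siblings only afterwards."""
    modules = group_by_module(logical_cpus, cpus_per_module)
    placement = []
    # Fill slot 0 of every module before touching slot 1 of any module,
    # so two threads share a module's front-end and FPU only when needed.
    for slot in range(cpus_per_module):
        for module_id in sorted(modules):
            siblings = modules[module_id]
            if slot < len(siblings):
                placement.append(siblings[slot])
    return placement[:num_threads]


fx_cpus = list(range(8))               # an "8-core" FX part
print(group_by_module(fx_cpus))        # four modules of two cores each
print(spread_threads(4, fx_cpus))      # four threads land on four different modules
```

With four runnable threads, this placement keeps every thread on its own module, which is the behavior the updated schedulers were aiming for; a scheduler that saw eight fully independent cores could just as easily put two threads on the same module while another module sat idle.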

US District Judge Haywood Gilliam of the District Court for the Northern District of California rejected AMD's claim that "a significant majority of" consumers understood what constitutes a CPU core, and that they had a fair idea of what they were buying when they bought AMD FX processors. AMD has two main options before it. The company can reach a settlement with the plaintiffs, which could cost it millions of dollars in compensation, or fight it out in the jury trial by trying to prove to 12 members of the public (not necessarily from an IT background) what constitutes a CPU core and why "Bulldozer" qualifies as 8-core silicon.

The plaintiffs and the defendant each have a key technical argument. The plaintiffs could point to operating systems treating 8-core "Bulldozer" parts as 4-core/8-thread (i.e. each module as a core and each "core" as a logical processor), while AMD could run multi-threaded floating-point benchmark tests to prove that a module cannot be reduced to the definition of a core. AMD's 2017 release of the "Zen" architecture marked a return to the conventional definition of a core, with each "Zen" core being as independent as an Intel "Skylake" core. We will keep an eye on this case.

Source: The Register


369 Comments on Bulldozer Core-Count Debate Comes Back to Haunt AMD

#201

londiste

Again. Public and industry understands core as a unit that can take instructions from a set (in this case x86), execute them and get the compute results out. Front end is definitely part of the core. The gist of it is - you cannot take that integer core out and use it as a functional x86 CPU.

#202

FordGT90Concept

"I go fast!1!11!1!"

Patriot

I waved a magic wand and turned that Thuban core into a pseudo Bulldozer core for you:

Two integer cores in one core. Not rocket science. Blue box aligns with the Thuban picture I posted.

#203

Patriot

londiste said: Again. Public and industry understands core as a unit that can take instructions from a set (in this case x86), execute them and get the compute results out. Front end is definitely part of the core. The gist of it is - you cannot take that integer core out and use it as a functional x86 CPU.

You can't take any single core out of any modern CPU. They are too highly integrated, sharing memory controllers and caches.
Being unable to separate highly integrated chips into individually working components is NOT the definition of a core.

Independent execution is.
If you wanted your core to work by itself you would have to add all those other components that have been optimized out.
Same way if you wanted to break a Bulldozer module apart.

FordGT90Concept said: I waved a magic wand and turned that Thuban core into a pseudo Bulldozer core for you:
Two integer cores in one core. Not rocket science. Blue box aligns with the Thuban picture I posted.

LOL 0/10, would get sued and lose.

You missed the 2nd FPU.
That only has one 128-bit FPU; Bulldozer has two.

Bulldozer was AMD's first AVX-supporting chip and they did something different with it.
It required two FPUs sharing resources that could also operate independently.

AMD's claim is: 8 real cores, and we did something different in the name of saving space to give you 8 cores.
It has 8 independently operating cores... and it does multi-threaded performance better than 4c/8t.
It is exactly as advertised.

I have tried to explain CPU architectures but you all clearly desire an argument more than the truth, Peace.

#204

FordGT90Concept

"I go fast!1!11!1!"

Patriot said: You missed the 2nd FPU unit.

Bulldozer only has one FPU but it is capable of SMT.

Seriously, this is irrelevant. Look at AMD's "Zen" Core slide again. They literally put it next to "Excavator" and it's devoid of any other uses of the word "core." It's plain as day that AMD acknowledges Excavator "modules" were, in fact, Excavator "cores," or they wouldn't compare it to Zen as they did.

#205

londiste

Patriot said: You can't take any singular cores out of any modern cpu. They are too highly integrated. Sharing memory controllers and caches.
Being unable to separate highly integrated chips into individually working components is NOT the definition of a core.

Independent execution is.

Independent execution is effectively the same point.

Memory controllers and caches are not part of the core, no more than FPUs. In the consumer space, the Athlon 64 was the CPU that moved the memory controller into the CPU; it used to be in the northbridge. Same with cache: there were CPUs (even x86 ones) without cache, and at first it sat outside the CPU.

#206

NdMk2o1o

Just started from page 3 about 30 mins ago... an oldie but a goodie and truly fitting for this thread I think :toast:

#207

FordGT90Concept

"I go fast!1!11!1!"

Patriot said: It has 8 independently operating cores......

They are not independent. If the fetcher fails in one module, both integer cores become unreachable. If the same scenario played out in Zen or Thuban, it would only be down one integer core because that's not a shared component (nothing is shared except memory subsystems which are native to computer design).

#208

Patriot

Yup, confirming, done, yall cant read worth shit.

But please, go actually try and learn what makes a cpu a cpu. And study the evolution of multiprocessor design and how the cores share resources.... it would be enlightening.

#209

Vya Domus

FordGT90Concept said: 2 integer cores != 2 cores

Don't you think something is awfully amiss here that you have to use the word "core" for both ?

FordGT90Concept said: If the fetcher fails in one module, both integer cores become unreachable.

That's just a lack of redundancy, doesn't mean anything in particular.

FordGT90Concept said: Bulldozer only has one FPU but it is capable of SMT.

It has two (technically three), per module. The diagrams have been posted to death by this point, seriously. SMT has nothing to do with any of this.

#210

Shambles1980

Vya Domus said: Don't you think something is awfully amiss here that you have to use the word "core" for both ?

i have to use the word dog for a dog and a dog fish..
must be the same right.

#211

seronx

FordGT90Concept said: Bulldozer only has one FPU but it is capable of SMT.

It has a single floating point core. With the capabilities of;
2x Lo 80-bit
2x Hi 64-bit
For FMACs Pipe0 and Pipe1. Lo0+Hi0 = P0 and Lo1+Hi1 = P1
2x Mid 128-bit
For MMXs Pipe2 and Pipe3.

To Steamroller/Excavator;
2x Lo 80-bit
2x Hi 64-bit
For FMACs Pipe0/Pipe1.
1x Mid 128-bit
For MMX Pipe2.

The units themselves are each a FPU. While, all of them are part of the whole floating point core. The FP core however can also be called the Floating Point Unit.

The cores in Bulldozer can also be called Integer clusters. The module can also be called a core. However, most of these distinctions are marketing.

Bulldozer via Industrial+Educational standards has
2x AMD64 cores
1x AMD64 floating-point core.

The cores don't execute x86-64, they execute an internal ISA. The cores are thus separate from the dispatch of those decoded instructions. The core begins at the instruction bus which is the retire queue and ends at the load/store which is the load/store buffers.

Even if AMD went from 2x LSU to 1x LSU there will still be two cores.

#212

FordGT90Concept

"I go fast!1!11!1!"

Even if it had "2x AMD64 cores" and "2x AMD64 floating-point cores", so long as the fetcher is shared, the whole is collectively referred to as the core (aka, processor), not the individual components you enumerated. Hell, drop it down to just "2x AMD64 cores" and forget the "AMD64 floating-point cores" as long as they share that fetcher, the whole is still considered a core.

If you look at how the word "core" is used in the context of processors, it is the lowest common denominator across all architectures. It describes the discrete hardware that takes an instruction with operands and turns it into a result: cache to cache. The fetcher is a critical component of that going back to at least the 80386:

Excavator "core" is therefore a singular core with two discrete ALUs handling two concurrent threads.

Jump ahead to Pentium 3 there's a fetcher per core:

If the CPU can't fetch an instruction to decode, it literally remains forever idle.

#213

seronx

FordGT90Concept said: as long as they share that fetcher

There is no shared fetching, and even if there was it would still be two cores.

Core 0 can't fetch Core 1's instructions.
Core 1 can't fetch Core 0's instructions.

Core 0 fetches 16B every cycle.
Core 1 fetches 16B every cycle.

#214

FordGT90Concept

"I go fast!1!11!1!"

The fetcher delegates threads and instructions. In the case of Excavator with both integer cores enabled, that's two sets of instructions per cycle. The fetcher decides which integer core gets to process it. For example, if the power state is lowered, it can request two instructions but place them both on the integer core in a serialized fashion. That has obvious advantages when running on one integer core because it translates to fewer cycles wasted on the memory subsystem. It's still a shared, critical component.

If you seriously think the fetcher isn't shared then you're telling me AMD doesn't know their own product (:laugh:). Refresher:

Let me get my van Gogh on again...

This is what a dual-core module would look like (mimics UltraSPARC T1):

You could eliminate the Fetcher/Decoder/FPU entirely from this schematic and it will still qualify as a dual core module because there are two discrete processors there. Likewise, you could clone the FPU, remove the fetcher/decoder for it, and place it under the control of each core's fetcher/decoder and you'd end up with a design very similar to Core 2 Duo.

If there was no FPU under the fetcher, the fetcher wouldn't fetch floating point operations by design. It has to be shared in order to load balance the shared FPU. If one thread is hammering FPU instructions, the processor is better off sending another FPU instruction heavy thread to an entirely different module.

#215

RichF

Garbage Class-Action Lawsuit Against AMD Bulldozer Is Headed to Trial
www.extremetech.com/computing/284335-the-garbage-class-action-lawsuit-against-amds-bulldozer-is-headed-to-trial

Joel Hruska wrote: What Dickey and Parmer are actually arguing is that Bulldozer/Piledriver (the FX-9590, specifically) did not deliver the performance they expected from an eight-core CPU relative to Intel CPUs. They argue that the shared resources in the Bulldozer core prevented the chip from “simultaneously multi-tasking” and that because resources were shared between the CPU cores, that Bulldozer “functionally only have four cores.” Both of these claims are factually wrong.

Joel Hruska wrote: A Bulldozer CPU core was different than a Thuban CPU core, or an Intel CPU core from an equivalent Core chip. The problem is, they aren’t nearly different enough to justify arguing that AMD had misused the word core, and the claims the plaintiffs make do not withstand technical analysis.

Did Bulldozer share an FPU? Certainly. So did the Sun UltraSPARC T1 (one FPU per chip) and T2 (one FPU per core, but shared by up to eight threads). The lawsuit claims that sharing L2 caches and FPUs means that AMD violated the commonly understood definition of “core,” yet Intel chips have shared L2 caches since the Core 2 Duo days. And therein lies the problem. We could certainly define a CPU core based on the underlying capabilities of the relevant components to act as a general purpose microprocessor without assistance — something Cell’s SPEs cannot do. This type of division would establish a more meaningful differentiation. Attempting to draw a line through a chip in the manner that this lawsuit does, however, is impossible. If Bulldozer’s cores aren’t cores, neither are the cores in other CPUs.

Someone posted that the way AVX-256 is processed by Bulldozer justifies this lawsuit. However, that reasoning calls into question whether any processor that lacks AVX-256 support has even one "actual core", which is clearly absurd.

Beyond how Bulldozer didn't measure up to Intel's design decisions in various ways, it exceeded Intel's performance in certain other ways — such as the number of in flight instructions the processor could handle. Does this mean Intel's processors didn't have true cores in them? After all, they didn't tell consumers that Bulldozer can handle more in flight instructions.

Even earlier processors didn't have FPUs at all. Some supported external FPU chips. Some didn't support even those. Some chips have L4 cache. Old CPUs had no cache at all. Is something that's on the die part of the CPU core, from the point of view of the consumer, like the L4 cache in Broadwell-C? If not, what is the consumer to make of it — that it doesn't exist simply because it's not part of the main chip on the die or part of what CPU architects consider a core? For something that doesn't exist, Broadwell-C's L4 did improve performance tangibly in workloads that are important to consumers — making the obsessing over what's inside cores even more suspect.

There is also the issue of in-order vs. out-of-order design. In-order, which is slower, was dropped back in 1995 with the Pentium Pro. Yet, Intel decided, many years later, to sell Atom to the masses, an in-order design. With the notion that consumers should consider it fraud when a company sells them slow cores — the Atom seems to be a great target for frivolous lawsuits. Not only was it a radical return to in-order design, it was paired with a power-inefficient supporting cast that cast very dramatic doubt on the entire point of Atom's marketing pitch: its performance-per-watt, a performance-per-watt level reached by subjecting consumers to the anemic performance of in-order processing, processing slowness not justified by the savings in power due to the horribly inefficient supporting chipset/GPU. To make matters worse in terms of consumer confusion, Atom was later changed to be out-of-order. There was a ton of pro-Atom netbook hype for quite some time. Then, a large swath of reviewers began writing as if the entire thing was the fault of silly consumers, even though so many of them hyped netbooks while it was trendy to do so.

#216

FordGT90Concept

"I go fast!1!11!1!"

How about we quote directly from the judge instead of a journalist's opinion?
regmedia.co.uk/2019/01/22/amd-core-class-action.pdf

Plaintiffs argue in their complaint that Defendant's Bulldozer products do not contain eight “cores” as claimed and advertised. Id. ¶ 8. According to Plaintiffs, a “core” is a processing unit that is able to operate (e.g., perform calculations and execute instructions) independent from other cores positioned on a chip. Id. ¶ 23–24. Plaintiffs allege that the Bulldozer CPUs, advertised as having eight cores, actually contain eight “sub-processors” which share resources, such as L2 memory caches and floating point units (“FPUs”). Id. ¶ 37–49. Plaintiffs allege that the sharing of resources in the Bulldozer CPUs results in bottlenecks during data processing, inhibiting the chips from “simultaneously multitask[ing].” Id. ¶¶ 38, 41. Plaintiffs allege that, because resources are shared between two “cores,” the Bulldozer CPUs functionally only have four cores. Id. ¶ 38–43. Therefore, Plaintiffs claim the products they purchased are inferior to the products as represented by the Defendant. Id. ¶ 39.

Dickey purchased an FX-9590 and Parmer purchased an FX-8350, both advertised as a "native 8-core desktop processor."

Both Named Plaintiffs allege they relied on Defendant’s advertisements as well as their “own understanding of the term ‘core’” in “believ[ing] that the . . . 8-Core Bulldozer processor would contain 8 cores, such that each ‘core’ would be independent from all others (i.e., it would not share resources with the other cores) and would be capable of performing independent calculations at full speed.” Id. ¶¶ 54, 62. Plaintiffs allege that Defendant’s representations regarding the number of cores on each Bulldozer chip were false. Id. ¶¶ 38–43.

AMD tried to throw the case out but the judge said:

Defendant’s challenges are not persuasive. The central question raised is whether a reasonable consumer would have been deceived by the term “core” as used in Defendant’s advertising.

How does the court answer that question?

“Whether an ordinary consumer reasonably believes [plaintiffs’ interpretation of the misleading statements] is amenable to common proof: reviewing the advertisements, labels, and then asking the jury how they understand the message.”

Defendant has, in essence, repurposed its argument that different class members may hold different understandings of the term “core” as a challenge to class-wide exposure. Because exposure to the alleged misleading statements is uniform across the class, and for the same reasons as discussed in Sections III(A)(ii) and III(B)(i)(a) above, these individualized issues of each class member’s understanding are not at issue when materiality is assessed on a class-wide basis under a reasonable consumer standard.

The definition of "core" must be held by "a reasonable consumer standard." The population ("class") cannot be divided up between experts and amateurs because that's not what false advertising is about.

At this point, the class action lawsuit is very confined in scale:

All individuals who purchased one or more of the following AMD computer chips either (1) while residing in California or (2) after visiting the AMD.com website: FX-8120, FX-8150, FX-8320, FX-8350, FX-8370, FX-9370, and FX-9590.

I think that's a mistake seeing how the alleged misrepresentation appears everywhere (retailers, on the retail packaging, on advertising material associated with machines containing the processors, etc.). They're not reaching for the stars like they could be.

"Plaintiffs allege that the Bulldozer CPUs, advertised as having eight cores, actually contain eight “sub-processors” which share resources..." this statement is absolutely true and where AMD is in trouble trying to redefine what a "core" is. No doubt AMD is going to explain to the jury that sharing L2 cache is not out of the ordinary across many architectures so the plaintiffs' case is kind of weak there but sharing FPUs is something extraordinary in the consumer space.

Keep in mind that the Plaintiffs aren't "tech experts." They'll bring in an expert to argue their case for the jury.

#217

RichF

FordGT90Concept said: How about we quote directly from the document instead of a journalist's opinion?

I'm not seeing where that judge's opinion rebuts the points raised by Hruska, such as the point about SMT. Perhaps you can directly rebut his rebuttal?

Hruska's article was published in response to the judge's opinion, so the order of things is to rebut Hruska's rebuttal rather than to go back in time to the point in time where the opinion was released and the rebuttal didn't exist.

#218

FordGT90Concept

"I go fast!1!11!1!"

RichF said: Someone posted that the way AVX-256 is processed by Bulldozer justifies this lawsuit. However, that reasoning calls into question whether any processor that lacks AVX-256 support has even one "actual core", which is clearly absurd.

A module can only process one AVX2 instruction at a time because it requires the full capabilities of the FPU to execute. If two threads are both queuing an AVX2 instruction, one thread has to halt while the other executes it. This is irregular for consumer CPU space. Bulldozer and Piledriver can't do AVX2.
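The contention described here can be put into a toy model. This is a deliberately crude sketch, not a simulation of any real Bulldozer-family FPU: it assumes every full-width instruction occupies the whole FPU for exactly one cycle, and it ignores pipelining, latency, and any narrower operations that could run side by side. The function names are hypothetical.

```python
def cycles_shared_fpu(wide_ops_a, wide_ops_b):
    """Two threads funnel full-width ops through ONE FPU, one op per cycle.

    Toy assumption: a full-width op blocks the other thread for that cycle.
    """
    return wide_ops_a + wide_ops_b


def cycles_private_fpus(wide_ops_a, wide_ops_b):
    """Each thread owns an FPU, so the two instruction streams overlap fully."""
    return max(wide_ops_a, wide_ops_b)


shared = cycles_shared_fpu(1000, 1000)
private = cycles_private_fpus(1000, 1000)
print(shared, private, shared / private)  # 2000 1000 2.0
```

Under these assumptions, two threads that both hammer full-width floating-point work take twice as long on a shared FPU as on private ones, while a thread pair where only one side uses the FPU loses nothing. Real workloads fall somewhere in between.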

RichF said: I'm not seeing where that judge's opinion rebuts the points raised by Hruska, such as the point about SMT. Perhaps you can directly rebut his rebuttal?

Hruska's article was published in response to the judge's opinion, so the order of things is to rebut Hruska's rebuttal rather than to go back in time to the point in time where the opinion was released and the rebuttal didn't exist.

The judge's only job is to align the claims with the law. The judge dismissed AMD's argument that AMD did not mislead the public.

Hruska was prattling on about technical jargon which is irrelevant to the case. There are really only two very basic questions being asked here:
1) Is the definition of a "core" an "independent processor?" [this is going to be a resounding "yes"]
2) Does Bulldozer sharing resources conflict with the definition of an independent processor? [this can go either way depending on the strength of the arguments presented by the lawyers and witnesses]

A jury of 12 will be answering those questions, not you or I, and their decision will define the word in California and likely beyond.

#219

lexluthermiester

Late to the party once again, but here's my 2 cents;
The Bulldozer CPU was a hybrid architecture. It was neither 8 true cores NOR 4 true cores. Because of its design it was somewhere in between. The performance bears that out. For the cost, the performance was a good value. People whining about whether or not they got 8 actual cores need some cheese. Technically, it did have 8 instruction-executing cores with an FPU for each pair of cores in a module. By that logic, which is based on how the CPU actually functions, AMD should win this.

qubit said: Hopefully this lawsuit will discourage AMD from using such a kludgy, low performance compromised design in the future.

For its time it performed very well for its price point. Your conclusion is flawed.

#220

londiste

RichF said: I'm not seeing where that judge's opinion rebuts the points raised by Hruska, such as the point about SMT. Perhaps you can directly rebut his rebuttal?
Hruska's article was published in response to the judge's opinion, so the order of things is to rebut Hruska's rebuttal rather than to go back in time to the point in time where the opinion was released and the rebuttal didn't exist.

Performance is not the primary concern in the court.

At the same time, look at how Bulldozer was improved over time. A separate decoder was added in Steamroller, and it was suspected (and partially shown) that once the decoder was removed as a limitation, fetch became one. So, in Fetch-Decode-Execute, Execute had more resources from the beginning, Decode had to be doubled afterwards, and to extract the possible performance Fetch would have had to be doubled as well. Had they gone through with that, the result would have been two independent cores.

FPU claims are fairly irrelevant. In the same way, so are L2 caches. Bringing these up in court is kind of stupid.
Memory controllers and caches are not an integral or required part of the core, no more than FPUs. In the consumer space, the Athlon 64 was the CPU that moved the memory controller into the CPU; it used to be in the northbridge. Same with cache: there were CPUs (even x86 ones) without cache, and at first it sat outside the CPU.

lexluthermiester said: Technically, it did have 8 instruction executing cores with an FPU unit for each pair cores in a module.

Those 8 are pipes, not cores.
Zen has six plus pretty much the same FPU, 10 pipes total in both units in execution stage. Skylake has 8 pipes in the execution unit.
With all this, we are talking about execution units.

Bulldozer: en.wikipedia.org/wiki/File:AMD_Bulldozer_block_diagram_(CPU_core_bloack).PNG
Zen: en.wikichip.org/wiki/amd/microarchitectures/zen#Individual_Core
Skylake: en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)#Individual_Core

Independently executing means a core (similarly to CPU) should be capable of executing the instruction set, not specific micro-operations. At least when we are talking about x86.

#221

FordGT90Concept

"I go fast!1!11!1!"

londiste said: Bulldozer: en.wikipedia.org/wiki/File:AMD_Bulldozer_block_diagram_(CPU_core_bloack).PNG

"Core Interface Unit" betrays AMD here too. Why didn't they call it "Module Interface Unit?" Could it be that AMD internally called "modules" "cores?" Zen slides and slides presented during the debut of Bulldozer certainly suggest that.

Context: www.tomshardware.com/reviews/processors-cpu-apu-features-upgrade,3569-15.html

Image (Core Interface Unit is abbreviated as "Core IF"):

There's only four and they're responsible for communication among:
-input via L1I
-output via L1D from each integer unit
-input/output via L2
-other modules

Three major components are shared across all Bulldozer iterations:
1) Fetcher (manages high level instructions)
2) Core Interface Unit (effectively a high level cache and communications controller)
3) Floating Point Unit (it's wider in an attempt to match Thuban FPU performance per thread but AVX2 will effectively shut down access to the FPU by one thread in Excavator)

AMD officially calls them "integer cores" judging by AMD slides. Pictures above call them integer clusters. Lawsuit calls them "subprocessors." One can't deny that AMD has done a poor job of messaging here.

#222

qubit

Overclocked quantum bit

lexluthermiester said: For its time it performed very well for its price point. Your conclusion is flawed.

My conclusion isn't flawed, yours is.

It was supposed to go up against Sandy Bridge, but AMD were then forced to reduce the price because performance was so rubbish.
On top of that, it's not really a true 8-core (hence my use of "dodgy" in my statement) and can't therefore be claimed as such, no matter how one spins it, hence this lawsuit. I hope the lawsuit wins and no one ever tries this again.

#223

EsaT

FordGT90Concept said: The definition of "core" must be held by "a reasonable consumer standard." The population ("class") cannot be divided up between experts and amateurs because that's not what false advertising is about.

Bulldozer's design certainly had more than its share of flaws, starting with crappy IPC (just like the Pentium 4).
But if we are talking about false advertising, where's the lawsuit about Intel's CPU generation advertising?

After all, "7th" gen was nothing but a carbon copy of 6th gen with just clock speed tweaks, and really had no business being anything other than new xx50-designation CPU models.
Even "9th" gen is more of the same old Skylake with only some bug fixes. Though at least the extra cores would justify calling it a seventh gen.
Not to forget the artificial CPU socket roulette to force people to buy new motherboards:
www.techpowerup.com/250109/core-i9-9900k-achieves-5-50-ghz-overclock-on-a-z170-chipset-motherboard

And then there are those compilers provided by Intel two decades ago that claimed compatibility with AMD as well...
while they actually disabled multimedia extensions supported by the CPU if the program was run on an AMD CPU, to give Intel CPUs an artificial advantage.

qubit said: On top of that, it's not really a true 8-core (hence my use of "dodgy" in my statement) and can't therefore be claimed as such, no matter how one spins it, hence this lawsuit. I hope the lawsuit wins and no one ever tries this again.

OK, so when does Intel get judged for its advertising and sleazy tactics?
Or are we going to be picky about who gets penalized and who is given a get-out-of-jail-free card?
Intel practised literal extortion 15 years ago.

Intel isn't any white knight on white horse, not even grey knight...
jolt.law.harvard.edu/digest/intel-and-the-x86-architecture-a-legal-perspective

#224

Aquinus

Resident Wat-man

You know, all of these block diagrams are cute and everything, but the fact of the matter is that 99% of consumers don't care about the internal parts of the CPU. You don't market block diagrams, you market simple information. Mind you, this entire argument is predicated on the idea that the FPU is essential to the operation of a CPU... it is not. Without the FPU story, none of these arguments make any sense at all. At this point, most of the arguments being made are pedantic in nature, "oh my god, look at this one individual block, it's shared so it can't possibly be a 'real core'™". Just because there are a handful of parts that are shared, like how the module interacts with the rest of the CPU, doesn't mean it's not a core. If Intel had to go the MCM route, a lot of parts of each chiplet would be shared as well. This is a natural result of pulling things apart and clustering them together and looks far more like a multi-CPU design than a modern SMT implementation. That doesn't make the cores any less of a "real core".

Performance was bad for a couple of reasons, but a very big one was that the integer cores were gimped compared to previous generations, and these block diagrams everyone is showing describe that: fewer ALUs and AGUs mean fewer uOPs per clock, fewer uOPs per clock result in lower IPC numbers, and as a result, poor performance per core. You know what it doesn't result in? Worse per-core scaling, and I think @cdawall already did an excellent job of illustrating that.

cdawall said: The 9700K is an 8 core 8 thread CPU. It scores 214 points for the single threaded CB test, it scores 1513 points for the multithreaded test. That is a 7.07x speed up. In 2012 AMD was able to pull off a 6.7x speed up in that same benchmark and you are going to sit there and tell me it only had 4 cores?

If this doesn't inoculate us from this misconception, considering it's comparing apples to apples, then I don't know what will.
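The arithmetic behind that comparison is easy to check. The sketch below recomputes the speed-up from the two quoted 9700K scores and then expresses both speed-ups as a fraction of ideal scaling. The helper names are hypothetical, and the 6.7x figure is taken from the quote as given, since the underlying FX scores are not stated there.

```python
def speedup(multi_score, single_score):
    """Multi-threaded score divided by single-threaded score."""
    return multi_score / single_score


def scaling_efficiency(speedup_value, cores):
    """Fraction of ideal (linear) scaling achieved across `cores` cores."""
    return speedup_value / cores


intel = speedup(1513, 214)                      # 9700K scores quoted above
print(round(intel, 2))                          # 7.07
print(round(scaling_efficiency(intel, 8), 2))   # 0.88 of ideal 8-core scaling
print(round(scaling_efficiency(6.7, 8), 2))     # 0.84 for the quoted FX speed-up
print(round(scaling_efficiency(6.7, 4), 3))     # 1.675 if counted as 4 cores
```

Read as an 8-core part, the quoted FX figure scales only slightly worse than the 9700K. Read as a 4-core part, it would be scaling at well over 100% of ideal, which is the point the quote is making.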

I'm sorry people, but an 8 core CPU doesn't have a requirement for those cores to not be crap. 8 crappy cores are still 8 cores and a core is still a core without the FPU. People are grasping at any straws to find substantial arguments at this point. The reality is that if you try to do that in court, they'll see what you're doing, because it means that your argument doesn't have a very strong foundation because you've changed it so many times to support a particular narrative.

#225

londiste

Aquinus said: Mind you, this entire argument is predicated on the idea that the FPU is essential to the operation of a CPU... it is not. Without the FPU story, none of these arguments make any sense at all. At this point, most of the arguments being made are pedantic in nature, "oh my god, look at this one individual block, it's shared so it can't possibly be a 'real core'™". Just because there are a handful of parts that are shared like how the module interacts with the rest of the CPU doesn't mean it's not a core.

- The argument is not predicated on the idea of FPU. Or Cache or Memory Controller. Or not even performance really although the stupid court case probably will have claims on that.
- The part that is NOT shared is one individual block. Everything else is shared.
- Core is a CPU by definition.
