AMD Dragged to Court over Core Count on "Bulldozer"

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.50/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
@Aquinus: again, performance is peripheral. AMD says right on the box "8-core" but everything under the hood says otherwise from Microsoft calling it "Cores: 4, Logical Processors 8" to Phenom II X6 beating it in the multithreaded tests where Bulldozer is supposed to excel, to the decoders being shared, to the FPU clearly using SMT, to the die shot looking a whole lot more like a monolithic core than a dual core from any other architecture (excepting Piledriver and Steamroller, of course). It should have been marketed as a "4 core" or maybe a "4+ core" to indicate it isn't traditional, not an "8 core." Had AMD called it what it really is, this lawsuit never would have happened. AMD is going to lose because it is patently obvious they stretched the meaning of "core" beyond the breaking limit. The only way the plaintiff loses is if he does a terrible job.


I think we can all agree there isn't much more to be said on this topic until there is a verdict. I'll take my leave until then.
 
Joined
May 18, 2010
Messages
3,427 (0.65/day)
System Name My baby
Processor Athlon II X4 620 @ 3.5GHz, 1.45v, NB @ 2700Mhz, HT @ 2700Mhz - 24hr prime95 stable
Motherboard Asus M4A785TD-V EVO
Cooling Sonic Tower Rev 2 with 120mm Akasa attached, Akasa @ Front, Xilence Red Wing 120mm @ Rear
Memory 8 GB G.Skills 1600Mhz
Video Card(s) ATI ASUS Crossfire 5850
Storage Crucial MX100 SATA 2.5 SSD
Display(s) Lenovo ThinkVision 27" (LEN P27h-10)
Case Antec VSK 2000 Black Tower Case
Audio Device(s) Onkyo TX-SR309 Receiver, 2x Kef Cresta 1, 1x Kef Center 20c
Power Supply OCZ StealthXstream II 600w, 4x12v/18A, 80% efficiency.
Software Windows 10 Professional 64-bit
@Aquinus: again, performance is peripheral. AMD says right on the box "8-core" but everything under the hood says otherwise from Microsoft calling it "Cores: 4, Logical Processors 8" to Phenom II X6 beating it in the multithreaded tests where Bulldozer is supposed to excel, to the decoders being shared, to the FPU clearly using SMT, to the die shot looking a whole lot more like a monolithic core than a dual core from any other architecture (excepting Piledriver and Steamroller, of course). It should have been marketed as a "4 core" or maybe a "4+ core" to indicate it isn't traditional, not an "8 core." Had AMD called it what it really is, this lawsuit never would have happened. AMD is going to lose because it is patently obvious they stretched the meaning of "core" beyond the breaking limit. The only way the plaintiff loses is if he does a terrible job.


I think we can all agree there isn't much more to be said on this topic until there is a verdict. I'll take my leave until then.

When you're a multinational, lawsuits happen every day, over everything from intellectual property disputes and workplace hazards to pension breaches and sexual harassment. This is part of doing business.

If AMD had called it a 4-core with physical hyper-threading, this lawsuit would have been avoided, but somebody could still turn around and sue, saying AMD has an "unfair" performance monopoly by deliberately under-spec'ing their processors to outperform the competition. Not saying they would win, but a lawsuit could still be filed.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
41,471 (6.58/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
Then on multi-threaded score it only starts to catch up when the Intel CPU starts relying on hyper threading which goes back to "if they're not real cores, why do they scale like they are?"

Yeah, an 8350 in multithreaded only leads the 6700 by 4% when the CPU is OC'd to 5.0 GHz (104% vs. 100%). I'd figure Hyper-Threading is like how DDR operates.
 
Joined
Mar 24, 2011
Messages
2,356 (0.48/day)
Location
VT
Processor Intel i7-10700k
Motherboard Gigabyte Aurorus Ultra z490
Cooling Corsair H100i RGB
Memory 32GB (4x8GB) Corsair Vengeance DDR4-3200MHz
Video Card(s) MSI Gaming Trio X 3070 LHR
Display(s) ASUS MG278Q / AOC G2590FX
Case Corsair X4000 iCue
Audio Device(s) Onboard
Power Supply Corsair RM650x 650W Fully Modular
Software Windows 10
I would say if they can drive home the inability to disable 1 "core" and at minimum disabling 1 "module" then they will be able to win. AMD definitely moved the goal posts on what defines a "core". I remember the "Real Men Use Real Cores" campaign they ran against Intel in the Athlon X2 days, and going by their own definition Bulldozer didn't have 8 "cores".
 
Joined
Sep 6, 2013
Messages
748 (0.18/day)
Location
Oceania
A lot of every processor in existence takes more than one clock to complete a task. While the FPU is crunching on something, SMT allows another thread to be processed through the ALU which takes far fewer clocks. Another example is a thread having to wait because of a cache miss, the other thread keeps executing. Like Bulldozer, after instructions are decoded, a lot of the processor is out-of-order and that is where SMT occurs. The only thing different about Bulldozer is that there are two ALUs instead of one. The rest of the processor mimics SMT. I would never call a core with two ALUs a dual core.


12 "core" Xeon can send 6 commands per clock cycle.

There you go, corrected...;)
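
The stall-filling behaviour described in that quote can be sketched with a toy model. This is only an illustration under heavy assumptions, not how any real core is built: there is one issue slot per cycle, each thread is a chain of dependent instructions, and the latencies (4 cycles for an "FPU-like" op, 1 cycle for an "ALU-like" op) are made up for the example.

```python
def run_back_to_back(threads):
    """Cycles needed when each thread runs alone, one after the other."""
    return sum(sum(latencies) for latencies in threads)


def run_interleaved(threads):
    """Cycles needed when a stalled thread's issue slot can go to another thread.

    Each thread is a list of instruction latencies. A thread cannot issue
    its next instruction until the previous one has completed.
    """
    ready_at = [0] * len(threads)   # cycle at which each thread may issue again
    next_idx = [0] * len(threads)   # index of each thread's next instruction
    cycle = 0
    finish = 0
    while any(next_idx[i] < len(t) for i, t in enumerate(threads)):
        for i, t in enumerate(threads):
            if next_idx[i] < len(t) and ready_at[i] <= cycle:
                ready_at[i] = cycle + t[next_idx[i]]
                finish = max(finish, ready_at[i])
                next_idx[i] += 1
                break  # only one instruction issues per cycle
        cycle += 1
    return finish


# Hypothetical workload: one thread of long ops, one thread of short ops.
fpu_heavy = [4, 4, 4]
alu_heavy = [1, 1, 1, 1, 1, 1]

print(run_back_to_back([fpu_heavy, alu_heavy]))  # 18
print(run_interleaved([fpu_heavy, alu_heavy]))   # 12
```

With these made-up numbers, the short ops fill the cycles in which the long ones are still in flight, which is the effect the quote attributes to SMT.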



Btw on a different note here's a couple of benchmarks if anyone is curious....

Check out the latency on the bottom screenshot....

[Benchmark screenshots, not preserved: Phenom II @4.0 (single-threaded); Phenom II @4.4 (single-threaded); Vishera @3.5, stock (single-threaded); Vishera @5.0 (single-threaded); Vishera @4.7 (multithreaded)]

(No bandwidth results, sorry; the PC kept locking up with the full test.)
 

lotsofstupid

New Member
Joined
Jan 1, 2016
Messages
1 (0.00/day)
Yes, so an Intel SX CPU wasn't really a chip, so we should sue Intel for lying decades ago. The SX had no math coprocessor and you had to buy it as an add-on chip; the Intel DX had the chip built in. Same thing here. make -j4 has no issue working on Linux and is twice as fast as make -j2.
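
For anyone wanting to check a scaling claim like that on their own machine, here is a minimal sketch. The timings in it are hypothetical, picked only to match "twice as fast", and `time_command` and `scaling_efficiency` are names invented for this example.

```python
import subprocess
import time


def time_command(cmd):
    """Return wall-clock seconds taken by cmd (a list of arguments)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def scaling_efficiency(t_few, jobs_few, t_many, jobs_many):
    """Fraction of the ideal speedup achieved when raising the job count."""
    speedup = t_few / t_many
    ideal = jobs_many / jobs_few
    return speedup / ideal


# Hypothetical timings in seconds, not measured on any real build.
print(scaling_efficiency(120.0, 2, 60.0, 4))  # 1.0
```

A real measurement would be something like `time_command(["make", "-j2"])` followed by `time_command(["make", "-j4"])`, with the build tree cleaned between runs. A result of 1.0 means the extra jobs scaled perfectly; anything lower means they did not.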
 

qubit

Overclocked quantum bit
Joined
Dec 6, 2007
Messages
17,865 (2.91/day)
Location
Quantum Well UK
System Name Quantumville™
Processor Intel Core i7-2700K @ 4GHz
Motherboard Asus P8Z68-V PRO/GEN3
Cooling Noctua NH-D14
Memory 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz)
Video Card(s) MSI RTX 2080 SUPER Gaming X Trio
Storage Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB
Display(s) ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible)
Case Cooler Master HAF 922
Audio Device(s) Creative Sound Blaster X-Fi Fatal1ty PCIe
Power Supply Corsair AX1600i
Mouse Microsoft Intellimouse Pro - Black Shadow
Keyboard Yes
Software Windows 10 Pro 64-bit
Yes, so an Intel SX CPU wasn't really a chip, so we should sue Intel for lying decades ago. The SX had no math coprocessor and you had to buy it as an add-on chip; the Intel DX had the chip built in. Same thing here. make -j4 has no issue working on Linux and is twice as fast as make -j2.
Whut?! That makes no sense.

This isn't at all like with AMD. When Intel disabled the FPU to make a 486SX it didn't then market the chip as FPU capable, hence there's nothing to sue them over. In fact they actually marketed an FPU coprocessor to go with it to restore the missing function.

However, Bulldozer doesn't have 8 discrete cores, but rather 4 siamesed ones that share resources and have lower performance as a result. Completely different scenario so no wonder they're getting sued.
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,473 (4.13/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
Whut?! That makes no sense.

This isn't at all like with AMD. When Intel disabled the FPU to make a 486SX it didn't then market the chip as FPU capable, hence there's nothing to sue them over. In fact they actually marketed an FPU coprocessor to go with it to restore the missing function.

His point is that an FPU isn't required to be considered a full core. If it was then CPUs that didn't have FPUs would be considered 0-Core processors. You wouldn't call a 486SX a 0-Core processor, would you?

AFAIK, the 486 line didn't offer a separate FPU; there was no way to restore the lost function of the 486SX. It was the 386 line that had a separate FPU available. They made something called an i487SX, but it was really a full-blown 486DX: when installed, it would disable the 486SX completely and take over all CPU operations.

However, Bulldozer doesn't have 8 discrete cores, but rather 4 siamesed ones that share resources and have lower performance as a result. Completely different scenario so no wonder they're getting sued.

If sharing resources results in one core, then all of Intel's current desktop processors are single-core processors...

Microsoft calling it "Cores: 4, Logical Processors 8"

What a different company calls it doesn't really matter here. Remember when Microsoft used to call single core Pentium 4 processors with hyperthreading "two processors"? I do. So what Microsoft says doesn't really matter.

Also, there is other software, ones that are far more geared towards dealing with processor specs, that say they are 8-cores. CPU-Z says 8-Cores and 8-Threads. Microsoft's software can't even read the clock speed properly half the time, so I'd say we should listen to CPU-Z over what Windows says.
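
To make the "cores vs. logical processors" distinction concrete, here is a rough sketch of how a tool could derive both numbers on Linux. The kernel exposes a `topology/thread_siblings_list` file for each CPU under `/sys/devices/system/cpu/`; the sample data below is hypothetical and not read from a real machine, and nothing here assumes how any particular kernel reports a Bulldozer module.

```python
def parse_cpu_list(text):
    """Parse a Linux-style CPU list such as '0-1' or '0,4' into a frozenset."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return frozenset(cpus)


def count_cores(siblings_by_cpu):
    """Return (cores, logical_cpus) given each logical CPU's sibling list."""
    groups = {parse_cpu_list(s) for s in siblings_by_cpu.values()}
    return len(groups), len(siblings_by_cpu)


# Hypothetical sample data: eight logical CPUs reported two different ways.
paired = {0: "0-1", 1: "0-1", 2: "2-3", 3: "2-3",
          4: "4-5", 5: "4-5", 6: "6-7", 7: "6-7"}
separate = {n: str(n) for n in range(8)}

print(count_cores(paired))    # (4, 8)
print(count_cores(separate))  # (8, 8)
```

The same eight schedulable units come out as 4 cores or 8 cores depending purely on whether the platform reports them as siblings, which is the labeling question being argued here.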
 

FordGT90Concept

"I go fast!1!11!1!"
Do realize that when FPUs were co-processors, the concept of a "core" didn't exist. Skip forward a few years and co-processors were relegated to history. Skip forward about a decade and you have AMD producing two processors on one die. Intel quickly follows with multi-chip modules on a die to get two processors. Skip forward about half a decade from there and you finally reach AMD sharing the FPU with two ALUs. You have to dredge up 20 years of history to reach that conclusion. In computing, that's how many lifetimes? 10? It doesn't work that way. AMD is trying to redefine the definition of "core" to mislead consumers into thinking they're getting more than they got. This lawsuit is completely justified and should have happened years ago.

...but we've already been over all that, haven't we?
 

newtekie1

Semi-Retired Folder
Do realize that when FPUs were co-processors, the concept of a "core" didn't exist. Skip forward a few years and co-processors were relegated to history. Skip forward about a decade and you have AMD producing two processors on one die. Intel quickly follows with multi-chip modules on a die to get two processors. Skip forward about half a decade from there and you finally reach AMD sharing the FPU with two ALUs. You have to dredge up 20 years of history to reach that conclusion. In computing, that's how many lifetimes? 10? It doesn't work that way. AMD is trying to redefine the definition of "core" to mislead consumers into thinking they're getting more than they got. This lawsuit is completely justified and should have happened years ago.


...but we've already been over all that, haven't we?

Except Intel redefined what a Core was just the same. Sharing resources isn't justification for calling something one core. When Intel decided that 2 of their cores would share a single L2 cache, we all didn't say the cores weren't two cores. The story of the L2 cache is very similar to the FPU. It was something that started as being completely separate from the processor(just like the FPU), something that was eventually integrated into the processor(just like the FPU), and just like the FPU, each core had its own L2 before Intel decided to have 2 cores share a single L2 cache.
 

FordGT90Concept

"I go fast!1!11!1!"
What a different company calls it doesn't really matter here. Remember when Microsoft used to call single core Pentium 4 processors with hyperthreading "two processors"? I do. So what Microsoft says doesn't really matter.
That was Windows XP, and XP only has two states: uniprocessor (one thread at a time) and multiprocessor (two or more threads at a time). Multiprocessor could mean two physical sockets with one core each, one socket with two cores, or one physical + one logical processor. Windows was later updated to better handle the three variations.

Bulldozer did the same thing with Vista. Vista (I believe 7 too) called it eight cores because it was incapable of distinguishing them, but that apparently caused problems because updates were released to fix core parking issues. Come Windows 8 and newer, Microsoft updated the operating system to definitively account for sockets, cores, and logical processors, which is where we see 4 cores and 8 logical processors.

Also, there is other software, ones that are far more geared towards dealing with processor specs, that say they are 8-cores. CPU-Z says 8-Cores and 8-Threads. Microsoft's software can't even read the clock speed properly half the time, so I'd say we should listen to CPU-Z over what Windows says.
CPU-Z doesn't need to schedule threads. Windows does. Microsoft did what they did deliberately so the scheduler best utilizes the processor resources.

Except Intel redefined what a Core was just the same. Sharing resources isn't justification for calling something one core. When Intel decided that 2 of their cores would share a single L2 cache, we all didn't say the cores weren't two cores. The story of the L2 cache is very similar to the FPU. It was something that started as being completely separate from the processor(just like the FPU), something that was eventually integrated into the processor(just like the FPU), and just like the FPU, each core had its own L2 before Intel decided to have 2 cores share a single L2 cache.
Caches have always been tiered. The closer the tier is to the ALUs and FPUs, the faster it is. Caches completely lack logic and there are numerous advantages, and virtually no disadvantages, to sharing caches (the scheduler will allot the cache evenly when the load is even).

There's only a handful of FPUs shared in the computing world outside of Bulldozer (and derivatives), and all of them are set up in a way that resembles a co-processor. That is, it has its own scheduler and all of the cores can queue work to it--effectively its own core. They don't market it as having an extra core though, because that would be misleading.
 

newtekie1

Semi-Retired Folder
That was Windows XP, and XP only has two states: uniprocessor (one thread at a time) and multiprocessor (two or more threads at a time). Multiprocessor could mean two physical sockets with one core each, one socket with two cores, or one physical + one logical processor. Windows was later updated to better handle the three variations.

Bulldozer did the same thing with Vista. Vista (I believe 7 too) called it eight cores because it was incapable of distinguishing them, but that apparently caused problems because updates were released to fix core parking issues. Come Windows 8 and newer, Microsoft updated the operating system to definitively account for sockets, cores, and logical processors, which is where we see 4 cores and 8 logical processors.

How Microsoft labels it in their OS to make their OS work better doesn't matter. The purpose of their software isn't to give CPU specs, and like I said, even in their current OSes they can't even get the CPU clock speeds right a lot of the time.

CPU-Z doesn't need to schedule threads. Windows does. Microsoft did what they did deliberately so the scheduler best utilizes the processor resources.

Yes, but what CPU-Z does do is give CPU specs. In fact that is all it does, and the program designed to tell you what specs your CPU has says 8 cores. Microsoft's software is one of the few that actually says 4 cores; almost everything else that gives CPU specs says 8 cores. Linux says 8 cores too (well, CPUs actually).

Caches have always been tiered. The closer the tier is to the ALUs and FPUs, the faster it is. Caches completely lack logic and there are numerous advantages, and virtually no disadvantages, to sharing caches.

The disadvantage to sharing cache is that it is slower than not sharing cache. If each core had 4MB of cache it would be faster than sharing 4MB of cache.

Either way, L2 cache's story is very similar to the FPU. It was separate, it then was integrated, it then was shared between two cores. That doesn't make the two cores count as one.

If we are going to let Intel get away with sharing resources and still calling them separate cores, then we have to allow AMD.
 

FordGT90Concept

"I go fast!1!11!1!"
How Microsoft labels it in their OS to make their OS work better doesn't matter. The purpose of their software isn't to give CPU specs, and like I said, even in their current OSes they can't even get the CPU clock speeds right a lot of the time.
The purpose of their software is to describe what is present.

Microsoft doesn't deal with clock speeds, they deal with processor states. The clock speed data they do provide is only as a convenience. That said, it appears accurate to me in Windows 10. It's pretty obvious Microsoft put a lot of effort into understanding the processor in more recent versions of Windows (probably because of their work with ARM).

The disadvantage to sharing cache is that it is slower than not sharing cache. If each core had 4MB of cache it would be faster than sharing 4MB of cache.
Yes, but would also cost a lot more as well as consuming more power and producing more heat.

Either way, L2 cache's story is very similar to the FPU. It was separate, it then was integrated, it then was shared between two cores. That doesn't make the two cores count as one.
Let's use the test of disabling cores. Where L2 is shared, can you disable half of the cores above it and still have the processor function perfectly normally? With L2 (and L3, and L4, and so on), the definitive answer is "yes." Does Bulldozer pass the same test? The definitive answer is "no." The former consists of legitimate cores while the latter does not.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,159 (2.84/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
Sharing resources isn't justification for calling something one core.
I think this is where it all ends. When push comes to shove, FPU operations can be done on an integer core at the expense of extra cycles. There are many older ARM cores that do this that lack a real FPU, it's just not commonly done anymore because of performance.
That is, it has its own scheduler and all of the cores can queue work to it--effectively its own core.
Sorry to burst your bubble, but Bulldozer does have a scheduler for each FPU per module as well as a scheduler for every integer core. The only components they share, excluding cache, are the fetch and decode units, something that any core will require. On the original BD chip, that was actually a significant problem because it became a bottleneck in the CPU, which is why later, in Steamroller, an extra decode unit was added to each module, which leaves what, a single shared fetch unit?

People shouldn't be bitching about if they're "real" cores or not. They should be questioning why the integer cores suck in the first place. It's not because of shared components, it's because each core is actually gimped. I posted this earlier but maybe people have short memories. Explain to me why BD can only process practically half as many instructions per clock versus Haswell? That alone will contribute to cruddy performance, you don't even have to look further than the integer cores to figure out that one.

People are blaming one thing, when they should be blaming another. Most operations in a CPU are going to be integer operations. While floating point math is used often, it's not used as often as the integer ALU in most circumstances which is why AMD shared it in the first place. What AMD screwed up is gimping the integer cores.

For those with a short memory or the inability to go back a page or two:
Bullshit. There are a lot of instructions that not only execute in 1 clock cycle; the CPU can sometimes do several of the same instruction at once.

Before I grab part of this document, I will quote it:

View attachment 69108
Source: https://gmplib.org/~tege/x86-timing.pdf
Let's look at Sandy Bridge for a minute:
add, sub, and, or, xor, inc, dec, neg, and not all execute in a single clock cycle, and each core can process 3 of these uOps at once. Haswell expanded that to 4 uOps per cycle from 3 on SB. Even AMD's K10 was the same way, but then you look at AMD's BD1 (which is what we're all huffy about) and you notice that these same instructions can only do 2 uOps per clock cycle on Bulldozer. Then there are cases like double shift left and right, which has a fraction of the performance on BD versus modern Intel CPUs.

People need to get their information right. Bulldozer is slow because dedicated components are skimped on. Instructions usually take the same number of cycles as on their Intel counterparts, but in many cases they have much less throughput, so uOps end up queuing over more cycles than they otherwise would, which increases latency, and certain full instructions translate into a longer set of uOps on this CPU. So you might have an instruction with uOps that an Intel CPU could execute in one clock cycle but the AMD CPU might need two because it doesn't have enough resources in a single core to do it all at once.

For what it's worth, Intel cores might not execute instructions "faster"; it's that they can do more of them in a single clock cycle. Both AMD and Intel have a lot of core x86 instructions that not only complete in one cycle but can execute multiple of the same uOps in the same cycle, which is where pipelining comes into play for instructions that allow pipelining.

It's also worth noting that there are x86 instructions that are not pipelined for various reasons. That's in this other document:
http://www.intel.com/content/www/us...-ia-32-architectures-optimization-manual.html

Simply put, Bulldozer didn't suck because of a shared FPU, it sucks because they gimped the integer cores worse than on K10 (per clock).
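
The per-clock argument can be put into rough numbers. This is a back-of-the-envelope sketch using only the simple-ALU uOps-per-clock figures quoted in this post; it assumes fully independent single-cycle uOps and ignores decode, dependencies, and everything else a real pipeline has to deal with.

```python
import math

# Simple ALU uOps per clock per core, as quoted in the post above.
ALU_UOPS_PER_CLOCK = {
    "K10": 3,
    "Sandy Bridge": 3,
    "Haswell": 4,
    "Bulldozer": 2,
}


def cycles_needed(independent_uops, width):
    """Minimum cycles to issue independent single-cycle uOps at a given width."""
    return math.ceil(independent_uops / width)


# Twelve independent adds, as an arbitrary example workload.
for name, width in ALU_UOPS_PER_CLOCK.items():
    print(name, cycles_needed(12, width))
```

For twelve independent adds that works out to 4 cycles at a width of 3, 3 cycles at a width of 4, and 6 cycles at a width of 2, before clock speed enters the picture at all.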
 

de.das.dude

Pro Indian Modder
Joined
Jun 13, 2010
Messages
8,912 (1.71/day)
Location
Internet is borked, please help.
System Name Monke | Work Thinkpad| Old Monke
Processor Ryzen 5600X | Ryzen 5500U | FX8320
Motherboard ASRock B550 Extreme4 | ? | Asrock 990FX Extreme 4
Cooling 240mm Rad | Not needed | hyper 212 EVO
Memory 2x16GB DDR4 3600 Corsair RGB | 16 GB DDR4 3600 | 16GB DDR3 1600
Video Card(s) Sapphire Pulse RX6700XT 12GB | Vega 8 | Sapphire Pulse RX580 8GB
Storage Samsung 980 nvme (Primary) | some samsung SSD
Display(s) Dell 2723DS | Some 14" 1080p 98%sRGB IPS | Dell 2240L
Case Ant Esports Tempered case | Thinkpad | Antec
Audio Device(s) Logitech Z333 | Jabra corpo stuff
Power Supply Corsair RM750e | not needed | Corsair GS 600
Mouse Logitech G400 | nipple
Keyboard Logitech G213 | stock kb is awesome | Logitech K230
VR HMD ;_;
Software Windows 10 Professional x3
Benchmark Scores There are no marks on my bench
This is just stupid :/
I have an 8-core processor and it's definitely better at multi-threaded stuff than others. Plus it cost a fraction of what Intel has to offer.
 

qubit

Overclocked quantum bit
Sorry NT, I'm gonna write my responses in your quote in a different colour, because it's a bit easier that way than all the fiddly cutting and pasting of tags. :)

His point is that an FPU isn't required to be considered a full core. If it was then CPUs that didn't have FPUs would be considered 0-Core processors. You wouldn't call a 486SX a 0-Core processor, would you?

AFAIK, the 486 line didn't offer a separate FPU; there was no way to restore the lost function of the 486SX. It was the 386 line that had a separate FPU available. They made something called an i487SX, but it was really a full-blown 486DX: when installed, it would disable the 486SX completely and take over all CPU operations.

Maybe he meant that, but he really could have put some more effort into making that post more coherent, and I didn't say that an FPU is required for a full core, either. I've got lots of old computers here without FPUs (6502 & ARM2/3 based) which are most definitely full CPUs. He does defend AMD though by saying that Intel should also be sued because "an Intel SX CPU wasn't really a chip, so we should sue Intel for lying decades ago", which is completely wrong, and that's what I set him straight on. No one is arguing that the removal/disabling of the FPU is a reason to sue a company. It only applies if the company inflates the capabilities of the crippled chip, which Intel didn't do, but AMD did.

Your AFAIK bit is about right and brings back old memories of the 80s computing scene and Byte magazine.
:)

If sharing resources results in one core, then all of Intel's current desktop processors are single-core processors...

No, come on, that's not the kind of sharing I'm talking about and everyone understands. Intel will have common caches between the cores, but otherwise the CPU cores are all separate, so a quad core CPU will literally have the processing core duplicated 4 times, along with common control logic and buses to make them all work together - it even shows up clearly in a die photo. Here's one of a Sandy Bridge die:




AMD made parts of the core shared instead in a siamesed way, closer to a form of one core + hyperthreading than two actual cores and called them modules, yet still marketed them all as separate cores. Hence the claims of 8 core processors when they weren't really.
 

newtekie1

Semi-Retired Folder
The purpose of their software is to describe what is present.

Microsoft doesn't deal with clock speeds, they deal with processor states. The clock speed data they do provide is only as a convenience. That said, it appears accurate to me in Windows 10. It's pretty obvious Microsoft put a lot of effort into understanding the processor in more recent versions of Windows (probably because of their work with ARM).

Yet they list the processor speed (oftentimes wrong) right next to where they list the cores and logical threads; they don't list processor states. In fact, by your same logic, they aren't actually dealing with cores and threads either; they are dealing with the schedulers on the processor.

Yes, but would also cost a lot more as well as consuming more power and producing more heat.

Just like including more FPUs.;)

Let's use the test of disabling cores. Where L2 is shared, can you disable half of the cores above it and still have the processor function perfectly normally? With L2 (and L3, and L4, and so on), the definitive answer is "yes." Does Bulldozer pass the same test? The definitive answer is "no." The former consists of legitimate cores while the latter does not.

I don't think there is really anything technically stopping an integer core on a Bulldozer CPU from being disabled. Each integer core has its own scheduler, so technically you just have to disable that scheduler and the integer core will be effectively disabled as well.

Simply put, Bulldozer didn't suck because of a shared FPU, it sucks because they gimped the integer cores worse than on K10 (per clock).

Exactly. Their integer cores were weak on Bulldozer. On top of this, the integer cores can in fact do all the work that the FPU does (just much slower). The fact is the CPU could have no FPUs at all and still technically function. It would just be slow as hell whenever work would have been accelerated by the FPU.

No one is arguing that the removal/disabling of the FPU is a reason to sue a company.

Actually, I'm pretty sure that is exactly what most are arguing and what this entire lawsuit is based on. The fact that the CPU only has 4 FPUs is the basis to claiming it is a 4 core CPU not an 8.

It only applies if the company inflates the capabilities of the crippled chip, which Intel didn't do, but AMD did.

The entire claim is that AMD called the processor an 8-core. They didn't inflate the capabilities of the chip. It is capable of doing 8 things at the exact same time, something a 4-core CPU can never do (even with Hyper-Threading), so it is an 8-core CPU.

AMD made parts of the core shared instead in a siamesed way, closer to a form of one core + hyperthreading than two actual cores and called them modules, yet still marketed them all as separate cores. Hence the claims of 8 core processors when they weren't really.

Yeah, but that is the slope we allowed them to slide down. Intel started sharing resources between cores, so AMD started doing it. Each did it to varying degrees. Intel did it with L2 and L3; AMD took it a step further and did it with FPUs. Did they go too far? That is up to the court. But since an x86 CPU can technically function without any FPU at all, calling a processing core that doesn't have its own FPU an x86 core is legitimate, IMO. That is left up to the court at this point, so we'll just have to wait and see.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,159 (2.84/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
The entire claim is that AMD called the processor an 8-core. They didn't inflate the capabilities of the chip. It is capable of doing 8 things at the exact same time, something a 4-core CPU can never do (even with Hyper-Threading), so it is an 8-core CPU.
Not just that, but I'm willing to bet that, given purely parallel workloads, the CPU will scale almost linearly with the number of threads being utilized. Even when doing floating point calculations, there are a lot of integer calculations that come before and after them to handle things like memory addressing and stepping through sets of data. There really is no such thing as a "pure floating point workload," because there are control structures and memory operations that require integers and utilize the ALU.
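
To put a number on "scales almost linearly," here is a minimal sketch in Python. The function name `scaling_efficiency` is made up for illustration, and the scores in the usage note are placeholders, not measurements from any real CPU. It assumes a benchmark where a higher score is better and where you have a score for a 1-thread run.

```python
def scaling_efficiency(scores):
    """Return per-thread scaling efficiency for each thread count.

    scores: dict mapping thread count -> benchmark score (higher is better).
    An efficiency of 1.0 means perfectly linear scaling relative to the
    1-thread run; lower values mean each added thread contributes less.
    """
    if 1 not in scores:
        raise ValueError("need a 1-thread baseline score")
    base = scores[1]
    result = {}
    for threads, score in sorted(scores.items()):
        speedup = score / base              # how many times faster than 1 thread
        result[threads] = speedup / threads  # fraction of the ideal speedup
    return result
```

For example, `scaling_efficiency({1: 100, 2: 200, 4: 360})` reports 1.0, 1.0 and 0.9 (placeholder inputs). If shared module resources were a major bottleneck, you would expect efficiency to drop noticeably once both cores in a module are loaded.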

All in all, I think we can say that Bulldozer sucked because of the length of the pipeline and its reduced ability to execute certain uOps in parallel. The long pipeline introduces stalls and increases the amount of work lost when a stall occurs. Not being able to process as many uOps per cycle could very well mean that certain x86 instructions require more clock cycles to complete on BD than on K10 or on an Intel CPU. None of which has to do with the FPU.

I won't deny that the CPU's FP performance is lower than it would be with 8 dedicated FPUs. The question is whether the CPU would suck less if it had them, and I don't think that's the case. There are a lot of things wrong with the architecture, and the shared FPU isn't even among the biggest issues, in my opinion. AMD went with slimmed-down cores in order to add more of them, which was a fatal mistake.
 

FordGT90Concept

"I go fast!1!11!1!"
this is just stupid :/
have an 8 core processor and its definitely better at multi threaded stuff than others. Plus it cost a fraction of what intel has to offer.
Benchmarks disagree. Quad-core Intel processors handily beat "8-core" AMD at virtually everything multithreaded. You put an 8-core Intel up against AMD's "8-core," AMD gets slaughtered. You don't have an "8-core" processor, you have a quad-core processor accepting 8 threads. Hell, you can even put an older AMD 6-core up against these AMD "8-core" processors and the 6-core will give the "8-core" a severe lashing.

I don't think there is really anything technically stopping an integer core on a Bulldozer CPU from being disabled. Each integer core has its own scheduler, so technically you just have to disable that scheduler and the integer core will be effectively disabled as well.
"Think." AMD would have done it if they could.

On top of this, the integer cores can in fact do all the work that the FPU does (just much slower).
You're talking about software emulation. That's a moot argument because all x86 processors on the planet can emulate floating point through software.
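
For anyone wondering what floating point done purely with integer operations looks like, here is a rough sketch in Python of a single-precision (IEEE 754 binary32) multiply. The function names are made up for illustration, and it only handles normal numbers (no zeros, denormals, infinities or NaNs, and no overflow/underflow handling). `struct` is used only to get bit patterns in and out; the multiply itself uses nothing but integer shifts, masks, adds and a multiply.

```python
import struct

def float_to_bits(x):
    """Round a Python float to binary32 and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b):
    """Interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def soft_mul_f32(x, y):
    """Multiply two binary32 values using integer operations only."""
    a, b = float_to_bits(x), float_to_bits(y)
    sign = (a ^ b) >> 31                      # result sign is XOR of the signs
    ea, eb = (a >> 23) & 0xFF, (b >> 23) & 0xFF
    if ea in (0, 255) or eb in (0, 255):
        raise ValueError("sketch handles normal numbers only")
    ma = (a & 0x7FFFFF) | 0x800000            # restore the implicit leading 1
    mb = (b & 0x7FFFFF) | 0x800000
    prod = ma * mb                            # up to 48 bits wide
    exp = ea + eb - 127                       # add exponents, remove one bias
    shift = 23
    if prod >> 47:                            # product in [2, 4): renormalize
        shift = 24
        exp += 1
    mant = prod >> shift                      # 24-bit significand
    rem = prod & ((1 << shift) - 1)           # discarded low bits
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if rem > half or (rem == half and (mant & 1)):
        mant += 1
        if mant == 1 << 24:                   # rounding carried out of 24 bits
            mant >>= 1
            exp += 1
    if not 0 < exp < 255:
        raise ValueError("overflow/underflow not handled in this sketch")
    return bits_to_float((sign << 31) | (exp << 23) | (mant & 0x7FFFFF))
```

One hardware multiply instruction turns into a couple of dozen integer steps here, which is the "much slower" part of the argument. Real soft-float libraries also have to cover all the special cases this sketch skips.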

Actually, I'm pretty sure that is exactly what most are arguing and what this entire lawsuit is based on. The fact that the CPU only has 4 FPUs is the basis to claiming it is a 4 core CPU not an 8.
The lawsuit argues that there are not 8 discrete cores, only 4. The fact there are four FPUs is a technicality. The judge has to ask "did AMD mislead" by branding their FX-8### processors as "8-core" processors. They never say "integer cores" on the packaging; they leave out that keyword "integer." I don't see how a judge could rule in AMD's favor. AMD definitely did mislead the public.

It is capable of doing 8 things at the exact same time, something a 4-core CPU can never do (even with Hyper-Threading), so it is an 8-core CPU.
Hyperthreading does enable a quad core processor to do 8 integer operations at the same time if the conditions are met. This is how Intel quad-cores can best AMD's "8-core" processors at their own game.

AMD went with slimmed down cores in order to add more of them which was a fatal mistake.
Which plays into the narrative that AMD intended to mislead the public.
 

Aquinus

Resident Wat-man
Benchmarks disagree. Quad-core Intel processors handily beat "8-core" AMD at virtually everything multithreaded. You put an 8-core Intel up against AMD's "8-core," AMD gets slaughtered. You don't have an "8-core" processor, you have a quad-core processor accepting 8 threads. Hell, you can even put an older AMD 6-core up against these AMD "8-core" processors and the 6-core will give the "8-core" a severe lashing.
Once again, that's a result of single-threaded performance being garbage. It's not how much it outperforms Intel, it's how much improvement each added core contributes to multi-threaded performance. If scaling is near linear, I would argue that shared components aren't contributing to bad performance. Read #243. This also has arguably very little to do with a shared FPU and you've yet to provide any information that counters my argument other than repeating yourself.
The lawsuit argues that there are not 8 discrete cores, only 4. The fact there are four FPUs is a technicality. The judge has to ask "did AMD mislead" by branding their FX-8### processors as "8-core" processors. They never say "integer cores" on the packaging. They leave out that keyword "integer." I don't see how a judge could rule in AMD's favor. AMD definitely did mislead the public.
You don't have to; an x86 CPU can't run without integer cores. Period. End of story. It's not how x86 CPUs work. I know you made the argument that you can do all of this with an FPU, but then it wouldn't be an x86 CPU anymore, now would it?
Hyperthreading does enable a quad core processor to do 8 integer operations at the same time if the conditions are met. This is how Intel quad-cores can best AMD's "8-core" processors at their own game.
WRONG! It enables Intel CPUs to use the unused resources in the CPU, you're still limited by the hardware within a single core which means that if an instruction is eating up the resources, hyper-threading gets you nothing. It uses the core's resources more efficiently, nothing more, nothing less. If the "conditions are right," then it's probably not already using those resources.
 

FordGT90Concept

"I go fast!1!11!1!"
Everything said has been said before. I'll leave this here:
 

newtekie1

Semi-Retired Folder
"Think." AMD would have done it if they could.

Why do you say that? Intel hasn't done it and we know they can. And AMD stopped doing that with the first runs of Phenom II. They changed their strategy on odd numbers of cores in the late Phenom II days. Even though there were 6-core processors with a single bad core, they disabled two full cores and sold them as X4 processors. When Thuban came out in 2010, AMD stopped the odd core counts.

You're talking about software emulation. That's a moot argument because all x86 processors on the planet can emulate floating point through software.

No, it is not software emulation. Way before CPUs had FPUs, the x86 processors did the work. It is still hardware doing the work; it is not software emulation, it is just very slow at doing it. That is the idea of a general CPU core. It can do anything; it just isn't fast at doing it because it is built for general use. The integer cores on Bulldozer are general CPU cores that can do any type of work you ask of them.

The lawsuit argues that there are not 8 discrete cores, only 4. The fact there are four FPUs is a technicality. The judge has to ask "did AMD mislead" by branding their FX-8### processors as "8-core" processors. They never say "integer cores" on the packaging; they leave out that keyword "integer." I don't see how a judge could rule in AMD's favor. AMD definitely did mislead the public.

The integer core is a general x86 core. What they labeled as the integer core was considered an entire x86 processor at one point in time.

Hyperthreading does enable a quad core processor to do 8 integer operations at the same time if the conditions are met. This is how Intel quad-cores can best AMD's "8-core" processors at their own game.

No, they can't. They can only work on 4 integer operations at one time. What Hyper-Threading does is set the processor up to switch extremely quickly between those operations, to give the user the appearance of the integer core doing two things at the same time. HT keeps two operations waiting for the integer core, so when it finishes one operation it doesn't have to wait to execute the next. Clock cycles that would normally be wasted are actually used. The integer core can never work on two things at the exact same time. If the integer core could actually work on two integer tasks at the same time, integer throughput on a desktop i7 would be basically double that of an i5. That never happens; the difference is never even very big, because it is just using idle cycles to do extra work, not actually doing two things at once.

And Intel beats AMD with fewer cores because Intel's processors are way more powerful than AMD's. So even with 8 cores AMD can't top Intel. AMD's 6-core Phenoms couldn't beat Intel's 4 cores either; Intel's cores are just a lot faster.
 

Aquinus

Resident Wat-man
That's because the dispatcher pulls it apart into two different pipelines between the integer cores, whereas in Thuban the two integer units are parts of the same pipeline. So sure, for a moment it's one cohesive unit, but there comes a point where added hardware (the dispatcher, something K10 did not have) pulls them apart to enable them to operate independently, hence why it's called a module. The question is, what does that module house? This is something that Intel's SMT does not do. If you have twins conjoined at the hip with some shared organs, does that make them one child? I would argue that it doesn't.

My argument really boils down to two pipelines = two cores. 1 decode unit that can do 4 uOps or 2 decode units that can do 2 uOps each doesn't make a difference to me, it's doing the same thing.

Intel's HT is simply filling the gaps in the pipeline when the first thread isn't utilizing the entire thing in order to run a second thread; that's it. It's also why scaling is task dependent with HT. Scaling on FX CPUs, once again, tends to be almost linear, which isn't indicative of SMT-like behavior on a single core.
 

FordGT90Concept

"I go fast!1!11!1!"
What Hyper-Threading does is set the processor up to switch extremely quickly between those operations, to give the user the appearance of the integer core doing two things at the same time.
I see four ALUs (two INT, two FLOP):

It should be able to do one basic ALU operation per clock, per thread. More complex operations will cause one thread to be blocked so it would fall to one ALU operation per clock across two threads.

And I should stress that, in that image, the whole thing is a "core," not just the integer portion. You can't have a discrete x86 "core" without an x86 decoder.

Note that SMT increases latency which is why the performance gains are not very good.


Edit:
It's also why scaling is task dependent with HT however, scaling on FX CPUs, once again, tends to be almost linear which isn't indicative of SMT-like behavior on a single core.
http://www.overclock.net/t/1469255/fx-8350-trying-to-get-best-performance-per-core
Ultracarpet said:
Also I goofed around with disabling cores and such to see if there was a performance difference because apparently there was with bulldozer... there wasn't with piledriver. I got the exact same cinebench scores if I ran a 4 thread bench with all 8 cores enabled compared to a 4 thread bench with only 4 cores enabled.
Trying to find better benchmarks but I'm not coming up with them.
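
If anyone wants to try this kind of test without rebooting into the BIOS, here is a rough Python sketch that pins a benchmark process to chosen cores instead of disabling them. It is written under assumptions: `pick_cpus` and `pin_current_process` are made-up helper names, the sketch assumes the two cores of a module are numbered consecutively (0 and 1, 2 and 3, and so on), which you should verify on your own system, and `os.sched_setaffinity` only exists on Linux.

```python
import os

def pick_cpus(n_threads, cores_per_module=2, spread=True):
    """Choose CPU ids for n_threads worker threads.

    spread=True  -> one core per module (0, 2, 4, ...), so nothing is shared
    spread=False -> fill modules in order (0, 1, 2, ...), so siblings share
    ASSUMPTION: sibling cores in a module have consecutive CPU ids.
    """
    if spread:
        return [i * cores_per_module for i in range(n_threads)]
    return list(range(n_threads))

def pin_current_process(cpus):
    """Restrict this process to the given CPU ids; returns False if unsupported."""
    if hasattr(os, "sched_setaffinity"):   # Linux-only API
        os.sched_setaffinity(0, set(cpus))
        return True
    return False
```

Running the same 4-thread benchmark once after `pin_current_process(pick_cpus(4))` and once after `pin_current_process(pick_cpus(4, spread=False))` would compare four threads on four modules against four threads packed into two modules, which is the comparison that actually shows how much the shared front end and FPU cost.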
 

Aquinus

Resident Wat-man
It should be able to do one basic ALU operation per clock, per thread. More complex operations will cause one thread to be blocked.
That's not how hyper-threading works. Hyper-threading utilizes unused parts of the pipeline to run that second thread; it doesn't do any parallel execution on a single stage. What it can do is execute multiple uOps of the same kind on the ALU at any given time. That is to say, if you have 3 of the same uOps in a row on data that isn't dependent on earlier results, the CPU can execute them in parallel to some extent. You're conflating instruction-level parallelism (parallel uOps) and thread-level parallelism (parallel instructions). Two very different things. Simply put, the only time uOps can be executed in parallel on the ALU is when they're the same uOp. You can't add and sub at the same time.

With that said, most modern super-scalar CPUs already handle instruction-level parallelism internally when instructions are decoded.

Side note, back in the day on older x86 processors, there were a lot fewer bells and whistles and the core of an x86 EU was the part you said isn't a core. ;)
 