• Welcome to TechPowerUp Forums, Guest! Please check out our forum guidelines for info related to our community.

Possible AMD "Vermeer" Clock Speeds Hint at IPC Gain

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,315 (7.52/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
The bulk of AMD's 4th generation Ryzen desktop processors will comprise of "Vermeer," a high core-count socket AM4 processor and successor to the current-generation "Matisse." These chips combine up to two "Zen 3" CCDs with a cIOD (client I/O controller die). While the maximum core count of each chiplet isn't known, they will implement the "Zen 3" microarchitecture, which reportedly does away with CCX to get all cores on the CCD to share a single large L3 cache, this is expected to bring about improved inter-core latencies. AMD's generational IPC uplifting efforts could also include improving bandwidth between the various on-die components (something we saw signs of in the "Zen 2" based "Renoir"). The company is also expected to leverage a newer 7 nm-class silicon fabrication node at TSMC (either N7P or N7+), to increase clock speeds - or so we thought.

An Igor's Lab report points to the possibility of AMD gunning for efficiency, by letting the IPC gains handle the bulk of Vermeer's competitiveness against Intel's offerings, not clock-speeds. The report decodes OPNs (ordering part numbers) of two upcoming Vermeer parts, one 8-core and the other 16-core. While the 8-core part has some generational clock speed increases (by around 200 MHz on the base clock), the 16-core part has lower max boost clock speeds than the 3950X. Then again, the OPNs reference A0 revision, which could mean that these are engineering samples that will help AMD's ecosystem partners to build their products around these processors (think motherboard- or memory vendors), and that the retail product could come with higher clock speeds after all. We'll find out in September, when AMD is expected to debut its 4th generation Ryzen desktop processor family, around the same time NVIDIA launches GeForce "Ampere."



View at TechPowerUp Main Site
 
Joined
Mar 16, 2017
Messages
2,170 (0.76/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Vermeer is not just the name of a Baroque artist, but it also a company that makes commercial-grade wood chippers. Will the 4000 series grind up heavy work with ease? :)
 
Joined
Dec 30, 2010
Messages
2,202 (0.43/day)
They need to work out that 7nm proces so that all core clocks of now boost levels should be overcome. When AMD can accomplish that, then it's pretty much bye bye intel.

It's always bin low current > high clocks OR high current > low clocks. But a FX could easily consume up to 220W and even 300W in extreme conditions easily.
 
Joined
Mar 31, 2012
Messages
863 (0.19/day)
Location
NL
System Name SIGSEGV
Processor AMD Ryzen 9 9950X
Motherboard MSI MEG ACE X670E
Cooling Noctua NF-A14 IndustrialPPC Fan 3000RPM | Arctic P14 MAX
Memory Fury Beast 64 Gb CL30
Video Card(s) TUF 4090 OC
Storage 1TB 7200/256 SSD PCIE | ~ TB | 970 Evo | WD Black SN850X 2TB
Display(s) 27" /34"
Case O11 EVO XL
Audio Device(s) Realtek
Power Supply FSP Hydro TI 1000
Mouse g402
Keyboard Leopold|Ducky
Software LinuxMint
Benchmark Scores i dont care about scores
We'll find out in September, when AMD is expected to debut its 4th generation Ryzen desktop processor family, around the same time NVIDIA launches GeForce "Ampere."
perfect.
however, I would eagerly wait for AMD's Radeon next-gen GPUs offering.
 
Joined
Jul 18, 2017
Messages
575 (0.21/day)
Get rid of the crappy glue and put 8 cores in a single CCD and it might finally slay the legendary Skylake in gaming. We all know Ampere will increase GPU threshold in high res and AMD needs to do exactly this to close the gap. Can you imagine still losing to Skylake while using 7nm+ process this time around. Put 16 cores in a single CCD with Zen 4 and that should be a beast.
 
Joined
Jul 9, 2015
Messages
3,413 (0.99/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
legendary Skylake in gaming

XCOM: Chimera Squad at totally reasonable 1080p resolution, while equipped with $1.3k GPU:

1589949231224.png
 
Joined
Jul 18, 2017
Messages
575 (0.21/day)
XCOM: Chimera Squad at totally reasonable 1080p resolution, while equipped with $1.3k GPU:

View attachment 155930
Cool nitpick. Now let's look at multiple games in an average:

lawl.png



ttZT2gfoskvFfXD3DfyFcF-650-80.png.png



RTX 2080 Ti bottlenecks the fastest Skylake model while the fastest Zen 2 model clearly bottlenecks the RTX 2080 ti. You know Ampere is going to make Zen 2 look even worse right?
 
Joined
Jun 10, 2014
Messages
3,006 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Estimates for Zen 3's IPC gains are all over the place. Some claim Zen 3 is a minor improvement, while others claim it's a major architectural overhaul, but we'll see.
Nevertheless, IPC improvements is the area to focus on going forward, and any IPC improvement is appreciated.

They need to work out that 7nm proces so that all core clocks of now boost levels should be overcome. When AMD can accomplish that, then it's pretty much bye bye intel.
Intel's rated boost clock speeds are too optimistic, and they usually throttle quite a bit when the power limit kicks in. So in reality, with high sustained load on multiple cores, AMD often matches or exceeds Intel in actual clock speeds.

I don't think AMD should be pushing too hard on unstable boost speeds, what we need is good sustained performance. AMD needs to work on the areas where they fall behind Intel, primarily the CPU front-end and memory controller latency. The CPU front-end is one of the largest area of improvement in Sunny Cove over Skylake, so AMD needs to step up here.
 
Joined
Sep 17, 2014
Messages
22,757 (6.05/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
"Wait darling, are you seriously saying 'more IPC' now?"

1589961301079.png


"Indeed!"

1589961365027.png
 
Joined
Jun 26, 2015
Messages
69 (0.02/day)
Get rid of the crappy glue and put 8 cores in a single CCD and it might finally slay the legendary Skylake in gaming. We all know Ampere will increase GPU threshold in high res and AMD needs to do exactly this to close the gap. Can you imagine still losing to Skylake while using 7nm+ process this time around. Put 16 cores in a single CCD with Zen 4 and that should be a beast.
They already do have 8 cores in a CCD, perhaps you do not know what a CCD is ;)
 

ARF

Joined
Jan 28, 2020
Messages
4,670 (2.59/day)
Location
Ex-usa | slava the trolls
15-20% IPC and +200 MHz betterment, because the CCX will be now with 8 cores, up from just 4, and the caches will be unified for the whole CCd.
 
Joined
Jul 9, 2015
Messages
3,413 (0.99/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
Now let's look at multiple games in an average:

Note the following, stranger:

1) 1080p gaming on $1.3k makes no sense. We do it, to figure "how CPUs will behave in the future". It's an arguable theory, that assumes that future performance could be deducted by testing stuff at unrealistic resolutions
2) New games show completely different behavior. Games that actually do use CPU power (lot's of AI stuff going in that game) get us to that picture which is outright embarrassing to Intel

Mkay?
Low resolution tests of archaic games are only good for easing the pain of the blue fans.
On top of low resolution tests using the fastest card money can buy being questionable on its own.
 
Joined
May 15, 2020
Messages
697 (0.41/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
Note the following, stranger:

1) 1080p gaming on $1.3k makes no sense. We do it, to figure "how CPUs will behave in the future". It's an arguable theory, that assumes that future performance could be deducted by testing stuff at unrealistic resolutions

Just to make the devil 's advocate, it makes sense for competitive FPS players. The guys that use 240+ FPS monitors like this one: https://www.techpowerup.com/267368/alienware-announces-aw2521h-360hz-gaming-monitor
Other than that, I agree with you that more threads will probably age better in gaming, given the architecture of next-gen consoles.
 

Cheeseball

Not a Potato
Supporter
Joined
Jan 2, 2009
Messages
2,058 (0.35/day)
Location
Pittsburgh, PA
System Name Titan
Processor AMD Ryzen™ 7 7950X3D
Motherboard ASRock X870 Taichi Lite
Cooling Thermalright Phantom Spirit 120 EVO CPU
Memory TEAMGROUP T-Force Delta RGB 2x16GB DDR5-6000 CL30
Video Card(s) ASRock Radeon RX 7900 XTX 24 GB GDDR6 (MBA)
Storage Crucial T500 2TB x 3
Display(s) LG 32GS95UE-B, ASUS ROG Swift OLED (PG27AQDP), LG C4 42" (OLED42C4PUA)
Case Cooler Master QUBE 500 Flatpack Macaron
Audio Device(s) Kanto Audio YU2 and SUB8 Desktop Speakers and Subwoofer, Cloud Alpha Wireless
Power Supply Corsair SF1000
Mouse Logitech Pro Superlight 2 (White), G303 Shroud Edition
Keyboard Keychron K2 HE Wireless / 8BitDo Retro Mechanical Keyboard (N Edition) / NuPhy Air75 v2
VR HMD Meta Quest 3 512GB
Software Windows 11 Pro 64-bit 24H2 Build 26100.2605
1) 1080p gaming on $1.3k makes no sense. We do it, to figure "how CPUs will behave in the future". It's an arguable theory, that assumes that future performance could be deducted by testing stuff at unrealistic resolutions

Flawed argument. There are many that prefer 1080p at high FPS for competitive shooters (e.g. PUBG, R6S and Doom Eternal speedruns). That point would make sense if you're just playing and want to enjoy the visual fidelity.
 
Joined
Jun 10, 2014
Messages
3,006 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Other than that, I agree with you that more threads will probably age better in gaming, given the architecture of next-gen consoles.
This has been predicted for many years, but people fail to understand that it's the nature of the workload which dictates how it can scale across multiple threads.

While we probably will continue to see games use a little more threads in general, this is mostly for non-rendering tasks; audio, networking, video encoding, etc. It doesn't make sense do rendering (which consists of building queues for the GPU) over more than 1-3 threads, with each thread doing its separate task like a render pass, viewport, particle simulation or resource loading. While it is technically possible to have multiple threads build a single GPU queue, the synchronization overhead would certainly kill any perceived "performance advantage".

In the next years single thread performance will continue to be important, but only to the point where the GPU is fully saturated. So with the next generations from AMD and Intel we should expect Intel's gaming advantage to shrink a bit.
 
Joined
May 15, 2020
Messages
697 (0.41/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
This has been predicted for many years, but people fail to understand that it's the nature of the workload which dictates how it can scale across multiple threads.

While we probably will continue to see games use a little more threads in general, this is mostly for non-rendering tasks; audio, networking, video encoding, etc. It doesn't make sense do rendering (which consists of building queues for the GPU) over more than 1-3 threads, with each thread doing its separate task like a render pass, viewport, particle simulation or resource loading. While it is technically possible to have multiple threads build a single GPU queue, the synchronization overhead would certainly kill any perceived "performance advantage".

I'm a senior software engineer, so I also speak from experience. It is not the nature of the workload that dictates the scaling upon multiple threads, it is the way the software is written. 10-15 years ago software was completely monolithic, so the only way to use multiple cores or threads was to have multiple applications running at one time.

Since then, due to the plateau in frequency, in many domains, software has been written differently so that it can take advantage of massive parallelization. Of course, parallelization requires a shift in paradigm, and software, firmware and hardware advances. And it is much more difficult to write fully parallel software than monolithic, but it is feasible and it has already been done in many applications.
In gaming, until now there has not been a strong drive in this direction, because the average consumer computer thread count wasn't that high. But that is over with the PS5&co, these consoles have 8 cores clocked rather low. If games that are being written right now, would only use 2-3 cores, that means they would suck big time. So I'm pretty sure that next-gen games will be quite good at using multiple threads, and we will start feeling this in PC gaming in less than 2 year's time.
 
Joined
Jul 10, 2010
Messages
1,234 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
Estimates for Zen 3's IPC gains are all over the place. Some claim Zen 3 is a minor improvement, while others claim it's a major architectural overhaul, but we'll see.
Family 17h = Zen to Zen2
Family 19h = Zen3 speculatively to Zen4.

It is very much likely going to be an architectural overhaul within that of Bobcat to Jaguar overhauling at least;

Ex:
Bobcat dual-core (14h) => two separate L2s
Jaguar dual-core (16h) => one unified L2
::
Zen2 octo-core (17h) => two separate CCXs
Zen3 octo-core (19h) => one unified CCX
 
Last edited:
Joined
Jun 10, 2014
Messages
3,006 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I'm a senior software engineer, so I also speak from experience. It is not the nature of the workload that dictates the scaling upon multiple threads, it is the way the software is written.
<snip>
And it is much more difficult to write fully parallel software than monolithic, but it is feasible and it has already been done in many applications.
Since we're flashing credentials, so am I, with a thesis in graphics programming :)
That depends on your definition of being "fully parallel". If you have a workload of independent work chunks that can be processed without synchronization, you can scale almost linearly until you reach a bottleneck in hardware or software. This mostly applies to large workloads of independent chunks, and the overhead of thread communication is negligible because of the chunk size vs. time scale. Examples include large encoding jobs, web servers, software rendering etc.
On the opposite end of the spectrum are highly synchronized workloads, where any workload will reach the point of diminishing returns due to overhead as threading isn't free.
There is also instruction level parallelism, but that's a topic of its own.

In gaming, until now there has not been a strong drive in this direction, because the average consumer computer thread count wasn't that high. But that is over with the PS5&co, these consoles have 8 cores clocked rather low. If games that are being written right now, would only use 2-3 cores, that means they would suck big time. So I'm pretty sure that next-gen games will be quite good at using multiple threads, and we will start feeling this in PC gaming in less than 2 year's time.
These are common misconceptions, even among programmers. While games have been using more than one thread for a long time, using many threads for rendering haven't happened despite Xbox One and PS4 launching nearly 7 years ago with 8 cores.

Firstly games work on a very small time scale, e.g. 8.3ms if you want 120 Hz, there is very little room for overhead before you encounter serious stutter. Rendering with DirectX, OpenGL or Vulkan works by using API calls to build a queue for the GPU pipeline. The GPU pipeline itself isn't fully controlled by the programmer, but at certain points in the pipeline it executes programmable pieces of code called "shader programs"(the name is misleading, as it's much more than shading). While it is technically possible to have multiple GPU queues (doing different things) or even to have multiple threads cooperate building a single queue, it wouldn't make sense doing so since the API calls needs to be executed in order, so you need synchronization, and the overhead of synchronization is much more substantial than building the entire queue from a single thread. This is the reason why even after all these years of multi-core CPUs all games use 1 thread per GPU workload. Having a pool of worker threads to build a single queue makes no sense today or several years from now. If you need to offload something, you should offload the non-rendering stuff, but even then do limited synchronization, as the individual steps in a rendering lives within <1ms, which leaves very little time for constantly syncing threads to do a tiny bit of work.

As someone who has been using OpenGL and DirectX since the early 2000s, I've seen the transition from a fixed function pipeline to a gradually more programmable pipeline. The long term trend (10+ years) is to continue offloading the rendering logic to the GPU, hopefully one day achieving a completely programmable pipeline from the GPU. As we continue to take steps in that direction, the CPU will become less of a bottleneck. The need for more threads for games will be dictated by whatever non-rendering work the games needs.
 
Joined
Oct 12, 2005
Messages
718 (0.10/day)
Few things to consider:

- It do not mean that an app is not using 100% cpu on a 8 core chip that the 2-4 extra core vs a 4 or 6 core that the additional core aren't helpful. The goal is always to run a specific list of thing in the shortest timeframe. That may be run something that won't utilise 100% of a core for the whole frame on a different core. Overall the latency is reduced, but it won't use 100% of that core.

- Having more thread in a program add an overhead that require more power to overcome. A faster cpu will be able to overcome that better than a slower one.

- Latency is still king in game. Core to core latency is still something that need to be taken into consideration. And depending on the workload, that core to core latency can be transformed in a Core to L3 cache or Core to Memory latency, slowing things down quite a bit.

- AMD FX CPU had a lot of core thread and still do reasonably well in some title with frame time consistency (meaning no big fps drop), but that do not mean at all that they can run these games faster. just smoother with lower fps. They had an hard time against an intel 2500k at the time. A 2500k can have some difficulties with frame time consistency in modern title, but still deliver better average FPS in many title.

- on that subject, a 2600k witch is very similar to a 2500k do way better in many game these days than it did at launch. remove some minor MHz differency and the main difference is going from 4core/4thread to 4core/8thread.

So to recap, in my opinion. a 3950x right now might do better in the future when game developper will be used to the 8 core / 16 thread of the next gen console (right now they are on a 8 core/8 thread slow jaguar cpu). but a newer CPU with less thread but better IPC and frequency could also do a much better job at running these games.

this is why cpu like the 3300x and the 3600 make so much sense right now. i do not think, except on very specific case that these super high end parts are really worth it. If the CPU race is restarted, spending 300 buck every 1.5 years will give better results than spending 600 bucks for 3 + years.
 
Joined
Apr 16, 2019
Messages
632 (0.30/day)
Get rid of the crappy glue and put 8 cores in a single CCD and it might finally slay the legendary Skylake in gaming. We all know Ampere will increase GPU threshold in high res and AMD needs to do exactly this to close the gap. Can you imagine still losing to Skylake while using 7nm+ process this time around. Put 16 cores in a single CCD with Zen 4 and that should be a beast.
Yet that is exactly what I am expecting; granted, the gap will probably finally be in the single digits, but it will remain and what's worse (for AMD fan(bois) anyway) soon after comes Rocket Lake...
 
Joined
May 15, 2020
Messages
697 (0.41/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
Since we're flashing credentials, so am I, with a thesis in graphics programming :)
That depends on your definition of being "fully parallel". If you have a workload of independent work chunks that can be processed without synchronization, you can scale almost linearly until you reach a bottleneck in hardware or software. This mostly applies to large workloads of independent chunks, and the overhead of thread communication is negligible because of the chunk size vs. time scale. Examples include large encoding jobs, web servers, software rendering etc.
On the opposite end of the spectrum are highly synchronized workloads, where any workload will reach the point of diminishing returns due to overhead as threading isn't free.
There is also instruction level parallelism, but that's a topic of its own.
Any large workload can be parallelized, if it's big enough, it means it can be broken into pieces that can be dealt with separately. There is overhead for splitting work between worker threads and recomposing the result, but it can be optimized so that there are still gains from distributing workloads, people have been doing this for years, in all types of applications. It simply works, although it may be complicated.

These are common misconceptions, even among programmers. While games have been using more than one thread for a long time, using many threads for rendering haven't happened despite Xbox One and PS4 launching nearly 7 years ago with 8 cores.
Are you really trying to say that the PS5 will get by with doing most of the work on one 2GHz core?
Firstly games work on a very small time scale, e.g. 8.3ms if you want 120 Hz, there is very little room for overhead before you encounter serious stutter.
I imagine you do realize that 120Hz translates to over 8 million clock cycles on your average AMD APU SMT core?
Rendering with DirectX, OpenGL or Vulkan works by using API calls to build a queue for the GPU pipeline. The GPU pipeline itself isn't fully controlled by the programmer, but at certain points in the pipeline it executes programmable pieces of code called "shader programs"(the name is misleading, as it's much more than shading). While it is technically possible to have multiple GPU queues (doing different things) or even to have multiple threads cooperate building a single queue, it wouldn't make sense doing so since the API calls needs to be executed in order, so you need synchronization, and the overhead of synchronization is much more substantial than building the entire queue from a single thread. This is the reason why even after all these years of multi-core CPUs all games use 1 thread per GPU workload. Having a pool of worker threads to build a single queue makes no sense today or several years from now. If you need to offload something, you should offload the non-rendering stuff, but even then do limited synchronization, as the individual steps in a rendering lives within <1ms, which leaves very little time for constantly syncing threads to do a tiny bit of work.

As someone who has been using OpenGL and DirectX since the early 2000s, I've seen the transition from a fixed function pipeline to a gradually more programmable pipeline. The long term trend (10+ years) is to continue offloading the rendering logic to the GPU, hopefully one day achieving a completely programmable pipeline from the GPU. As we continue to take steps in that direction, the CPU will become less of a bottleneck. The need for more threads for games will be dictated by whatever non-rendering work the games needs.

I haven't looked at OpenGL code for quite a while, but I am sure that APIs have transitioned to more asynchronous workflows (because everything is becoming asynchronous these days, even good ol' HTTP), but that is not the main point here. I think what you do not understand is the fact that having all the calls to the graphic API coming from a single thread doesn't equate at all to the fact that that thread is doing all the computing.
 
Joined
Jul 9, 2015
Messages
3,413 (0.99/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
Joined
May 15, 2020
Messages
697 (0.41/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
That is fair enough, but note how different "niche gamers, that are into 240hz +" are from simply "gamers.
Frankly, I have no idea what percentage of the market they represent. I play a bit of Fortnite and I understand there's a small advantage at playing at very high framerates (even higher than the monitor), but I also happen to know lots of kids who play at 60Hz and play very well...
But, as usual, there's a Gaussian distribution, so it's rather normal to have the highest end of the market comprised of only 5%-10% of the total.
 
Joined
Apr 16, 2019
Messages
632 (0.30/day)
Just to make the devil 's advocate, it makes sense for competitive FPS players. The guys that use 240+ FPS monitors like this one: https://www.techpowerup.com/267368/alienware-announces-aw2521h-360hz-gaming-monitor
Other than that, I agree with you that more threads will probably age better in gaming, given the architecture of next-gen consoles.
Only up to a point. For instance, 3800x might and I repeat might offer slightly better experience than let's say 10600k at the very end of both chips' usability, like around 5 years from now, but both will be struggling by then, since single thread advancements will continue to be important despite what leagues of AMD fan(boy)s would tell you. 3900x (or 3950x for that matter) will never get you better (while still objectively good enough) framerates than 9900k / 10700k though, of that I am completely certain. A fine example are old 16 core Opterons compared even with 8 core FX chips (that clocked much better), not to mention something like a 2600k.
 
Top