Friday, December 2nd 2022
AMD Readies 16-core, 12-core, and 8-core Ryzen 7000X3D "Zen 4" Processors
AMD is firing on all cylinders to release a new line of Ryzen 7000-series "Zen 4" Socket AM5 desktop processors featuring 3D Vertical Cache. Faced with a significant drop in demand due to the slump in the PC industry, and renewed competition from Intel in the form of its 13th Gen Core "Raptor Lake" processors, the company is looking to launch the Ryzen 7000X3D desktop processors in January 2023, with a product unveiling expected at AMD's 2023 International CES event. The 3D Vertical Cache technology had a profound impact on the gaming performance of the older "Zen 3" architecture, bringing it up to levels competitive with those of the 12th Gen Core "Alder Lake" processors; and while the gaming performance of the Ryzen 7000 "Zen 4" processors launched to date matches or beats "Alder Lake," it falls behind that of the 13th Gen "Raptor Lake," which is exactly what AMD hopes to remedy with the Ryzen 7000X3D series.
In a report, Korean tech publication Quasar Zone states that AMD is planning to release 16-core/32-thread, 12-core/24-thread, and 8-core/16-thread SKUs in the Ryzen 7000X3D series. These would use one or two "Zen 4" chiplets with stacked 3D Vertical Cache memory. A large amount of cache memory operating at the same speed as the on-die L3 cache is made contiguous with it and stacked on top of the region of the CCD (chiplet) that has the L3 cache, while the region with the CPU cores has structural silicon that conveys heat to the surface. On "Zen 3," the 32 MB on-die cache is appended with 64 MB of stacked cache memory operating at the same speed, giving the processor 96 MB of L3 cache that's uniformly accessible by all CPU cores on the CCD. This large cache memory positively impacts gaming performance on the Ryzen 7 5800X3D in comparison to the 5800X, and a similar uplift is expected for the 7000X3D series over their regular 7000-series counterparts.

The naming of these 7000X3D series SKUs is uncertain. It's possible that the 16-core part will be called the 7950X3D and the 12-core part the 7900X3D, while the 8-core part may be called either the 7700X3D or the 7800X3D. Quasar Zone also posted some theoretical performance projections for the 7950X3D based on the kind of performance uplifts 3DV Cache yielded for "Zen 3" in the 5800X3D. According to these, the theoretical 7950X3D would easily match or beat the gaming performance of the Core i9-13900K, which begins to explain why Intel is scrambling to launch the faster Core i9-13900KS with a boost frequency of 6.00 GHz or higher. The report also confirms that there won't be a 6-core/12-thread 7600X3D as previously thought.
Source:
harukaze5719 (Twitter)
153 Comments on AMD Readies 16-core, 12-core, and 8-core Ryzen 7000X3D "Zen 4" Processors
The 5600 is, IMO, the best choice for 99% of gamers out there - you need a serious GPU for it to ever even be a limit. That's because it helps gaming and server applications; there's very little overlap.
Because the 5800X3D does run at lower clocks (overall mine runs 4.45 GHz all-core vs. 4.6 GHz in AVX workloads, soooo much slower), it'd be hard to advertise it as an all-purpose product when it'd have a deficit in some commonly used setups.
What'd be amazing is if AMD had the pull Intel does with Microsoft, and could release a CPU with one 3D-stacked die and others without - the 3D die becomes the P-cores, and the others do the higher-wattage, boring workloads.
With fast games pushing tick rates of 120 Hz (8.3 ms) or higher, individual steps in the pipelined simulation can have a performance budget of 1 ms or much less, which leaves very small margins for delays caused by threading. At this fine level, splitting a task across threads may actually hurt performance, especially when it comes to stutter. It may even cause simulation glitches or crashes, as we've seen in some games. For this reason, even heavily multithreaded games usually run separate tasks on separate threads, e.g. ~2-3+ threads for GPU interaction, usually 1 for the core game simulation, 1 for audio, etc. Multithreading has been common in games since the early 2000s, and game engines still to this day have control over what to run on which cores. Current graphics APIs certainly can accept commands from multiple threads, but to what purpose? This just leaves the driver with the task of organizing the commands, and they're shoved into a single queue of operations internally either way. The purpose of multiple threads with a GPU context is to execute independent queues, e.g. multiple render passes, compute loads, asset loading, or possibly multiple viewports (e.g. split screen).
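To make that "dedicated thread per task" layout concrete, here's a minimal sketch of my own (not taken from any particular engine): one simulation thread on a fixed ~8.3 ms tick, one render-submission thread, and one audio thread, each running independently instead of splitting a single tick across a pool. All function names are illustrative placeholders.
[CODE]
// Minimal sketch of the "one dedicated thread per task" layout described above.
// Names (runSimulation, runRenderSubmit, runAudio) are placeholders, not real engine code.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

using namespace std::chrono;

std::atomic<bool> running{true};

void runSimulation() {                        // core game simulation, fixed ~120 Hz tick
    const auto tick = microseconds(8333);     // 8.3 ms budget per tick
    while (running) {
        const auto start = steady_clock::now();
        // ... advance game state here ...
        std::this_thread::sleep_until(start + tick);
    }
}

void runRenderSubmit() {                      // builds and submits GPU work, paced by the driver/vsync
    while (running) {
        // ... record and submit command buffers here ...
        std::this_thread::sleep_for(milliseconds(8));
    }
}

void runAudio() {                             // fills audio buffers on its own cadence
    while (running) {
        // ... mix the next audio block here ...
        std::this_thread::sleep_for(milliseconds(10));
    }
}

int main() {
    std::thread sim(runSimulation), render(runRenderSubmit), audio(runAudio);
    std::this_thread::sleep_for(milliseconds(100));   // pretend the game runs for a moment
    running = false;
    sim.join(); render.join(); audio.join();
    std::puts("all task threads joined");
}
[/CODE]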
Usually, when a game's rendering is CPU bottlenecked, the game is either not running the API calls effectively or using the API "incorrectly". In such cases, the solution is to move the non-rendering code out of the rendering thread instead of creating more rendering threads. Data cache line hits in L3 from other threads are (comparatively) rare, as the entire cache is overwritten every few thousand clock cycles, so the window for a hit here is very tiny.
Instruction cache line hits from other threads in L3 are more common, however, e.g. if multiple threads execute the same code but on different data. Actually, this misconception is why we see so little gain from 3D V-Cache. The entire L1/L2/L3 hierarchy is overwritten many times in the lifecycle of a single frame, so there is no possibility of benefits here across frames.
We only see some applications and games benefit significantly from massive L3 caches, and it's usually not the ones which are the most computationally intensive; this is because of instruction cache hits, not data. When software is sensitive to instruction cache, it's usually a sign of unoptimized or bloated code. For this reason I'm not particularly excited about extra L3 cache, as this mainly benefits "poor" code. But when the underlying technology eventually is used to build bigger cores with varying features, now that's exciting.
The data proves what I'm saying. Most games today just map the processor topology for things like CCDs (to schedule everything on one CCD on a Ryzen 9, for example), E-cores, SMT, etc. They don't really care whether a job runs on core X or core Y. Otherwise, they would have to code their game for every variation of CPU: what if there is no SMT, what if there are only 4 cores, and so on.
And still, the main thread remains the bottleneck in most cases. Otherwise you would see 20-25% gains going from 6 cores to 8 cores in a CPU-limited scenario, and that is still not the case. Also, if threads were assigned statically by the engine, games would run really poorly on newer-gen CPUs with fewer cores (like the 7600X). But even today, that CPU kicks the ass of many other CPUs with more cores, because core count doesn't matter; only raw performance does. The driver doesn't decide what to do on its own. The CPU still has to send commands to the driver, and this is the part that is now more easily multithreadable with modern APIs.
Beyond that, all drivers are more or less multithreaded. One of the main advantages Nvidia had with older APIs, where it was harder to write multithreaded code, was that their drivers were more multithreaded, though they also had higher CPU overhead. AMD caught up in June with its older DirectX and OpenGL drivers. But anyway, once the driver has done its job, all the rest is done by the GPU and the number of cores doesn't really matter. But is it? This is an oversimplification. It can be true in some cases, but not always. This is a discourse I hate to read. It's just plain false, and most people don't really understand what a game is doing.
If we take your reasoning to its conclusion, it means we could have a game with perfect physics, perfect AI, and photorealism with an unlimited number of assets, and if the game doesn't run properly, it's just because the game doesn't use the API "correctly". It's not a misconception at all. It may be true that L1/L2 are overwritten many times in the lifecycle of a single frame, but that is what they are intended for: they need to cache data for a very short amount of time, and many cycles can still happen in that time.
The fact that caches are overwritten is not an issue, as long as they aren't overwritten while they are still required. But anyway, CPUs have mechanisms to predict what needs to be in cache, what needs to stay in cache, and so on. They look ahead at the code and see which portions of memory will be needed. One misconception we may have is to think in terms of cache and data: caches don't cache data, they cache memory space.
The CPU detects which memory regions it needs to continue working and prefetches them. There are other mechanisms in the CPU that decide whether some memory regions need to remain cached because they are likely to be reused.
Having more cache allows you to be more flexible and balance those things quite easily. Also, a CPU might still reuse some data hundreds of times before it gets purged from the L3 cache.
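As a toy illustration of that "caches hold memory space, fetched ahead of use" idea (my own sketch, nothing from this thread), the loop below walks an array far larger than any L3 and requests the line a few dozen elements ahead of the current one, which is roughly what the hardware prefetcher does on its own; __builtin_prefetch is a GCC/Clang builtin, and the distance of 64 elements is an arbitrary choice.
[CODE]
// Toy software-prefetch example: pull future cache lines toward the core before they're needed.
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    std::vector<std::uint64_t> data(1 << 24, 1);    // ~128 MB, far larger than any L3
    const std::size_t ahead = 64;                   // prefetch distance, in elements (arbitrary)
    std::uint64_t sum = 0;

    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i + ahead < data.size())
            __builtin_prefetch(&data[i + ahead]);   // hint: bring this memory line into cache
        sum += data[i];                             // the actual work touches lines already fetched
    }
    std::printf("sum = %llu\n", static_cast<unsigned long long>(sum));
}
[/CODE]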
The working set for a single frame must remain small anyway. In an ideal scenario, let's say you have 60 GB/s of bandwidth and you want 60 FPS: that means that even if you could access all your data instantaneously, you cannot read more than 1 GB of data during that frame (unless, indeed, you cache it).
If you add the wait for each memory access on top of that (a typical access is a few bytes, and you wait 50-70 ns to get it), it's probably more like 200-300 MB per frame.
You can see that with a larger cache, you can now cache a significant portion of it.
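Spelling that back-of-the-envelope math out with the numbers above (the "~14 misses in flight" figure is my own assumption about memory-level parallelism, not something stated in the post):
[CODE]
// Per-frame memory budget at 60 GB/s and 60 FPS, per the numbers above.
#include <cstdio>

int main() {
    const double bandwidth_gb_s = 60.0;          // assumed DRAM bandwidth
    const double frame_s        = 1.0 / 60.0;    // 60 FPS target

    // Ceiling: bandwidth alone caps per-frame traffic at ~1 GB.
    const double ceiling_mb = bandwidth_gb_s * 1e9 * frame_s / 1e6;

    // Floor: fully serialized 64-byte misses at ~60 ns each.
    const double serial_mb = (frame_s / 60e-9) * 64.0 / 1e6;

    std::printf("bandwidth ceiling:         %.0f MB/frame\n", ceiling_mb);        // ~1000 MB
    std::printf("fully serialized misses:   %.0f MB/frame\n", serial_mb);         // ~18 MB
    std::printf("with ~14 misses in flight: %.0f MB/frame\n", serial_mb * 14.0);  // ~250 MB
}
[/CODE]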
Really looking forward to seeing benchmarks and prices.
Delays may happen any time the OS scheduler kicks in one of the numerous background threads. Let's say, for example, you have a workload of 5 pipelined stages, each consisting of 4 work units, and a master thread divides these among 4 worker threads. If each stage has to be completed by all threads and then synced up, then a single delay will stall the entire pipeline. This is why thread pools scale extremely well with independent work chunks and not well otherwise. While requesting affinity is possible, I seriously doubt most games are doing that. That makes no sense; that's not how performance scaling in games works. My wording was incorrect, I meant to say "on which thread", not which (physical) core, which misled you.
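A minimal model of that pipeline, written by me purely to illustrate the point (C++20 for std::barrier and std::jthread): 5 stages, 4 workers, a barrier between stages, and one artificial 2 ms "preemption" so you can see the whole pipeline pay for its slowest worker.
[CODE]
// 5 pipelined stages split across 4 workers; every stage waits on its slowest worker.
#include <barrier>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    constexpr int workers = 4, stages = 5;
    std::barrier<> sync(workers);                 // all workers must arrive before the next stage

    auto worker = [&](int id) {
        for (int s = 0; s < stages; ++s) {
            // ... do this worker's quarter of stage s ...
            if (id == 0 && s == 2)                // pretend worker 0 gets preempted here
                std::this_thread::sleep_for(std::chrono::milliseconds(2));
            sync.arrive_and_wait();               // everyone stalls until the last worker arrives
        }
    };

    const auto t0 = std::chrono::steady_clock::now();
    std::vector<std::jthread> pool;
    for (int i = 0; i < workers; ++i) pool.emplace_back(worker, i);
    pool.clear();                                 // jthread destructors join all workers
    const double ms = std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - t0).count();
    std::printf("pipeline took %.1f ms (one 2 ms hiccup delayed every thread)\n", ms);
}
[/CODE]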
The OS scheduler clearly decides where to run the threads. The current graphics APIs are fairly low overhead. Render threads are not normally spending the majority of their CPU time with API calls. Developers will easily see this with a simple profiling tool. This is just silly straw man argumentation. I said no such thing. :rolleyes:
I was talking about rendering threads wasting (CPU) time on non-rendering work. L3 (in current Intel and AMD CPUs) is a spillover cache for L2, meaning anything evicted from L2 will end up there. If a CPU has 100 MB of L3 cache, and the CPU reads ~100 MB of (unique) data from memory, then anything in L3 which hasn't been promoted into L2 will be overwritten after this data is read. It is a misconception that "important data" remains in cache. The prefetcher does not preserve/keep important cache lines in L3, and for anything to remain in cache it needs to be reused before it's evicted from L3 too. Every cache line the prefetcher loads into L2 (even one which is ultimately not used) will evict another cache line. Considering the huge amount of data the CPU is churning through, including all the data it ultimately prefetches unnecessarily, the hit rate of extra L3 cache falls off very rapidly. Other threads, even low-priority background threads, also affect this. While one or more threads are working on the rendering, other threads will process events, do game simulation, etc. (plus other background threads), all of which will traverse different data and code, which will "pollute" (or rather compete over) the L3.
What you describe is how the OS would switch between threads if resources are limited (no more cores/SMT threads available) and it needs to switch between threads to get the job done faster.
If the OS were really that slow to assign threads, everything you do on your computer would feel awfully slow; that would leave so much performance on the table. It's not affinity, it's knowing how many threads to launch to get some balance. That indeed makes no sense for game scaling, but it is how it works for things that aren't bottlenecked by a single thread, like 3D rendering. Agree! They are lower overhead, but that doesn't mean zero, and having lower overhead lets you do more with the same resources. That was unclear, but if a game indeed wastes too much time on useless things, that is indeed bad... You are right that L3 is a victim cache and will contain previously used data (which may or may not have been fetched by the prefetcher). Having a larger L3 cache allows you to have a more aggressive prefetcher without too much penalty.
Indeed, the benefits of cache are not linear. But tripling it will give a significant performance gain when the code frequently reuses the same data, and that is what games usually do. Your concern is also totally valid, though, because many other applications just load new data all the time and barely reuse it, making the extra L3 cache almost useless.
But at the processor's time scale (not a human time frame), in the best case where the 6 cores read from memory at a combined 60 GB/s, it would take about 1.5 ms to flush the cache with fresh data. 60 GB/s is about what DDR4-3800 gives you in a theoretical benchmark. With fast DDR5 like DDR5-6000, that would be close to 90 GB/s, meaning the cache would be flushed every ~1 ms.
This is indeed close to the time frame a game takes to render a frame, and it's a very theoretical scenario. Also, a read from main memory takes 45-100 ns, whereas an L3 hit only takes about 10 ns, and the bandwidth is also about 10 times higher. You don't need a very high hit ratio within a 16.6 ms frame to get a significant gain, and every cache hit leaves the memory controller free to perform other memory accesses.
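Writing that arithmetic out with the same assumed figures (96 MB of total L3, ~60 GB/s for DDR4-3800, ~90 GB/s for DDR5-6000):
[CODE]
// How long sustained DRAM traffic takes to turn over a 96 MB L3, per the numbers above.
#include <cstdio>

int main() {
    const double l3_mb = 96.0;                    // 5800X3D-style 32 MB on-die + 64 MB stacked
    const double speeds_gb_s[] = {60.0, 90.0};    // ~DDR4-3800 vs ~DDR5-6000, theoretical

    for (double gb_s : speeds_gb_s) {
        const double ms = (l3_mb * 1e6) / (gb_s * 1e9) * 1e3;
        std::printf("%5.0f GB/s -> L3 contents turned over every ~%.1f ms\n", gb_s, ms);
    }
    std::printf("per access: ~10 ns for an L3 hit vs ~45-100 ns from DRAM\n");
}
[/CODE]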
Even if you are running more tasks, background tasks, etc., you will still get cache hits at some point. A lot of that work will reuse the same data even within a thread, making the L3 worthwhile. And the thing is, the L3 doesn't cache "things", it caches memory lines, and memory lines hold both instructions and data. Those instructions can be reused as part of a loop many times. Smaller loops will happen entirely within the registers or L1, but some take longer (and/or have branches that make them longer to run) and benefit from having L2 and L3.
Again, the data proves this point: even though it clocks slightly lower than the 5800X, the 5800X3D beats it in almost every game.
There are two scenarios where the extra cache will not be beneficial in gaming (in a CPU-limited scenario):
1. A working set that is too large. In that case, due to memory latency, you're talking about a game with very low FPS that is memory-bandwidth limited.
2. A game with a much smaller dataset that fits, or almost fits, in the 32 MB L3 of the 5800X, making the extra cache useless.
In all reality, I will probably still wait for a next-gen 8950X3D just to get past the first-AM5-gen hiccups and get better ecosystem stability. My R7 2700/GTX 1070 is basically adequate for now. :twitch: I am used to the issues it has with my workflows and can hopefully get by with it until I can go big on a whole new system (actually need to replace the 1070 soon. :cry: Not sure it can keep going too much longer). Hopefully an RX 8950 XTX is available then too and I can do a matching system of 8950's... that would make my brain happy. :laugh:
We'll see... looking forward to RDNA3 in a few days and will see how things shape up for my video business in the new year. It is a side gig right now, but it's building.
I know you were specifically referencing "frametimes", but I thought the above was necessary for the less informed reader. As for frametime analysis:
[URL='https://www.techpowerup.com/review/amd-ryzen-5-7600x/23.html'][SIZE=4][U]7600X frametime analysis[/U][/SIZE][/URL]
[URL='https://www.techpowerup.com/review/amd-ryzen-7-7700x/23.html'][SIZE=4][U]7700X frametime analysis[/U][/SIZE][/URL]
I don't see anything of significance to axe 6-core gaming parts as "silly" in "2023". Same goes for the previous-gen 5600X/12600K... still fantastic for getting your feet wet with smooth gameplay. Moreover, not everyone is willing to fork out $400-$500 for a flagship gaming chip regardless of the performance feat... a more affordable 6-core X3D would have been nice, and a tempting green card to get on board an already expensive DDR5 AM5 platform. I get the feeling the 7600X3D will be something expected a little later, after AMD has exhausted its current elevated Zen 3 sales (5600~5800X3D).

Think of the amount of data vs. the amount of code that the CPU is churning through in a given timeframe. As you hopefully know, most games have vastly less code than memory allocated, and even with system libraries etc., data is still much larger than code. On top of this, any programmer will know most algorithms execute code paths over and over, so the chances of a given cache line getting a hit are at least one order of magnitude higher if the cache line is code (vs. data), if not two. On top of this, the chance of a core getting a data cache hit from another core is very tiny, even much smaller than a data cache hit from its own evicted data cache line. This is why programmers who know low-level optimization know that L3 cache sensitivity often has to do with the instruction cache, which in turn is very closely tied to the computational density of the code (in other words, how bloated the code is). This is why so few applications get a significant boost from loads of extra L3; they are simply too efficient, which is actually a good thing, if you can grasp it.
And as for your idea of running entire datasets in L3, this is not realistic in real-world scenarios on desktop computers. There are loads of background threads which will "pollute" your L3, and even just a web browser with some idling tabs will do a lot. For your idea to "work" you need something like an embedded system or a controlled environment. Hypothetically, if we had a CPU with L3 split into instruction and data caches, now that would be a bit more interesting.
We are in the process of watching AMD stupidly snatch defeat from the jaws of victory. :kookoo:
But some will, and they are going to get their high margins, which is the only thing they care about.
They do not want to sell 6-core CPUs, because they are using 8-core chiplets. And I doubt many of those super tiny chiplets are defective, probably only some of the ones on the very edge of the wafer.
A 7600X3D would hinder the sales of the 8-core X3D model. They would sell millions of them, but their margins would be super low, and that goes against everything corporations believe in.
Every now and then each company will have a product with unbelievable value, but it will never become a common thing, because the big heads will not allow it.
You're essentially suggesting anyone purchasing these chips has no intention of gaming? You can't ignore MT specialists being as human as you and I - there's a nice chunk of these workstation-stroke-gaming pundits who will definitely buy into these chips. I myself would fancy something like a 7900X3D for gaming, transcoding and the occasional video render... although since these non-gaming tasks are irregular, I'd save my money and grab a 7700X3D (not that I'm looking to buy into AM5 unless DDR5/mobo prices shrink).
I saw that the X3D is good in Flight Simulator.
I have an A320 motherboard because I came across one that was so hilarious that I had to buy it. Canada Computers had the Biostar A320M on clearance for only CA$40! My first build back in 1988 had a Biostar motherboard, so I have a soft spot in my heart for them. This board looked so pathetic that I just couldn't say no to it. It was actually kinda cute.
It doesn't even have an M.2 port! The thing is though, it cost CA$40, and IT WORKS! You can't go wrong there.

Well, those big heads need to be given a shake, because that's how you gain an advantageous market position. It's the difference between strategy and tactics. Just look at what nVidia did with the GTX 1080 Ti. Was it a phenomenal value? Absolutely! Is it one of the major reasons why nVidia is seen as the market leader? Absolutely! The long-term benefits typically outweigh the short-term losses. This isn't some new revelation either, so those "big heads" must be pretty empty if they can't comprehend this.

I do everything and anything on my PC too. The thing is, the most hardware-hardcore tasks that I do are gaming tasks. For everything else, hell, even my old FX-8350 would probably be fast enough. It's not like Windows or Firefox have greatly increased their hardware requirements in the last ten years.

Yep. From what I understand, the R7-5800X3D is currently the fastest CPU for FS2020. God only knows what kind of performance a Zen 4 X3D CPU would bring. At the same time, I don't think that any performance advantage over the 5800X3D would be of significant value, because it already runs FS2020 perfectly and you can't get better than perfect.

So you opened his eyes to what is possible! See, I love reading things like this, and while I'm glad that you did what you did (because his life may never be the same now, which is a good thing), it also serves to underscore just how oblivious a lot of workstation users are when it comes to gaming on a PC. I believe that most of the 12 and 16-core parts get bought by businesses for their own workstations (assuming that they don't need anything like Threadripper or EPYC), and gaming is far from their desired use. In fact, I'd be willing to bet that Zen 4's IGP makes those CPUs even more attractive to businesses, because how much GPU power does an office app use?

I agree. The R5-5600 is a tremendous value, but only in the short term. In the long term, the R7-5800X3D will leave it in the dust. It's like back when people were scooping up the R3-3100X faster than AMD could make them. They were an unbelievable value at the time, but, like most quad-cores, their legs proved to be pretty short over the years.

Well, here's the catch... Technically, ALL x86 CPUs are all-purpose products. There's nothing that the R9-7950X can do that my FX-8350 can't. It's just a matter of how fast, and so for certain tasks, there are CPUs that are better than others. However, that doesn't change the fact that they can all do everything that any other x86 CPU can do. That does meet my definition of all-purpose.
Q: Can an R7-5800X3D run blender?
A: Yep.
Q: Can it run ARM applications?
A: Yep.
Q: Can it run Adobe Premiere?
A: Yep.
Q: Can it run a virtual machine?
A: Yep.
Q: Can it run MS-Office or LibreOffice?
A: Yep.
Q: Can it run a multimedia platform?
A: Yep.
Q: Can it run a local server?
A: Yep.
Q: Can it run 7zip?
A: Yep.
Q: Can it run DOOM?
A: Tell me one thing that CAN'T run DOOM!
And finally...
Q: CAN IT RUN CRYSIS?
A: Yep.
Ok, so yeah, it's an all-purpose CPU! That's why I think AMD made a serious screwup by not having a hexacore 3D CPU. The reason that Intel has that much pull with MS is the fact that most Windows PCs still have an Intel CPU at their heart. If AMD managed to conquer the gamer market with an R5-7600X3D, then they too would have considerable pull with Microsoft, far more than they have now anyway.
I do however agree that an R5-5600X3D would've made more sense than the R7-5800X3D. The thing is, I gave them a pass because it was their first "kick at the can", so to speak. I won't give them the same pass with AM5 because the 5800X3D showed that they clearly got it right. It's not just that they're being greedy, it's that they're being stupid. I really don't think that the 3D cache has enough of a positive impact on productivity for productivity users to be willing to pay extra for it, so these chips will just gather dust like the RTX 4080 and RX 7900 XT. That's not good for anybody.
The trouble with a 6-core X3D, 5K or 7K series, is that the X3D bit commands a $150 price premium - would you have bought a $400 6-core? I think enough people were complaining about the low core count/MT performance of the 5800X3D.