Thursday, February 13th 2025

AMD to Build Next-Gen I/O Dies on Samsung 4nm, Not TSMC N4P

Back in January, we covered a report about AMD designing its next-generation "Zen 6" CCDs on a 3 nm-class node by TSMC, and developing a new line of server and client I/O dies (cIOD and sIOD). The I/O die is a crucial piece of silicon that contains all the uncore components of the processor, including the memory controllers, the PCIe root complex, and Infinity Fabric interconnects to the CCDs and multi-socket connections. Back then it was reported that these new-generation I/O dies were being designed on the 4 nm silicon fabrication process, which was interpreted as being AMD's favorite 4 nm-class node, the TSMC N4P, on which the company builds everything from its current "Strix Point" mobile processors to the "Zen 5" CCDs. It turns out that AMD has other plans, and is exploring a 4 nm-class node by Samsung.

This node is very likely the Samsung 4LPP, also known as the SF4, which has been in mass-production since 2022. The table below shows how the SF4 compares with TSMC N4P and Intel 4, where it is shown striking a balance between the two. We have also added values for the TSMC N5 node, from which the N4P is derived, and you can see that the SF4 offers comparable transistor density to the N5, and is a significant improvement in transistor density over the TSMC N6, which AMD uses for its current generation of sIOD and cIOD. The new 4 nm node will allow AMD to reduce the TDP of the I/O die and implement a new power management solution. More importantly, the need for a new I/O die is driven by the need for updated memory controllers that support higher DDR5 speeds and are compatible with new kinds of DIMMs, such as CUDIMMs, RDIMMs with RCDs, etc.
Sources: The Bell, Jukanlosreve (Twitter)

65 Comments on AMD to Build Next-Gen I/O Dies on Samsung 4nm, Not TSMC N4P

#26
dgianstefani
TPU Proofreader
Fouquin: Here's a good one: fixed function and analog logic gates and their inability to benefit greatly from process density scaling.
It's also where the GPU is, so it's not like it's all just fixed function. Besides, Samsung processes are notorious for being comparatively bad, especially what will be a 4 year old process by the time of this product release, and memory latency and IO die power draw/waste is the pain point of these chiplet processors, so it's frustrating to see them seem to sideline it again.

Hopefully rumours are true and the packaging takes a huge jump to mitigate the same issue Zen has had since its first generation.
Posted on Reply
#27
R-T-B
dgianstefani: AMD and skimping on IO die. Name a better duo.
ikr? Right when the thing hurting them the most is the IMC.
Fouquin: Here's a good one: fixed function and analog logic gates and their inability to benefit greatly from process density scaling.
That may be true in which case maybe this move could work out. I just hope the IMC gets updated to be at least... decent.
Posted on Reply
#28
alwayssts
R0H1T: Are you forgetting the best selling mobile chips on the planet? SD Elite & Mediatek's 9400 are also on N3E currently? They're not going to stop selling those chips any time soon, in fact there's still new phones/tablets being launched with SD gen 2/3 & the 9300 as well.
Well, those are mobile, but you're right. I'm talking about larger chips (that are not the 'M') lagging behind one node. They start producing when the smaller chips move to the next node and free production up.
It isn't *just* Apple, but they're kind of made for and dictated by Apple, so I guess that's why I say that. You're right it's not *just* them, though. There are other mobile chips (that generally flip nodes at same time).

SDEG2 is N3P (they tried Samsung but apparently it sucked too much and/or couldn't yield). SDEG3 they *want* to be on Samsung 2nm, but like I said, I think everyone is waiting to see if Samsung can do it.

Did we say this about 3nm? Yes we did. Were they able to do it? No they weren't. Will it be different this time? I don't know. :p
Posted on Reply
#29
dgianstefani
TPU Proofreader
R-T-B: ikr? Right when the thing hurting them the most is the IMC.

That may be true in which case maybe this move could work out. I just hope the IMC gets updated to be at least... decent.
The IMC is fine, it's the disaggregated nature of the chip that hurts it. ARL has the same issue where for some reason they put the memory controller on a different tile, so you can get insane (+25%) gaming performance improvements from tuning interconnect speeds along with the usual stuff. On Zen, chiplets have to talk to each other and the RAM via the IF and the IO die, not directly. Hence the monolithic APUs like 8000 series doing so much better memory and latency wise. Fingers crossed for better packaging but that costs more so...
Posted on Reply
#30
R-T-B
dgianstefani: The IMC is fine
It might be if it could do 4 DIMMs. I doubt it could do that well even in monolithic form.
Posted on Reply
#31
dgianstefani
TPU Proofreader
R-T-B: It might be if it could do 4 DIMMs. I doubt it could do that well even in monolithic form.
Why do 4 DIMMs when you can get 96 GB from two?

More than that and you're better off with HPC anyway.
Posted on Reply
#32
R-T-B
dgianstefani: Why do 4 DIMMs when you can get 96 GB from two?

More than that and you're better off with HPC anyway.
Because 256GBs will soon be possible with 4?

Don't act like use cases for >128GBs don't exist. I know there is HEDT but still, options. Options should be there.

I'm running 128GBs right now for one very high ram use case.
Posted on Reply
#33
dgianstefani
TPU Proofreader
R-T-B: Because 256GBs will soon be possible with 4?

Don't act like use cases for >128GBs don't exist. I know there is HEDT but still, options. Options should be there.

I'm running 128GBs right now for one very high ram use case.
I feel like people who actually need 256 GB of memory can afford the $1k+ Threadrippers, and it's not really a priority for consumers. Besides, CUDIMMS exist which target this exact problem, so if you need that much memory Intel is the better choice at the moment considering ARL is very competitive for non gaming stuff.
Posted on Reply
#34
R-T-B
dgianstefani: I feel like people who actually need 256 GB of memory can afford the $1k+ Threadrippers
I can but you are again acting like options are a bad thing. I want my gaming and work rig to be the same box. Cost is not the issue here.
dgianstefani: CUDIMMS
Unsupported by current Zen5 IMC. Seeing the problem yet? Saying "go intel" is not exactly great for AMD.
Posted on Reply
#35
dgianstefani
TPU Proofreader
R-T-B: I can but you are again acting like options are a bad thing. I want my gaming and work rig to be the same box. Cost is not the issue here.
I'm not, I'm just aware that 99% of people on AM5 have zero need for 256 GB.
R-T-B: Unsupported by current Zen5 IMC. Seeing the problem yet?
Yes, AMD is unsurprisingly behind Intel for RAM support as usual, hopefully this will change with Zen 6.

From what I understand they could add support via a BIOS update, they just haven't. Perhaps there's not much need considering the promised 64 GB CUDIMMs haven't materialised anyway.

But again, the MT cap isn't limited by the IMC, at least not for 2 DIMM, it's limited by IF having to stay in sync, which is capped by the chiplet design and cheap packaging.
Posted on Reply
#37
LittleBro
TSMC N3 is fully booked, upcoming N2 is extremely expensive and booked mostly by Apple and Nvidia, maybe Intel booked something in case their 18A sucks. TSMC N4 (enhanced N5) is utilized by Nvidia for RTX 5000 and Hopper, among other things. Intel 4 is in fact 7 nm, a bit denser than TSMC N6, though not a significant enough leap to make AMD steer that way. So really no other choice but to go with Sammy if TSMC is too expensive. Sammy's N4 has less than 5% lower transistor density than TSMC's N4, while achieving about 20% higher density than TSMC's N6.

Sure, TSMC N4 would be a better choice.

Zen 6 12-core CCD is reportedly gonna be made using TSMC N3 process, which should allow for a size similar to that of the 12 cores in Strix Point, already made using TSMC N4. TSMC N3 has about 55% higher density than TSMC N4, which is something that should allow AMD to achieve 12 full Zen 6 cores (incl. 48 MB L3 cache) per CCD. Anyway, AMD really needs more cores per CCD to stay competitive, as Intel keeps raising cores.
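
As a rough sanity check on that density arithmetic, here is a minimal sketch. It assumes that transistor count scales linearly with core count and that the whole die shrinks at the quoted 55% figure, which is optimistic because SRAM and analog blocks scale less than logic. The helper name `relative_area` and the constant are hypothetical, used only for illustration.

```python
def relative_area(transistor_ratio: float, density_ratio: float) -> float:
    """Return new die area relative to the old one.

    area = transistors / density, so the ratio of areas is the
    transistor growth divided by the density growth.
    """
    if density_ratio <= 0:
        raise ValueError("density_ratio must be positive")
    return transistor_ratio / density_ratio


# Assumption: the "about 55% higher density" figure quoted in the comment,
# applied uniformly to the whole die (not a measured value).
N3_OVER_N4_DENSITY = 1.55

# The same design ported from N4 to N3 would need roughly 65% of the area.
same_design = relative_area(1.0, N3_OVER_N4_DENSITY)

# Growing from 8 to 12 cores (1.5x the transistors, by assumption) on N3
# would land at roughly the same area as the 8-core design on N4.
twelve_vs_eight = relative_area(12 / 8, N3_OVER_N4_DENSITY)

print(f"same design on N3:       {same_design:.3f}x area")
print(f"12 cores on N3 vs 8 on N4: {twelve_vs_eight:.3f}x area")
```

Under these assumptions a 12-core CCD on N3 comes out at about the same area as an 8-core CCD on N4, which is consistent with the claim that the node change is what makes 12 cores per CCD practical.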

If AMD stays with 8 cores per CCD in Zen 6, that's gonna be a total disaster. Since Zen 5 is already an AVX-512 focused generation with improved caches and prediction algorithms, I can't think of anything else but increasing cores with Zen 6 to increase performance enough.

Posted on Reply
#38
N/A
kondamin: Might want to add some extra words
The IOD is 27 Mtr/mm² and far from any stated density. I doubt SF4 will change things much. It's just porting the same die with some retooling, interconnect pitch 34 to 32 nm. Earth shattering.
alwayssts: Thanks for the refresher on the IOD density...I'll have to figure out exactly what that translates to for analog (vs other processes) when I have time.

Are you curious about density of 5070? I figure it's bc it's aimed to clock higher (by relaxing density). I've been trying to explain that to people...prolly a special child...and likely a design test for clocks on 3nm.
Oh intel does that all the time, with their plus++ nodes, it's a learning curve
Posted on Reply
#39
R0H1T
N/A: Oh intel does that all the time, with their plus++ nodes, it's a learning curve
That's not necessarily true, they went that route only because of the disasters that were 14nm & 10nm. Before that till 22nm (delayed by a few quarters) they had max 2 gens on the same node for probably 2(3?) decades.
Posted on Reply
#40
R-T-B
R0H1T: Tbf it's better you go full ECC with those capacities, unless you're already doing that? In which case high speed RAM is not an option, the bigger issue with zen 5 is the hobbled IF ~
Who said anything about high speed ram? I'm running DDR5-4000 right now and it's actually not that bad because it's in sync with the IF. The real issue is the ~15 min training times to do this. That's painfully bad.
R0H1T: Tbf it's better you go full ECC with those capacities
Yeah that probably is where I will eventually go but that tends to be even harder on the IMC, so ouch again.
Posted on Reply
#41
R0H1T
Comes with the territory, I have 64GB & 128GB setups on x570 & I just do manual timings almost always these days. With the way zen5 turned out on desktops it's almost like a side project to the showrunners EPYC & Strix Halo. Which is to say AMD didn't try to make it too good, or in fact in order to push segmentation (like you know who) they deliberately(?) gimped it a bit too much.

You can't seriously believe they'd make something to threaten low end TR or EPYC do you?
Posted on Reply
#42
joseLopez
It seems a bit reckless to me to use Samsung instead of TSMC.
Posted on Reply
#43
3valatzy
joseLopez: It seems a bit reckless to me to use Samsung instead of TSMC.
AMD must move all of the chips to Samsung - better, cheaper and available..
Posted on Reply
#44
Fouquin
dgianstefani: It's also where the GPU is, so it's not like it's all just fixed function.
The GPU CUs that can scale on the current I/O die are approximately 1/40th of the die, the rest of the GPU logic is supporting logic and can certainly scale but with varying degrees of success. The remainder of the die is predominantly low density supporting logic; that's why it's relegated to the I/O die. Remember that this was the point of the chiplet transition in the first place; isolate the scalable logic from the non-scalable logic so that the scalable logic can benefit from bleeding edge process improvements and not languish being tied to the fixed and analog logic.

I agree that hopefully they have some better packaging because the current limits on bandwidth are dashing the advantage AMD has in latency for bursty workloads.
Posted on Reply
#45
Assimilator
All I want is more than 24 PCIe lanes. If AMD gives us that in the consumer space, they can fab their IOD from moon dust for all I care. You democratised cores with Zen AMD, now democratise PCIe lanes with Zen 6.
Posted on Reply
#46
Fishymachine
Is that graph comparing some cache and IO-rich Apple A chip, to the hypothetical pure logic design that fabs' PR always seem to find? In which case N4E should be somewhere near 190 (vanilla N5 is 178, with shit like N31 GCD clocking in at 150 MTr/mm² in a real world product)
Posted on Reply
#47
R-T-B
R0H1T: I just do manual timings almost always these days.
This won't save you from link training times on AM5.
R0H1T: You can't seriously believe they'd make something to threaten low end TR or EPYC do you?
No, but I can believe they'd compete with clientside Intel which is spanking them in this area. It's not like I am asking for more cores, threads, pcie lanes or memory channels.

I'm not complaining. Honestly AM5 still does what I need. But I do see spots for improvement.
Posted on Reply
#48
Denver
user556: The SRAM cell size and interconnect pitch are both similar to TSMC's N6. Those are significant factors for density. But it's the speed/power profile that will really be the telling in the end, and that is not listed.
This is not the kind of chip that needs a lot of cache, so the density will be very close to 4nm TSMC. Also, the data on 6nm is terribly wrong.
Posted on Reply
#49
Carillon
Assimilator: All I want is more than 24 PCIe lanes. If AMD gives us that in the consumer space, they can fab their IOD from moon dust for all I care. You democratised cores with Zen AMD, now democratise PCIe lanes with Zen 6.
If zen6 is on AM5 the best they can do is make the chipset pcie5.

If what they said about strix halo having insane IF bandwidth is true (I presume just doubling the connections) AMD could be doing the same with this rumored 4nm IOD.
Posted on Reply
#50
R0H1T
R-T-B: This won't save you from link training times on AM5.

No, but I can believe they'd compete with clientside Intel which is spanking them in this area. It's not like I am asking for more cores, threads, pcie lanes or memory channels.

I'm not complaining. Honestly AM5 still does what I need. But I do see spots for improvement.
Maybe, hopefully with zen6 at least. I'm not getting AM5 anytime soon & the way AMD is doing things these days they'll probably cede a lot of ground to Apple. The reason people still go with AMD is of course "PC" and the ability to upgrade with substantial improvements in say two generations. But if it's that much of a headache then going with fruity loops is not a bad option. AMD is trashing a lot of goodwill they garnered post Zen rather quickly. Then there's QC & maybe Nvidia that could compete with them on desktops in a year or two, the window to eff so many things is also insanely large.
Posted on Reply