Tuesday, January 14th 2025
AMD Implements New CCD Connection in "Strix Halo" Ryzen AI Max Processors
Thanks to the informative breakdown by Chips and Cheese, we are learning that AMD's latest Ryzen AI processors for laptops, codenamed "Strix Halo," utilize a parallel "sea of wires" interconnect system between their chiplets, replacing the SERDES (serializer/deserializer) approach found in desktop Ryzen models. The processor's physical implementation consists of two Core Complex Dies (CCDs), each manufactured on TSMC's N4 (4 nm) process and containing up to eight Zen 5 cores with full 512-bit floating point units. Notably, the I/O die (IOD) is also produced using the N4 process, marking an advancement from the N6 (6 nm) process used in standard Ryzen IODs on desktops. The key change lies in the inter-chiplet communication system. While the Ryzen 9000 series (Granite Ridge) employs SERDES to convert parallel data to serial for transmission between chiplets, Strix Halo implements direct parallel data transmission through multiple physical connections.
This design achieves a throughput of 32 bytes per clock cycle and eliminates the latency overhead of serialization and deserialization. The parallel interconnect also removes the need for connection retraining during power state transitions, a limitation of SERDES implementations. However, this design choice necessitates additional substrate complexity due to increased connection density and requires more pins for external connections, suggesting possible modifications to the CCD design compared to desktop variants. AMD accepted the more complex substrate manufacturing this requires because the design targets lower latency and lower power consumption in data-intensive workloads, where consistent high-bandwidth communication between chiplets is crucial.
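To put the 32 bytes per clock figure in perspective, the short sketch below converts a per-clock width into bandwidth. The article does not state the fabric clock for Strix Halo, so the 2 GHz value here is purely an assumption for illustration, and the function name is hypothetical.

```python
def link_bandwidth_gb_per_s(bytes_per_clock: int, clock_hz: float) -> float:
    """Peak link bandwidth in GB/s (1 GB = 1e9 bytes) for a given width and clock."""
    return bytes_per_clock * clock_hz / 1e9


# 32 bytes per clock comes from the article.
BYTES_PER_CLOCK = 32
# ASSUMPTION: the fabric clock is not given in the article; 2 GHz is a placeholder.
ASSUMED_FABRIC_CLOCK_HZ = 2.0e9

if __name__ == "__main__":
    bandwidth = link_bandwidth_gb_per_s(BYTES_PER_CLOCK, ASSUMED_FABRIC_CLOCK_HZ)
    print(f"{bandwidth:.0f} GB/s per link at the assumed clock")
```

At the assumed 2 GHz this works out to 64 GB/s per link; the real figure scales linearly with whatever clock the fabric actually runs at.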
Sources:
Chips and Cheese, via HardwareLuxx
24 Comments on AMD Implements New CCD Connection in "Strix Halo" Ryzen AI Max Processors
I’m more concerned about the implications of requiring more socket pins. If that’s the case and Zen 6 uses a new IO die, we get:
1) New socket
or
2) Design compromises to meet current AM5 socket pinout.
Anyway, Zen 6 will have new IOD, finally.
*Socket FP11 has 2077 pins and AM5 has 1718 pins. Obviously the new AI Max processors support quad-channel memory, but I'm unsure how that alone directly affects the difference in socket pin count.
"However, this design choice necessitates additional substrate complexity due to increased connection density and requires more pins for external connections, suggesting possible modifications to the CCD design compared to desktop variants"
All I read was more dense and complex substrate for more parallel connections.
This will require changes to CCDs and IOD.
No word on socket pin count.
Probably the word "pins" is what throws you off.
But chiplets can have pins (connections) to the substrate.
Hopefully...
It always amazes me that communications swap back and forth between these two methodologies so much.
As Zach pointed out, those extra Strix Halo pins are required for additional memory channels, NPU and maybe GPU, or not?
There's no need to increase the pin count in order to implement a new IOD, because that's a CCD interconnection thing (a substrate thing).
If a new IOD required more pins, that would literally mean the end of AM5. However, AMD committed to keeping it alive until at least 2027.
I think we will see modified version of this Strix Halo's IOD in Zen 6.
45W is "desktop replacement laptop", very high power usage for laptops but technically doable.
50W is for breakfast...
40 CU x 64 = 2560 shaders
2560 shaders is between the 7600 XT and 7700 XT
That's why the whole thing needs to be on a 4 nm node
Won't be cheap either.
That’s why I listed two scenarios for Zen 6: one being a different socket, the other a modified design that could end up less efficient (data/speed-wise) in order to accommodate socket limitations.
I think the wording of the OP explicitly implies pin count changes; why would you mention “external” otherwise? Another question would be how many pins are currently not in use on AM5 that can be repurposed to accommodate potential changes.
I also wonder if they plan to make use of CUDIMMs at any point, as this route is infinitely better than the disaster/failure that is CAMM; it’s currently unclear if CUDIMM support will depend on just CPU architecture updates or also require motherboard/socket changes. I don’t know enough about this to really say.
AMD's IFOP ... not that we know many details about it but it seems to be a set of 40+32+1 serial interfaces. 40 bits from IOD to CCD, 32 bits from CCD to IOD, and clock, working at 8 GT/s with a 2 GHz FCLK clock.
chipsandcheese.com/p/pushing-amds-infinity-fabric-to-its
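As a rough sanity check on the figures in the comment above, the sketch below works out what 40 and 32 serial lanes at 8 GT/s would carry, both per second and per 2 GHz FCLK cycle. It assumes one bit per lane per transfer and ignores any encoding or protocol overhead, so these are raw numbers only; the function names are hypothetical.

```python
# Figures as quoted in the comment (not confirmed by AMD).
LANES_IOD_TO_CCD = 40
LANES_CCD_TO_IOD = 32
TRANSFER_RATE_GT_S = 8   # giga-transfers per second, per lane
FCLK_GHZ = 2             # fabric clock in GHz


def raw_gbytes_per_s(lanes: int, gt_s: int) -> float:
    """Raw one-direction bandwidth in GB/s, ASSUMING 1 bit per lane per transfer."""
    return lanes * gt_s / 8


def raw_bytes_per_fclk(lanes: int, gt_s: int, fclk_ghz: int) -> float:
    """Raw bytes moved in one direction during a single FCLK cycle."""
    transfers_per_cycle = gt_s / fclk_ghz
    return lanes * transfers_per_cycle / 8


if __name__ == "__main__":
    for name, lanes in (("IOD -> CCD", LANES_IOD_TO_CCD),
                        ("CCD -> IOD", LANES_CCD_TO_IOD)):
        gbs = raw_gbytes_per_s(lanes, TRANSFER_RATE_GT_S)
        per_clk = raw_bytes_per_fclk(lanes, TRANSFER_RATE_GT_S, FCLK_GHZ)
        print(f"{name}: {gbs:.0f} GB/s raw, {per_clk:.0f} bytes per FCLK cycle")
```

Under those assumptions the lanes carry 40 GB/s (20 bytes per FCLK cycle) toward the CCD and 32 GB/s (16 bytes per FCLK cycle) toward the IOD, before any overhead is subtracted.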