Thursday, August 6th 2020
Intel Readies Atom "Grand Ridge" 24-core Processor, Features PCIe 4.0 and DDR5
Intel is monetizing its "small" x86 cores across its product lineup, and not just in entry-level client processors. These cores will be part of Intel's current- and upcoming Hybrid processors, and have been serving Intel's re-branded Atom line of high core-count low-power server processors targeting micro-servers, NAS, network infrastructure hardware, and cellular base-stations. A company slide scored by AdoredTV unveils Intel's Atom "Grand Ridge" 24-core processor. A successor to the 24-core Atom P5962B "Snow Ridge" processor built on 10 nm and featuring "Tremont" CPU cores, "Grand Ridge" sees the introduction of the increased IPC "Gracemont" CPU cores to this segment. These cores make their debut in 2021 under the "Alder Lake" microarchitecture as "small" cores.
The "Grand Ridge" silicon is slated to be built on Intel's 7 nm HLL+ silicon fabrication node, and features 24 "Gracemont" cores across six clusters with four cores, each. Each cluster shares a 4 MB L2 cache among the four cores, while a shared L3 cache of unknown size cushions transfers between the six clusters. Intel is deploying its SCF (scalable coherent fabric) interconnect between the various components of the "Grand Ridge" SoC. Besides the six "Gracemont" clusters, the "Grand Ridge" silicon features a 2-channel DDR5 integrated memory controller, and a PCI-Express gen 4.0 root complex that puts out 16 lanes. It also features fixed function hardware that accelerates network stack processing. There are various USB and GPIO connectivity options relevant to 5G base-station setups. Given Intel's announcement of a delay in rolling out its 7 nm node, "Grand Ridge" can only be expected in 2022, if not later.
Sources:
AdoredTV (YouTube), VideoCardz
The "Grand Ridge" silicon is slated to be built on Intel's 7 nm HLL+ silicon fabrication node, and features 24 "Gracemont" cores across six clusters with four cores, each. Each cluster shares a 4 MB L2 cache among the four cores, while a shared L3 cache of unknown size cushions transfers between the six clusters. Intel is deploying its SCF (scalable coherent fabric) interconnect between the various components of the "Grand Ridge" SoC. Besides the six "Gracemont" clusters, the "Grand Ridge" silicon features a 2-channel DDR5 integrated memory controller, and a PCI-Express gen 4.0 root complex that puts out 16 lanes. It also features fixed function hardware that accelerates network stack processing. There are various USB and GPIO connectivity options relevant to 5G base-station setups. Given Intel's announcement of a delay in rolling out its 7 nm node, "Grand Ridge" can only be expected in 2022, if not later.
28 Comments on Intel Readies Atom "Grand Ridge" 24-core Processor, Features PCIe 4.0 and DDR5
Tremont (and probably Gracemont) is out-of-order. software.intel.com/content/dam/develop/public/us/en/documents/64-ia-32-architectures-optimization-manual.pdf
You can see the renaming registers, as well as the new dual-decoder structure of Tremont. Two decoders operate out-of-order (!!) on this architecture.
------------
EDIT: I did a bit of research into this architecture. It seems like the point is to be a frontend for the Diamond Mesa chip (www.intel.it/content/www/it/it/products/programmable/asic/easic-devices/diamond-mesa-soc-devices.html).
Diamond Mesa will perform the heavy-work associated with routing or something. I don't really know. But the Atom is just there to interface with the ASIC. As such, the Atom doesn't really need to be fast, or realtime, at all. Its just Intel's lowest-energy chip. It does make me wonder why 24-cores are needed though.
However, what I described about the need for predictability is true, and that out of order execution makes the processor less predictable is also true. So to use any Atom past Silvermont, an embedded system would have to overcome that, or accept a certain percentage fault rate. This means that in many embedded control applications, Atom isn't the best - or at least is going to be difficult to get to fit into these scenarios.
There are many articles and papers on this.
www.es.mdh.se/pdf_publications/2840.pdf
"Furthermore, hardware used in real-time systems is steadily becoming more complex, including advanced computer architecture features such as caches, pipelines, branch prediction, and out-of-order execution. These features increase the speed of execution on average, but also makes the timing behavior much harder to predict, since the variation in execution time between fortuitious and worst cases increase. "
"Processor instruction timing has been getting increasingly variable over time, as features improving average-case performance and overall throughput are invented and put in use. Typically, performance improvements are achieved using various speculation and caching techniques, with the effect that the span between best-case and worst-case times increase. "