Thursday, January 5th 2023

AMD Shows Instinct MI300 Exascale APU with 146 Billion Transistors
During its CES 2023 keynote, AMD announced its latest Instinct MI300 APU, a first of its kind in the data center world. Combining the CPU, GPU, and memory elements in a single package eliminates the latency imposed by data traveling long distances from CPU to memory and from CPU to GPU over the PCIe connector. Besides reducing latency, moving data over shorter distances also requires less power, improving efficiency. The Instinct MI300 features 24 Zen 4 cores with simultaneous multi-threading enabled, CDNA 3 GPU IP, and 128 GB of HBM3 memory on a single package. The memory bus is 8192 bits wide, providing unified memory access for CPU and GPU cores. CXL 3.0 is also supported, making cache-coherent interconnects a reality.
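To put the 8192-bit memory bus in perspective, the theoretical bandwidth is just bus width times per-pin data rate. AMD has not disclosed the MI300's actual HBM3 clocks, so the 5.2 Gbps per-pin rate below is an assumption (a common HBM3 speed grade), not a confirmed spec:

```python
# Back-of-the-envelope bandwidth for an 8192-bit HBM3 bus.
# DATA_RATE_GBPS is an assumed per-pin transfer rate; AMD has not
# published the MI300's actual memory clocks.
BUS_WIDTH_BITS = 8192
DATA_RATE_GBPS = 5.2  # assumption: typical HBM3 speed grade

bandwidth_gbit_s = BUS_WIDTH_BITS * DATA_RATE_GBPS  # gigabits per second
bandwidth_gbyte_s = bandwidth_gbit_s / 8            # gigabytes per second
print(f"{bandwidth_gbyte_s / 1000:.1f} TB/s")       # ~5.3 TB/s at the assumed rate
```

At that assumed rate the package would offer on the order of 5 TB/s of memory bandwidth, shared coherently between the CPU and GPU cores.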
The Instinct MI300 APU package is an engineering marvel in its own right, built with advanced chiplet techniques. AMD uses 3D stacking: nine 5 nm logic chiplets sit on top of four 6 nm chiplets, with HBM stacks surrounding them. All of this brings the transistor count to 146 billion, reflecting the sheer complexity of such a design. For performance figures, AMD provided a comparison to the Instinct MI250X GPU. In raw AI performance, the MI300 delivers an 8x improvement over the MI250X, while the performance-per-watt gain is a smaller but still substantial 5x. While we do not know which benchmark applications were used, it is likely that standard benchmarks such as MLPerf were involved. For availability, AMD targets the end of 2023, when the "El Capitan" exascale supercomputer will arrive using these Instinct MI300 APU accelerators. Pricing is unknown and will likely be disclosed to enterprise customers first, around launch.
44 Comments on AMD Shows Instinct MI300 Exascale APU with 146 Billion Transistors
Shall we discuss the XTX overheat issue next? How about we leave dGPU talk to threads that suit it?!
Not enterprise, which is showing up your naivety.
The 3D V-Cache's power use, plus the heat from the silicon below it, doesn't make a lot of difference to temperature because of how it is manufactured and the die selection that has been made. Also, the 3D V-Cache is designed to sit directly over the CPU's cache; it can't just be placed anywhere. So any other product on which AMD wants to stack 3D V-Cache has to be designed that way from the start, and the cache die has to be designed to fit the die below it. Are the MCDs on their new GPUs the same as the ones for the Zen CPUs? I doubt it.
As for the latency point, I don't see how putting the 3D V-Cache on the underside of the CPU die would reduce latency. If anything, apart from being a wiring nightmare, I think it would add latency, since the processor die would sit slightly further from the substrate and thus from everything it communicates with (RAM and I/O). I imagine the difference of a few microns would be minimal, and if so, it would also be minimal with the cache underneath the compute dies ¯\_(ツ)_/¯
So, please answer me: what possible bankruptcy? Be clever with your response, please; get out of the 10-year-old mindset and subsequent comments.
That's a very clever way of disguising a 60% increased TDP.
Aurora was first announced in 2015 and was to be finished in 2018.
2017, delayed to 2021 but scaled up to 1 exaFLOP (with Ponte Vecchio).
October 2020, delayed another 6 months.
October 2021, Intel took a $400M loss on the books, and Aurora is now targeted for 2023 and 2 exaflops. I am curious to see AMD's reply as to how this chip works in multi-socket configs...
It looks... like it's 2-socket max. It looks like 4 GPU chiplets with 4x8 Zen 4 chiplets between the HBM stacks.
Taking a node from 4x 560 W to 2x maybe 700-900 W? Unclear if it will need a parent CPU, as each APU has 96 Zen 4 cores onboard.
Also, the block diagram is notably different from the chip in hand and pictured.
For all of you (ARF) that clearly missed it, AMD was very proudly showing off their insanely profitable Xilinx wing, under the new AMD XDNA branding.
That new FPGA decimates the inference card by NVIDIA in the same class... AMD is taking advantage of NVIDIA abandoning the 75 W slot.
www.xilinx.com/content/dam/xilinx/publications/product-briefs/alveo-v70-product-brief.pdf
NVIDIA replaced the T4 with the A2, and it's barely better. www.nvidia.com/en-us/data-center/tesla-t4/ www.nvidia.com/en-us/data-center/products/a2/
Left room for AMD.
There is more growth opportunity with higher returns in the datacenter.
Look at NVIDIA. In Q3 2022 they pulled in $3.8B in datacenter revenue, up 31% YoY, while gaming was $1.6B, down 51% YoY. Looking at NVIDIA's stack, it is geared for datacenter. Even watching NVIDIA's gaming GPU releases, there is far more focus on everything outside gaming. But gaming gets to benefit from a lot of the features they primarily sell for enterprise.