
AMD Introduces Instinct MI210 Data Center Accelerator for Exascale-class HPC and AI in a PCIe Form-Factor

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,590 (0.97/day)
AMD today announced a new addition to the Instinct MI200 family of accelerators. Officially titled the Instinct MI210 accelerator, this model is AMD's attempt to bring exascale-class technologies to mainstream HPC and AI customers. Based on the CDNA2 compute architecture built for heavy HPC and AI workloads, the card features 104 compute units (CUs), totaling 6,656 stream processors (SPs). With a peak engine clock of 1700 MHz, the card can output 181 TeraFLOPs of FP16 half-precision peak compute, 22.6 TeraFLOPs of peak FP32 single-precision compute, and 22.6 TeraFLOPs of peak FP64 double-precision compute. For single-precision matrix (FP32) compute, the card can deliver a peak of 45.3 TeraFLOPs. The INT4/INT8 precision settings provide 181 TOPs, while the MI210 can compute the bfloat16 precision format at a peak of 181 TeraFLOPs.
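The quoted throughput figures line up with a straightforward back-of-envelope calculation from the SP count and peak clock. The FLOPs-per-clock-per-SP values below (2 for FP64/FP32 via fused multiply-add, 4 for FP32 matrix, 16 for FP16/bfloat16) are assumptions inferred from the article's numbers, not official AMD figures:

```python
# Rough sanity check of the quoted peak-throughput figures.
STREAM_PROCESSORS = 6656   # 104 CUs x 64 SPs each
PEAK_CLOCK_GHZ = 1.7       # 1700 MHz peak engine clock

def peak_tflops(flops_per_clock_per_sp: int) -> float:
    """Peak TFLOPs = SPs x FLOPs/clock/SP x clock (GHz) / 1000."""
    return STREAM_PROCESSORS * flops_per_clock_per_sp * PEAK_CLOCK_GHZ / 1000

print(f"FP64/FP32 vector: {peak_tflops(2):.1f} TFLOPs")   # ~22.6
print(f"FP32 matrix:      {peak_tflops(4):.1f} TFLOPs")   # ~45.3
print(f"FP16/bfloat16:    {peak_tflops(16):.1f} TFLOPs")  # ~181.0
```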

The card uses a 4096-bit memory interface connecting 64 GB of HBM2e to the compute silicon. The total memory bandwidth is 1638.4 GB/s, with the memory modules running at a 1.6 GHz frequency. It is important to note that ECC is supported across the entire chip. AMD offers the Instinct MI210 as a PCIe solution based on the PCIe 4.0 standard. The card is rated for a TDP of 300 Watts and is cooled passively. Three Infinity Fabric links are enabled, each with a maximum bandwidth of 100 GB/s. Pricing is unknown; however, availability begins immediately on March 22nd.
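The 1638.4 GB/s figure follows directly from the bus width and memory clock, assuming the usual two transfers per clock for HBM2e's double data rate:

```python
# Sanity check on the quoted memory bandwidth:
# a 4096-bit bus at 1.6 GHz with two transfers per clock (DDR).
BUS_WIDTH_BITS = 4096
MEM_CLOCK_GHZ = 1.6
TRANSFERS_PER_CLOCK = 2

bandwidth_gbs = BUS_WIDTH_BITS / 8 * MEM_CLOCK_GHZ * TRANSFERS_PER_CLOCK
print(f"{bandwidth_gbs:.1f} GB/s")  # 1638.4
```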

AMD positions this card directly against the NVIDIA A100 80 GB accelerator in its targeted segment, with emphasis on half-precision and INT4/INT8-heavy applications.


View at TechPowerUp Main Site | Source
 
Joined
May 3, 2018
Messages
2,881 (1.20/day)
Gob-smacked an accelerator of this class is crippled for single and double precision floats. Clearly not designed for scientific work.
 
Joined
Nov 4, 2005
Messages
11,984 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400 MHz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Gob-smacked an accelerator of this class is crippled for single and double precision floats. Clearly not designed for scientific work.
A lot of preliminary AI/ML code is fine at low precision. Think of it as focusing a lens: low precision first, then a high-precision workload with lower performance requirements, once the major constraints have been narrowed by faster math.

Same way our human brains work: unless we focus on something, it's blurry background that can be quickly dealt with so we can move on.
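A toy sketch of that "focus the lens" idea is classic iterative refinement: do the bulk of the work in cheap low precision, then polish with a few high-precision correction steps. The problem setup below is illustrative, not tied to any real workload:

```python
# Iterative refinement for Ax = b: solve cheaply in FP32
# (where fast low-precision hardware helps), then correct in FP64.
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned system
b = rng.standard_normal(n)

# Cheap low-precision solve gets us close to the answer quickly.
x = np.linalg.solve(A.astype(np.float32),
                    b.astype(np.float32)).astype(np.float64)

# A few high-precision correction steps on the residual.
for _ in range(3):
    r = b - A @ x               # residual, computed in FP64
    x += np.linalg.solve(A, r)  # correction, computed in FP64

print(np.linalg.norm(b - A @ x))  # residual shrinks to near machine epsilon
```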
 
Joined
May 3, 2018
Messages
2,881 (1.20/day)
A lot of preliminary AI/ML code is fine at low precision. Think of it as focusing a lens: low precision first, then a high-precision workload with lower performance requirements, once the major constraints have been narrowed by faster math.

Same way our human brains work: unless we focus on something, it's blurry background that can be quickly dealt with so we can move on.
It's HPC also, not just AI. My definition of HPC is clearly different to yours.
 