News Posts matching #ROCm


AMD Leads High Performance Computing Towards Exascale and Beyond

At this year's International Supercomputing 2021 digital event, AMD (NASDAQ: AMD) is showcasing momentum for its AMD EPYC processors and AMD Instinct accelerators across the High Performance Computing (HPC) industry. The company also outlined updates to the ROCm open software platform and introduced the AMD Instinct Education and Research (AIER) initiative. The latest Top500 list showcased the continued growth of AMD EPYC processors in HPC systems: AMD EPYC processors power nearly five times as many systems as on the June 2020 list, and more than double the number from November 2020. In addition, AMD EPYC processors power half of the 58 new entries on the June 2021 list.

"High performance computing is critical to addressing the world's biggest and most important challenges," said Forrest Norrod, senior vice president and general manager, data center and embedded systems group, AMD. "With our AMD EPYC processor family and Instinct accelerators, AMD continues to be the partner of choice for HPC. We are committed to enabling the performance and capabilities needed to advance scientific discoveries, break the exascale barrier, and continue driving innovation."

AMD Announces CDNA Architecture. Radeon MI100 is the World's Fastest HPC Accelerator

AMD today announced the new AMD Instinct MI100 accelerator - the world's fastest HPC GPU and the first x86 server GPU to surpass the 10 teraflops (FP64) performance barrier. Supported by new accelerated compute platforms from Dell, Gigabyte, HPE, and Supermicro, the MI100, combined with AMD EPYC CPUs and the ROCm 4.0 open software platform, is designed to propel new discoveries ahead of the exascale era.

Built on the new AMD CDNA architecture, the AMD Instinct MI100 GPU enables a new class of accelerated systems for HPC and AI when paired with 2nd Gen AMD EPYC processors. The MI100 offers up to 11.5 TFLOPS of peak FP64 performance for HPC and up to 46.1 TFLOPS peak FP32 Matrix performance for AI and machine learning workloads. With new AMD Matrix Core technology, the MI100 also delivers a nearly 7x boost in FP16 theoretical peak floating point performance for AI training workloads compared to AMD's prior generation accelerators.
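The "nearly 7x" FP16 claim can be sanity-checked from published peak numbers. As an assumption for this sketch, the prior-generation baseline is taken as the MI50's 26.5 TFLOPS peak FP16, against the MI100's 184.6 TFLOPS peak FP16 Matrix throughput:

```python
# Sanity check of the "nearly 7x" FP16 claim.
# Assumed figures: MI50 peak FP16 = 26.5 TFLOPS (prior generation),
# MI100 peak FP16 Matrix = 184.6 TFLOPS.
mi50_fp16_tflops = 26.5
mi100_fp16_matrix_tflops = 184.6

speedup = mi100_fp16_matrix_tflops / mi50_fp16_tflops
print(f"FP16 speedup: {speedup:.2f}x")  # ~6.97x, i.e. "nearly 7x"
```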

AMD Radeon "Navy Flounder" Features 40CU, 192-bit GDDR6 Memory

AMD uses offbeat codenames such as "Great Horned Owl," "Sienna Cichlid" and "Navy Flounder" to identify sources of leaks internally. One such upcoming product, codenamed "Navy Flounder," is shaping up to be a possible successor to the RX 5500 XT, the company's segment-leading 1080p product. According to ROCm compute code fished out by stblr on Reddit, this GPU is configured with 40 compute units, a step up from the 22 on the RX 5500 XT, and features a 192-bit wide GDDR6 memory interface.

Assuming the RDNA2 compute unit on next-gen Radeon RX graphics processors has the same number of stream processors per CU as RDNA (64), we're looking at 2,560 stream processors for "Navy Flounder," compared to 5,120 across the 80 CUs of "Sienna Cichlid." The 192-bit wide memory interface gives AMD's product managers a high degree of segmentation for graphics cards under the $250 mark.
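Under the assumption above that RDNA2 keeps RDNA's 64 stream processors per compute unit, the totals work out as follows:

```python
# Hypothetical stream-processor totals, assuming 64 SPs per CU as on RDNA.
SP_PER_CU = 64

def stream_processors(compute_units: int) -> int:
    """Total stream processors for a GPU with the given CU count."""
    return compute_units * SP_PER_CU

print(stream_processors(40))  # "Navy Flounder": 2560
print(stream_processors(80))  # "Sienna Cichlid": 5120
```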

AMD Announces Radeon Pro VII Graphics Card, Brings Back Multi-GPU Bridge

AMD today announced its Radeon Pro VII professional graphics card targeting 3D artists, engineering professionals, broadcast media professionals, and HPC researchers. The card is based on AMD's "Vega 20" multi-chip module that incorporates a 7 nm (TSMC N7) GPU die, along with a 4096-bit wide HBM2 memory interface, and four memory stacks adding up to 16 GB of video memory. The GPU die is configured with 3,840 stream processors across 60 compute units, 240 TMUs, and 64 ROPs. The card is built in a workstation-optimized add-on card form-factor (rear-facing power connectors and lateral-blower cooling solution).

What separates the Radeon Pro VII from last year's Radeon VII is full double-precision floating point support: FP64 runs at 1:2 the FP32 rate, whereas the Radeon VII is locked to 1:4. Specifically, the Radeon Pro VII offers 6.55 TFLOPS of double-precision floating point performance (vs. 3.36 TFLOPS on the Radeon VII). Another major difference is the physical Infinity Fabric bridge interface, which lets you pair up to two of these cards in a multi-GPU setup to double the memory capacity to 32 GB. Each GPU has two Infinity Fabric links, running at 1333 MHz, with a per-direction bandwidth of 42 GB/s. This brings the total bidirectional bandwidth to a whopping 168 GB/s, more than twice the PCIe 4.0 x16 limit of 64 GB/s.
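The quoted figures are internally consistent, as a quick check shows (the 1:2 FP64 rate and the two-link Infinity Fabric numbers are taken from the paragraph above):

```python
# Consistency check of the Radeon Pro VII figures quoted above.
fp64_tflops = 6.55
fp32_tflops = fp64_tflops * 2      # 1:2 FP64:FP32 rate -> ~13.1 TFLOPS FP32

links = 2                          # Infinity Fabric links per GPU
per_direction_gbps = 42            # GB/s per direction per link
total_bidirectional = links * per_direction_gbps * 2  # both directions
print(total_bidirectional)         # 168 GB/s, > 2x PCIe 4.0 x16's 64 GB/s
```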

AMD Scores Another EPYC Win in Exascale Computing With DOE's "El Capitan" Two-Exaflop Supercomputer

AMD has been on a roll in consumer, professional, and exascale computing environments, and it has just snagged another hugely important contract. The US Department of Energy (DOE) has announced the winner of its next-gen exascale supercomputer contract, a machine that aims to be the world's fastest. Dubbed "El Capitan", the new supercomputer will be powered by AMD's next-gen EPYC "Genoa" processors (Zen 4 architecture) and Radeon GPUs. This is the first such exascale contract where AMD is the sole purveyor of both CPUs and GPUs; the company's other EPYC design win, in the Cray Shasta, pairs its processors with NVIDIA graphics cards.

El Capitan represents a $600 million investment, to be deployed in late 2022 and operational in 2023. Next-gen proposals from AMD, Intel and NVIDIA were undoubtedly all presented, with AMD winning the shootout in a big way. While the DOE initially projected El Capitan to deliver some 1.5 exaflops of computing power, it has now revised its performance goal to a full 2-exaflop machine. El Capitan will thus be ten times faster than the current leader of the supercomputing world, Summit.

AMD Announces Mini PC Initiative, Brings the Fight to Intel in Yet Another Product Segment

AMD is wading into even deeper waters across Intel's markets with the announcement of new Mini PCs powered by the company's AMD Ryzen Embedded V1000 and R1000 processors. Multiple partners such as ASRock Industrial, EEPD, OnLogic and Simply NUC have already designed their own takes on Mini PCs (comparable to Intel's NUC, Next Unit of Computing), giving businesses a small form factor box for different computing needs. These aim to offer a high-performance CPU/GPU processor with expansive peripheral support, in-depth security features and a planned 10-year processor availability.

Until now, AMD's Ryzen Embedded product line had mostly scored the odd design win here and there, powering handheld consoles such as the Smach Z and other low-power, relatively high-performance environments. When AMD announced the R1000 SoC back in April, it said partners would be bringing their own takes on the underlying silicon, and today's announcement is the result of that effort.

It Can't Run Crysis: Radeon Instinct MI60 Only Supports Linux

AMD recently announced the Radeon Instinct MI60, a GPU-based data-center compute processor with hardware virtualization features. It takes the crown for "the world's first 7 nm GPU." The company also put out specifications of the "Vega 20" GPU it's based on: 4,096 stream processors, a 4096-bit HBM2 memory interface, 1800 MHz engine clock speed, 1 TB/s memory bandwidth, 7.4 TFLOP/s peak double-precision (FP64) performance, and the works. Here's the kicker: the company isn't launching this accelerator with Windows support. At launch, AMD is only releasing x86-64 Linux drivers, with API support for OpenGL 4.6, Vulkan 1.0, and OpenCL 2.0, along with AMD's ROCm open ecosystem. The lack of display connectors already disqualifies this card for most workstation applications, but with no Windows support it is also the most expensive graphics card that "can't run Crysis." AMD could release Radeon Pro branded graphics cards based on "Vega 20," which would ship with Windows and macOS drivers.
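The 7.4 TFLOP/s FP64 figure follows from the quoted shader count and clock, assuming one fused multiply-add (two FLOPs) per stream processor per clock and the 1:2 FP64 rate of "Vega 20":

```python
# Back-of-the-envelope peak-throughput check for "Vega 20" (Radeon Instinct MI60).
STREAM_PROCESSORS = 4096
ENGINE_CLOCK_GHZ = 1.8
FLOPS_PER_SP_PER_CLOCK = 2  # one fused multiply-add per clock

fp32_peak = STREAM_PROCESSORS * ENGINE_CLOCK_GHZ * FLOPS_PER_SP_PER_CLOCK / 1000  # TFLOP/s
fp64_peak = fp32_peak / 2   # "Vega 20" runs FP64 at half the FP32 rate

print(f"FP32 peak: {fp32_peak:.2f} TFLOP/s")  # ~14.75
print(f"FP64 peak: {fp64_peak:.2f} TFLOP/s")  # ~7.37, matching the quoted 7.4
```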

AMD Expands EPYC Availability, Introduces ROCm 1.7 With TensorFlow Support

AMD has been steadily increasing output and availability of its latest take on the server market, the EPYC CPUs. These are 32-core, 64-thread monsters that deliver a better feature set in 1P configurations than even some of Intel's 2P setups, and reception for these AMD processors has been warm as a result. The use of an MCM design to create a 4-way cluster of small 8-core processor dies has allowed AMD to improve yields with minimal retooling and changes to its manufacturing lines, which in turn has increased profits for a company that sorely needed a breakout product.