News Posts matching #CDNA

Micron HBM Designed into Leading AMD AI Platform

Press Release by

Jun 13th, 2025 03:49 Discuss (0 Comments)

Micron Technology, Inc. today announced the integration of its HBM3E 36 GB 12-high offering into the upcoming AMD Instinct MI350 Series solutions. This collaboration highlights the critical role of power efficiency and performance in training large AI models, delivering high-throughput inference and handling complex HPC workloads such as data processing and computational modeling. Furthermore, it represents another significant milestone in HBM industry leadership for Micron, showcasing its robust execution and the value of its strong customer relationships.

Micron's HBM3E 36 GB 12-high solution brings industry-leading memory technology to AMD Instinct MI350 Series GPU platforms, providing outstanding bandwidth and lower power consumption. The AMD Instinct MI350 Series GPU platforms, built on AMD's advanced CDNA 4 architecture, integrate 288 GB of high-bandwidth HBM3E memory capacity, delivering up to 8 TB/s bandwidth for exceptional throughput. This immense memory capacity allows Instinct MI350 Series GPUs to efficiently support AI models with up to 520 billion parameters on a single GPU. In a full platform configuration, Instinct MI350 Series GPUs offer up to 2.3 TB of HBM3E memory and achieve peak theoretical performance of up to 161 PFLOPS at FP4 precision, with leadership energy efficiency and scalability for high-density AI workloads. This tightly integrated architecture, combined with Micron's power-efficient HBM3E, enables exceptional throughput for large language model training, inference and scientific simulation tasks, empowering data centers to scale seamlessly while maximizing compute performance per watt. This joint effort between Micron and AMD has enabled faster time to market for AI solutions.
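As a rough sanity check on the 520-billion-parameter claim, the sketch below estimates how much memory the model weights alone would need at several precisions. It assumes the claim refers to 4-bit weights, which the press release does not state, and it ignores KV cache, activations and runtime overhead. The function and constant names are illustrative only.

```python
# Rough weight-memory estimate. Assumption: only the weights are counted;
# KV cache, activations and framework overhead are ignored.
BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP6": 6, "FP4": 4}

HBM_CAPACITY_GB = 288  # per-GPU capacity quoted in the press release


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = BITS_PER_PARAM[precision] / 8
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * bytes_per_param


for precision in BITS_PER_PARAM:
    size = weight_footprint_gb(520, precision)
    verdict = "fits" if size <= HBM_CAPACITY_GB else "does not fit"
    print(f"520B parameters at {precision}: {size:.0f} GB, {verdict} in {HBM_CAPACITY_GB} GB")
```

Under these assumptions only the 4-bit case stays below 288 GB, which is consistent with the single-GPU claim being tied to FP4.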

AMD Instinct MI355X Draws up to 1,400 Watts in OAM Form Factor

AleksandarK

Jun 11th, 2025 09:07 Discuss (10 Comments)

Tomorrow evening, AMD will host its "Advancing AI" livestream to introduce the Instinct MI350 series, a new line of GPU accelerators designed for large-scale AI training and inference. First shown in prototype form at ISC 2025 in Hamburg just a day ago, each MI350 card features 288 GB of HBM3E memory, delivering up to 8 TB/s of sustained bandwidth. Customers can choose between the single-card MI350X and the higher-clocked MI355X or opt for a full eight-GPU platform that aggregates to over 2.3 TB of memory. Both chips are built on the CDNA 4 architecture, which now supports four different precision formats: FP16, FP8, FP6, and FP4. The addition of FP6 and FP4 is designed to boost throughput in modern AI workloads, where models of tomorrow with tens of trillions of parameters are trained on FP6 and FP4.

In half-precision tests, the MI350X achieves 4.6 PetaFLOPS on its own and 36.8 PetaFLOPS in eight-GPU platform form, while the MI355X surpasses those numbers, reaching 5.03 PetaFLOPS and just over 40 PetaFLOPS. AMD is also aiming to improve energy efficiency by a factor of thirty compared with its previous generation. The MI350X card runs within a 1,000 Watt power envelope and relies on air cooling, whereas the MI355X steps up to 1,400 Watts and is intended for direct-liquid cooling setups. That 400 Watt increase puts it right at NVIDIA's upcoming GB300 "Grace Blackwell Ultra" superchip, which is also a 1,400 W design. With memory capacity, raw computing, and power efficiency all pushed to new heights, the question remains whether real-world benchmarks will match these ambitious specifications. AMD now only lacks platform scaling beyond eight GPUs, which the Instinct MI400 series will address.
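The eight-GPU platform figures follow from multiplying the per-card numbers. Here is a minimal sketch of that arithmetic, using only figures quoted in this post. The dictionary layout and function name are illustrative, and the power total covers the GPUs only, which is an assumption about how the envelopes add up.

```python
GPUS_PER_PLATFORM = 8

# Per-GPU figures as quoted in the article.
PER_GPU = {
    "MI350X": {"half_precision_pflops": 4.6, "hbm_gb": 288, "power_w": 1000},
    "MI355X": {"half_precision_pflops": 5.03, "hbm_gb": 288, "power_w": 1400},
}


def platform_totals(spec: dict, gpus: int = GPUS_PER_PLATFORM) -> dict:
    """Scale per-GPU figures to a full platform (simple multiplication)."""
    return {
        "pflops": round(spec["half_precision_pflops"] * gpus, 2),
        "hbm_tb": spec["hbm_gb"] * gpus / 1000,         # decimal terabytes
        "gpu_power_kw": spec["power_w"] * gpus / 1000,  # GPUs only, no host
    }


for name, spec in PER_GPU.items():
    print(name, platform_totals(spec))
```

The results line up with the 36.8 PetaFLOPS, "just over 40 PetaFLOPS" and "over 2.3 TB" figures in the text.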

AMD Patents Provide Early UDNA Insights - "Blackwell-esque" Ray Tracing Performance Could be Achievable

T0@st

May 5th, 2025 09:25 Discuss (60 Comments)

Last September, AMD leadership publicly revealed UDNA—an "unforking" of previously separate enterprise and commercial GPU branches. Not long after this announcement, TechPowerUp's resident Serbian correspondent—AleksandarK—sat down with Team Red's Andrej Zdravkovic. The Chief Software Officer (and SVP) stated that a fair chunk of UDNA-related development work would be done by local engineers. Zdravkovic discussed this technology's eventual deployment in futuristic "AI PCs," but gamers have been salivating at the prospect of a proper successor to RDNA 4. A next-gen graphics architecture seeker—MrMPFR—has combed through official documents for any sign of UDNA preview material. The noted /Hardware subreddit member managed to distill their initial (very long) set of findings into an "easily digestible overview." They stated that this was just a small case of: "reporting and a little analysis on AMD's publicly available US patents filings," and other public-facing resources/archives.

Gleaned information included: "finalized architectural characteristics in future RDNA generations, AMD DXR IHV stacks (driver agnostic), and AMD sponsored titles. But please take everything with a grain of salt given my lack of professional expertise and experience with Real-time ray tracing (RTRT)". MrMPFR believes that Team Red started picking up former NVIDIA and Intel engineering talent, back in 2022/2023. In addition, a lot of new hires were apparently sourced from academic institutions. In theory, these newer team members have not had the time to make major inroads—in terms of getting finalized products out into the wild. MrMPFR reckons that noticeable contributions will accelerate AMD's making of "RDNA 6+/UDNA 2+," and beyond. Early 2025 leaks have pointed to the company collaborating with Sony; their "PlayStation 6" console is tipped to be powered by some fork of Team Red's "UDNA" graphics technology.

Oracle Plans to Use 30,000 AMD Instinct MI355X GPUs for AI Cloud

AleksandarK

Mar 11th, 2025 16:51 Discuss (15 Comments)

AMD's Instinct MI355X accelerators for AI workloads are gaining traction, and Oracle just became one of the bigger customers. In its latest financial results, Oracle disclosed that it had acquired 30,000 AMD Instinct MI355X accelerators. "In Q3, we signed a multi billion dollar contract with AMD to build a cluster of 30,000 of their latest MI355X GPUs," said Larry Ellison, adding: "And all four of the leading cloud security companies, CrowdStrike, Cybereason, Newfold Digital and Palo Alto, they all decided to move to the Oracle Cloud. But perhaps most importantly, Oracle has developed a new product called the AI data platform that enables our huge install base of database customers to use the latest AI models from OpenAI, xAI and Meta to analyze all of the data they have stored in their millions of existing Oracle databases. By using Oracle version 23 AI's vector capabilities, customers can automatically put all of their existing data into the vector format that is understood by AI models. This allows those AI models to learn, understand and analyze every aspect of your company or government agency, instantly unlocking the value in your data while keeping your data private and secure."

AMD's Instinct MI355X accelerator introduces the CDNA 4 architecture on TSMC's N3 process node with a focus on AI workload acceleration. The chiplet-based GPU delivers 2.3 petaflops of FP16 compute and 4.6 petaflops of FP8 compute, marking a 77% performance increase over the MI300X series. The MI355X's key advancement comes through support for reduced-precision FP4 and FP6 numerical formats, enabling up to 9.2 petaflops of FP4 compute. Memory specifications include 288 GB of HBM3E across eight stacks, providing 8 TB/s of total bandwidth. Production timelines place the MI355X's market entry in the second half of 2025, continuing AMD's annual cadence for data center GPU launches. By then, Oracle will likely have prepared data center space for these GPUs, ready to power them on as soon as AMD ships the accelerators.
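The quoted compute figures double each time the operand width halves, and the memory figures divide evenly across the eight stacks. The sketch below reproduces both relationships from the numbers in this post. The inverse-width scaling rule is an assumption that happens to match the FP16, FP8 and FP4 figures given here, and FP6 is left out because the post quotes no FP6 number.

```python
FP16_PFLOPS = 2.3    # quoted FP16 figure
HBM_TOTAL_GB = 288   # quoted total capacity
HBM_STACKS = 8       # quoted stack count
BANDWIDTH_TBS = 8.0  # quoted total bandwidth


def peak_pflops(bits: int, fp16: float = FP16_PFLOPS) -> float:
    """Assumption: peak throughput scales inversely with operand width."""
    return fp16 * 16 / bits


peaks = {f"FP{bits}": peak_pflops(bits) for bits in (16, 8, 4)}
per_stack_gb = HBM_TOTAL_GB / HBM_STACKS
per_stack_tbs = BANDWIDTH_TBS / HBM_STACKS

print(peaks)
print(f"{per_stack_gb:.0f} GB and {per_stack_tbs:.0f} TB/s per HBM3E stack")
```

The per-stack result of 36 GB matches the capacity of a 12-high HBM3E stack.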

AMD to Skip RDNA 5: UDNA Takes the Spotlight After RDNA 4

AleksandarK

Nov 19th, 2024 11:56 Discuss (63 Comments)

While the current generation of AMD graphics cards employs RDNA 3 at its core, and the upcoming RX 8000 series will feature RDNA 4, the latest leaks suggest RDNA 5 is not in development. Instead, UDNA will succeed RDNA 4, simplifying AMD's GPU roadmap. A credible source on the Chiphell forums, zhangzhonghao, reports that the UDNA-based RX 9000 series and Instinct MI400 AI accelerator will incorporate the same advanced Arithmetic Logic Unit (ALU) designs in both products, reminiscent of AMD's earlier GCN architectures before the CDNA and RDNA split. Sony's next-generation PlayStation 6 is also rumored to adopt UDNA technology. The PS5 and PS5 Pro currently utilize RDNA 2, while the Pro variant integrates elements of RDNA 4 for enhanced ray tracing. The PS6's CPU configuration remains unclear, but speculation revolves around Zen 4 or Zen 5 architectures.

The first UDNA gaming GPUs are expected to enter production by Q2 2026. Interestingly, AMD's RDNA 4 GPUs are anticipated to focus on entry-level to mid-range markets, potentially leaving high-end offerings until the UDNA generation. This strategic pause may allow AMD to refine AI-accelerated technologies like FidelityFX Super Resolution (FSR) 4, aiming to compete with NVIDIA's DLSS. This unification is inspired by NVIDIA's CUDA ecosystem, which supports cross-platform compatibility from laptops to high-performance servers. As AMD sees it, the decision addresses the challenges posed by maintaining separate architectures, which complicate memory subsystem optimizations and hinder forward and backward compatibility. Putting developer resources into RDNA 5 is not economically or strategically wise, given that UDNA is about to take over. Additionally, the company is enabling ROCm software support across all products ranging from consumer Radeon to enterprise Instinct MI. Accelerating software for one platform will translate to the entire product stack.

GIGABYTE Releases Servers with AMD EPYC 9005 Series Processors and AMD Instinct MI325X GPUs

Press Release by

Nomad76

Oct 10th, 2024 13:27 Discuss (0 Comments)

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced support for AMD EPYC 9005 Series processors with the release of new GIGABYTE servers alongside BIOS updates for some existing GIGABYTE servers using the SP5 platform. This first wave of updates supports over 60 servers and motherboards that customers can choose from that deliver exceptional performance for 5th Generation AMD EPYC processors. In addition, with the launch of the AMD Instinct MI325X accelerator, a newly designed GIGABYTE server was created, and it will be showcased at SC24 (Nov. 19-21) in Atlanta.

New GIGABYTE Servers and Updates
To cover all possible workload scenarios, from modular design servers to edge servers to enterprise-grade motherboards, these new solutions will ship already supporting AMD EPYC 9005 Series processors. The XV23-ZX0 is one of the many new solutions, and it is notable for its modularized server design using two AMD EPYC 9005 processors and supporting up to four GPUs and three additional FHFL slots. It also has 2+2 redundant power supplies on the front side for ease of access.

AMD Launches Instinct MI325X Accelerator for AI Workloads: 256 GB HBM3E Memory and 2.6 PetaFLOPS FP8 Compute

AleksandarK

Oct 10th, 2024 12:17 Discuss (13 Comments)

During its "Advancing AI" conference today, AMD updated its AI accelerator portfolio with the Instinct MI325X accelerator, designed to succeed its MI300X predecessor. Built on the CDNA 3 architecture, Instinct MI325X brings a suite of improvements over the old SKU. Now, the MI325X features 256 GB of HBM3E memory running at 6 TB/s bandwidth. The memory capacity alone is a 1.33x improvement over the old MI300X SKU, which features 192 GB of regular HBM3 memory. Providing more memory capacity is crucial as upcoming AI workloads are training models with parameter counts measured in trillions, as opposed to the billions of the current models we have today. When it comes to compute resources, the Instinct MI325X provides 1.3 PetaFLOPS at FP16 and 2.6 PetaFLOPS at FP8 training and inference. This represents a 1.3x improvement over the Instinct MI300.

A chip alone is worthless without a good platform, and AMD decided to make the Instinct MI325X OAM modules a drop-in replacement for the current platform designed for MI300X, as they are both pin-compatible. Systems packing eight MI325X accelerators offer 2 TB of HBM3E memory with 48 TB/s of memory bandwidth. Such a system achieves 10.4 PetaFLOPS of FP16 and 20.8 PetaFLOPS of FP8 compute performance. The company uses NVIDIA's H200 HGX as the reference for its competitive claims: it says an Instinct MI325X system outperforms an NVIDIA H200 HGX system by 1.3x in memory bandwidth and FP16 / FP8 compute performance, and by 1.8x in memory capacity.
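The eight-accelerator totals can be checked against the per-accelerator figures with a few lines of Python. This is only a consistency check on the numbers quoted in this post. The names are illustrative, and "2 TB" is interpreted here as 2048 GB, which is an assumption.

```python
import math

GPUS = 8

# Per-accelerator figures quoted in the article.
MI325X = {"hbm_gb": 256, "bandwidth_tbs": 6.0, "fp16_pflops": 1.3, "fp8_pflops": 2.6}

# Platform totals as stated in the article (2 TB taken as 2048 GB).
CLAIMED = {"hbm_gb": 2048, "bandwidth_tbs": 48.0, "fp16_pflops": 10.4, "fp8_pflops": 20.8}


def check_platform(per_gpu: dict, claimed: dict, gpus: int = GPUS) -> dict:
    """Map each metric to True when per-GPU value * gpus matches the claim."""
    return {
        key: math.isclose(per_gpu[key] * gpus, claimed[key], rel_tol=1e-3)
        for key in claimed
    }


results = check_platform(MI325X, CLAIMED)
capacity_ratio = MI325X["hbm_gb"] / 192  # versus the 192 GB MI300X

print(results)
print(f"Capacity ratio versus a 192 GB part: {capacity_ratio:.2f}x")
```

All four platform totals match simple multiplication by eight, and the capacity step from 192 GB to 256 GB works out to roughly 1.33x.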

Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

Exclusive by

AleksandarK

Sep 16th, 2024 01:04 Discuss (40 Comments)

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, who is the senior vice president and Chief Software Officer. Previously, we reported on AMD's transition to become a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan is spread over a three to five-year timeframe to improve its software ecosystem, accelerating hardware development to launch new products more frequently and to react to changes in software demand. AMD found that to help these expansion efforts, opening new design centers in Serbia would be very advantageous.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering at Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.

Here is the full interview:

Read full story

AMD to Unify Gaming "RDNA" and Data Center "CDNA" into "UDNA": Singular GPU Architecture Similar to NVIDIA's CUDA

AleksandarK

Sep 9th, 2024 10:50 Discuss (57 Comments)

According to new information from Tom's Hardware, AMD has announced plans to unify its consumer-focused gaming RDNA and data center CDNA graphics architectures into a single design called "UDNA." The announcement was made by AMD's Jack Huynh, Senior Vice President and General Manager of the Computing and Graphics Business Group, at IFA 2024 in Berlin. The goal of the new UDNA architecture is to provide a single focus point for developers so that each optimized application can run on consumer-grade GPUs like the Radeon RX 7900 XTX as well as high-end data center GPUs like the Instinct MI300. This will create a unification similar to NVIDIA's CUDA, which enables CUDA-focused developers to run applications on everything ranging from laptops to data centers.

Jack Huynh: "So, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It's forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving."

NVIDIA Hit with DOJ Antitrust Probe over AI GPUs, Unfair Sales Tactics and Pricing Alleged

btarunr

Aug 2nd, 2024 05:36 Discuss (50 Comments)

NVIDIA has reportedly been hit with a US Department of Justice (DOJ) antitrust probe over the tactics the company allegedly employs to sell or lease its AI GPUs and data-center networking equipment, "The Information" reported. NVIDIA shares fell 3.6% in pre-market trading on Friday (08/02). The main complainants behind the probe appear to be a special interest group among the customers of AI GPUs, and not NVIDIA's competitors in the AI GPU industry per se. US Senator Elizabeth Warren and US progressives have been most vocal about calling upon the DOJ to investigate antitrust allegations against NVIDIA.

Meanwhile, US officials are reportedly reaching out to NVIDIA's competitors, including AMD and Intel, to gather information about the complaints. NVIDIA holds 80% of the AI GPU market, while AMD, and to a much lesser extent, Intel, have received spillover demand for AI GPUs. "The Information" report says that the complaint alleges NVIDIA pressured cloud customers to buy "multiple products". We don't know exactly what this means. One theory holds that NVIDIA is getting them to commit to buying multiple generations of products (e.g. Ampere, Hopper, and over to Blackwell), while another holds that it's getting them to buy multiple kinds of products, which include not just the AI GPUs, but also NVIDIA's first-party server systems and networking equipment. Yet another theory holds that it is bundling first-party software and services to go with the hardware, far beyond the basic software needed to get the hardware to work.

AMD is Becoming a Software Company. Here's the Plan

Editorial by

W1zzard

Jul 8th, 2024 06:15 Discuss (140 Comments)

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable, to share their vision for the future of the company, and to get our feedback. On site, were prominent AMD leadership, including Phil Guido, Executive Vice President & Chief Commercial Officer and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making changes in a big way to how they are approaching technology, shifting their focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.

Curious "Navi 48 XTX" Graphics Card Prototype Detected in Regulatory Filings

btarunr

Jun 11th, 2024 02:24 Discuss (73 Comments)

A curiously described graphics card was detected by Olrak29 as it was making it through international shipping. The shipment description for the card reads "GRAPHIC CARD NAVI48 G28201 DT XTX REVB-PRE-CORRELATION AO PLATSI TT(SAMSUNG)-Q2 2024-3A-102-G28201." This can be decoded as a graphics card with the board number "G28201," for the desktop platform. It features a maxed out version of the "Navi 48" silicon, and is based on the B revision of the PCB. It features Samsung-made memory chips, and is dated Q2-2024.
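The decoding described above can be approximated with a regular expression. The pattern below is a hypothetical sketch fitted to this one shipment string. The field names and the assumed field order are guesses based on the article's reading, not a documented AMD or customs format.

```python
import re

DESCRIPTION = (
    "GRAPHIC CARD NAVI48 G28201 DT XTX REVB-PRE-CORRELATION AO PLATSI "
    "TT(SAMSUNG)-Q2 2024-3A-102-G28201."
)

# Hypothetical layout: GPU, board number, platform, variant, PCB revision,
# then the memory vendor in parentheses followed by quarter and year.
PATTERN = re.compile(
    r"(?P<gpu>NAVI\d+)\s+"
    r"(?P<board>G\d+)\s+"
    r"(?P<platform>[A-Z]+)\s+"
    r"(?P<variant>[A-Z]+)\s+"
    r"REV(?P<revision>[A-Z])"
    r".*?\((?P<memory_vendor>[A-Z]+)\)"
    r"-(?P<quarter>Q[1-4])\s+(?P<year>\d{4})"
)

match = PATTERN.search(DESCRIPTION)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name}: {value}")
```

On this string the pattern yields NAVI48, board G28201, platform DT (desktop), variant XTX, revision B, memory vendor SAMSUNG, and Q2 2024, matching the reading given in the text.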

AMD is planning to retreat from the enthusiast segment of gaming graphics cards with the RDNA 4 generation. The company originally entered this segment with the RX 6800 series and RX 6900 series of the RDNA 2 generation, where it saw unexpected success with the crypto-mining market boom, besides being competitive with the RTX 3080 and RTX 3090. That boom had gone bust by the time RDNA 3 and the RX 7900 series arrived, and the chip wasn't competitive with NVIDIA's top end. Around this time, the AI acceleration boom squeezed foundry allocation of all major chipmakers, including AMD, making large chips based on the latest process nodes even less viable for a market such as enthusiast graphics; the company would rather make CDNA AI accelerators with its allocation. Given all this, the company's fastest GPUs from the RDNA 4 generation could be the ones that succeed the current RX 7800 XT and RX 7700 XT, so AMD could capture a slice of the performance segment.

AMD Adds RDNA 4 Generation Navi 44 and MI300X1 GPUs to ROCm Software

AleksandarK

May 24th, 2024 01:32 Discuss (22 Comments)

AMD has quietly added some interesting codenames to its ROCm hardware support list. The biggest surprise is the appearance of "RDNA 4" and "Navi 44" codenames, hinting at a potential successor to the current RDNA 3 GPU architecture powering AMD's Radeon RX 7000 series graphics cards. The upcoming Radeon RX 8000 series could see a Navi 44 SKU with the codename "gfx1200". While details are scarce, the inclusion of RDNA 4 and Navi 44 in the ROCm list suggests AMD is working on a new GPU microarchitecture that could bring significant performance and efficiency gains. While RDNA 4 may be destined for future Radeon gaming GPUs, in the data center GPU compute market, AMD is preparing CDNA 4-based successors to the MI300 series. However, it appears that we haven't seen all the MI300 variants yet. Equally intriguing is the "MI300X1" codename, which appears to reference an upcoming AI-focused accelerator from AMD.

While we wait for more information, we can't tell whether the Navi 44 GPU SKU is for the high-end or low-end segment. If previous generations are any guide, the Navi 44 SKU would target the low end of the GPU performance spectrum. The previous generation RDNA 3 had Navi 33 as an entry-level model, whereas RDNA 2 had a Navi 24 SKU for entry-level GPUs. We have reported on RDNA 4 merely being a "bug correction" generation to fix the perf/Watt curve and offer better efficiency overall. We will have to wait and see what finally happens. AMD could announce more details in its upcoming Computex keynote.

Dr. Lisa Su Responds to TinyBox's Radeon RX 7900 XTX GPU Firmware Problems

T0@st

Mar 6th, 2024 14:22 Discuss (24 Comments)

The TinyBox AI server system attracted plenty of media attention last week—its creator, George Hotz, decided to build with AMD RDNA 3.0 GPU hardware rather than the expected/traditional choice of CDNA 3.0. Tiny Corp. is a startup firm dealing in neural network frameworks—they currently "write and maintain tinygrad." Hotz & Co. are in the process of assembling rack-mounted 12U TinyBox systems for customers—an individual server houses an AMD EPYC 7532 processor and six XFX Speedster MERC310 Radeon RX 7900 XTX graphics cards. The Tiny Corp. social media account has engaged in numerous NVIDIA vs. AMD AI hardware debates/tirades—Hotz appears to favor the latter, as evidenced in his latest choice of components. ROCm support on Team Red AI Instinct accelerators is fairly mature at this point in time, but a much newer prospect on gaming-oriented graphics cards.

Tiny Corporation's unusual leveraging of Radeon RX 7900 XTX GPUs in a data center configuration has already hit a developmental roadblock. Yesterday, the company's social media account expressed driver-related frustrations in a public forum: "If AMD open sources their firmware, I'll fix their LLVM spilling bug and write a fuzzer for HSA. Otherwise, it's not worth putting tons of effort into fixing bugs on a platform you don't own." Hotz's latest complaint was taken onboard by AMD's top brass—Dr. Lisa Su responded with the following message: "Thanks for the collaboration and feedback. We are all in to get you a good solution. Team is on it." Her software engineers—within a few hours—managed to fling out a set of fixes in Tiny Corporation's direction. Hotz appreciated the quick turnaround, and proceeded to run a model without encountering major stability issues: "AMD sent me an updated set of firmware blobs to try. They are responsive, and there have been big strides in the driver in the last year. It will be good! This training run is almost 5 hours in, hasn't crashed yet." Tiny Corp. drummed up speculation about AMD open sourcing GPU MES firmware—Hotz disclosed that he will be talking (on the phone) to Team Red leadership.

AMD Delivers Leadership Portfolio of Data Center AI Solutions with AMD Instinct MI300 Series

Press Release by

GFreeman

Dec 6th, 2023 14:15 Discuss (4 Comments)

Today, AMD announced the availability of the AMD Instinct MI300X accelerators - with industry leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inferencing - as well as the AMD Instinct MI300A accelerated processing unit (APU) - combining the latest AMD CDNA 3 architecture and "Zen 4" CPUs to deliver breakthrough performance for HPC and AI workloads.

"AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large scale cloud and enterprise deployments," said Victor Peng, president, AMD. "By leveraging our leadership hardware, software and open ecosystem approach, cloud providers, OEMs and ODMs are bringing to market technologies that empower enterprises to adopt and deploy AI-powered solutions."

GIGABYTE Unveils Next-gen HPC & AI Servers with AMD Instinct MI300 Series Accelerators

Press Release by

TheLostSwede

Dec 6th, 2023 11:12 Discuss (1 Comment)

GIGABYTE Technology: Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, and IT infrastructure, today announced the GIGABYTE G383-R80 for the AMD Instinct MI300A APU and two GIGABYTE G593 series servers for the AMD Instinct MI300X GPU and AMD EPYC 9004 Series processor. As a testament to the performance of AMD Instinct MI300 Series family of products, the El Capitan supercomputer at Lawrence Livermore National Laboratory uses the MI300A APU to power exascale computing. And these new GIGABYTE servers are the ideal platform to propel discoveries in HPC & AI at exascale.

Marrying of a CPU & GPU: G383-R80
For incredible advancements in HPC there is the GIGABYTE G383-R80 that houses four LGA6096 sockets for MI300A APUs. This chip integrates a CPU that has twenty-four AMD Zen 4 cores with a powerful GPU built with AMD CDNA 3 GPU cores. And the chiplet design shares 128 GB of unified HBM3 memory for impressive performance for large AI models. The G383 server has lots of expansion slots for networking, storage, or other accelerators, with a total of twelve PCIe Gen 5 slots. And in the front of the chassis are eight 2.5" Gen 5 NVMe bays to handle heavy workloads such as real-time big data analytics and latency-sensitive workloads in finance and telecom.

Two-ExaFLOP El Capitan Supercomputer Starts Installation Process with AMD Instinct MI300A

AleksandarK

Jul 6th, 2023 03:03 Discuss (29 Comments)

When Lawrence Livermore National Laboratory (LLNL) announced the creation of a two-ExaFLOP supercomputer named El Capitan, we heard that AMD would power it with its Instinct MI300 accelerator. Today, LLNL published a Tweet that states, "We've begun receiving & installing components for El Capitan, @NNSANews' first #exascale #supercomputer. While we're still a ways from deploying it for national security purposes in 2024, it's exciting to see years of work becoming reality." As published images show, HPE racks filled with AMD Instinct MI300 are showing up now at LLNL's facility, and the supercomputer is expected to go operational in 2024. This could mean that the November 2023 TOP500 list update won't feature El Capitan, as system enablement would be very hard to achieve in the four months remaining until then.

The El Capitan supercomputer is expected to run on the AMD Instinct MI300A accelerator, which features 24 Zen 4 cores, the CDNA 3 architecture, and 128 GB of HBM3 memory. Each HPE node pairs four of these accelerators and is water-cooled. While we don't have many further details on the memory and storage of El Capitan, we know that the system will exceed two ExaFLOPS at peak and will consume close to 40 MW of power.

AMD Confirms that Instinct MI300X GPU Can Consume 750 W

T0@st

Jun 15th, 2023 13:15 Discuss (43 Comments)

AMD recently revealed its Instinct MI300X GPU at its Data Center and AI Technology Premiere event on Tuesday (June 13). The keynote presentation did not provide any details about the new accelerator model's power consumption, but that did not stop one tipster - Hoang Anh Phu - from obtaining this information from Team Red's post-event footnotes. A comparative observation was made: "MI300X (192 GB HBM3, OAM Module) TBP is 750 W, compared to last gen, MI250X TBP is only 500-560 W." A leaked Giga Computing roadmap from last month anticipated server-grade GPUs hitting the 700 W mark.

NVIDIA's Hopper H100 took the crown - with its demand for a maximum of 700 W - as the most power-hungry data center enterprise GPU until now. The MI300X's OCP Accelerator Module-based design now surpasses Team Green's flagship with a slightly greater rating. AMD's new "leadership generative AI accelerator" sports 304 CDNA 3 compute units, which is a clear upgrade over the MI250X's 220 (CDNA 2) CUs. Engineers have also introduced new 24 GB HBM3 stacks, so the MI300X can be specced with 192 GB of memory (as a maximum), whereas the MI250X is limited to a 128 GB memory capacity with its slower HBM2E stacks. We hope to see sample units producing benchmark results very soon, with the MI300X pitted against H100.

AMD ROCm 5.5 Now Available on GitHub

GFreeman

May 2nd, 2023 06:13 Discuss (9 Comments)

As expected with AMD's activity on GitHub, ROCm 5.5 has now been officially released. It brings several big changes, including better RDNA 3 support. While officially focused on AMD's professional/workstation graphics cards, the ROCm 5.5 should also bring better support for Radeon RX 7000 series graphics cards on Linux.

Surprisingly, the release notes do not officially mention RDNA 3 improvements, but those have already been tested and confirmed. The GPU support list is pretty short and includes AMD GFX9, RDNA, and CDNA GPUs, covering the Radeon VII, Pro VII, W6800, V620, and the Instinct lineup. The release notes do mention new HIP enhancements, a stack size limit raised from 16k to 128k, new APIs, OpenMP enhancements, and more. You can check out the full release notes, downloads, and more details over at GitHub.

AMD Brings ROCm to Consumer GPUs on Windows OS

AleksandarK

Apr 14th, 2023 02:48 Discuss (21 Comments)

AMD has published an exciting development for its Radeon Open Compute Ecosystem (ROCm) users today. Now, ROCm is coming to the Windows operating system, and the company has extended ROCm support to consumer graphics cards instead of only supporting professional-grade GPUs. This development milestone is essential for making AMD's GPU family more competitive with NVIDIA and its CUDA-accelerated GPUs. For those unaware, AMD ROCm is a software stack designed for GPU programming. Similar to NVIDIA's CUDA, ROCm is designed for AMD GPUs and was historically limited to Linux-based OSes and GFX9, CDNA, and professional-grade RDNA GPUs.

However, according to documents obtained by Tom's Hardware (which are behind a login wall), AMD has brought support for ROCm to Radeon RX 6900 XT, Radeon RX 6600, and R9 Fury GPU. What is interesting is not the inclusion of RX 6900 XT and RX 6600 but the support for R9 Fury, an eight-year-old graphics card. Also, what is interesting is that out of these three GPUs, only R9 Fury has full ROCm support, the RX 6900 XT has HIP SDK support, and RX 6600 has only HIP runtime support. And to make matters even more complicated, the consumer-grade R9 Fury GPU has full ROCm support only on Linux and not Windows. The reason for this strange selection of support has yet to be discovered. However, it is a step in the right direction, as AMD has yet to enable more functionality on Windows and more consumer GPUs to compete with NVIDIA.

AMD Shows Instinct MI300 Exascale APU with 146 Billion Transistors

AleksandarK

Jan 5th, 2023 04:30 Discuss (44 Comments)

During its CES 2023 keynote, AMD announced its latest Instinct MI300 APU, a first of its kind in the data center world. Combining the CPU, GPU, and memory elements into a single package eliminates the latency imposed by the long distances data travels from CPU to memory and from CPU to GPU through the PCIe connector. In addition to solving some latency issues, this means less power is needed to move data, which improves efficiency. The Instinct MI300 features 24 Zen4 cores with simultaneous multi-threading enabled, CDNA3 GPU IP, and 128 GB of HBM3 memory on a single package. The memory bus is 8192 bits wide, providing unified memory access for CPU and GPU cores. CXL 3.0 is also supported, making cache-coherent interconnecting a reality.

The Instinct MI300 APU package is an engineering marvel of its own, built with advanced chiplet techniques. AMD uses 3D stacking, with nine 5 nm logic chiplets stacked on top of four 6 nm chiplets and HBM surrounding them. All of this brings the transistor count to 146 billion, which reflects the sheer complexity of such a design. For performance figures, AMD provided a comparison to the Instinct MI250X GPU. In raw AI performance, the MI300 features an 8x improvement over the MI250X, while the performance-per-watt is "reduced" to a 5x increase. While we do not know what benchmark applications were used, it is possible that standard benchmarks like MLPerf were used. For availability, AMD targets the end of 2023, when the "El Capitan" exascale supercomputer will arrive using these Instinct MI300 APU accelerators. Pricing is unknown and will be unveiled to enterprise customers first around launch.

AMD Instinct MI300 APU to Power El Capitan Exascale Supercomputer

AleksandarK

Jun 22nd, 2022 09:19 Discuss (5 Comments)

The exascale supercomputing race is now well underway, as the US-based Frontier supercomputer has been delivered, and now we wait to see the remaining systems join the race. Today, during the 79th HPC User Forum at Oak Ridge National Laboratory (ORNL), Terri Quinn of Lawrence Livermore National Laboratory (LLNL) delivered a few insights into what the El Capitan exascale machine will look like. It seems the new powerhouse will be based on AMD's Instinct MI300 APU. LLNL targets peak performance of over two exaFLOPs and sustained performance of more than one exaFLOP, under 40 megawatts of power. This should require a very dense and efficient computing solution, which is exactly what the MI300 APU is designed to be.
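To put the power target in perspective, the figures above can be turned into a rough efficiency floor. The sketch below simply divides the quoted performance targets by the quoted power ceiling; because the article says "over" two exaFLOPs and "under" 40 megawatts, the results are lower bounds implied by the targets, not measured numbers. All variable and function names are illustrative.

```python
# Rough efficiency floor implied by the El Capitan targets quoted above.
# The inputs are the article's targets; the outputs are derived lower bounds,
# not measured results.
PEAK_TARGET_FLOPS = 2e18       # "over two exaFLOPs" peak
SUSTAINED_TARGET_FLOPS = 1e18  # "more than one exaFLOP" sustained
POWER_CEILING_WATTS = 40e6     # "under 40 megawatts"


def gflops_per_watt(flops: float, watts: float) -> float:
    """Return efficiency in GFLOPS per watt."""
    return flops / watts / 1e9


peak_efficiency = gflops_per_watt(PEAK_TARGET_FLOPS, POWER_CEILING_WATTS)
sustained_efficiency = gflops_per_watt(SUSTAINED_TARGET_FLOPS, POWER_CEILING_WATTS)

print(f"Peak target implies at least {peak_efficiency:.0f} GFLOPS/W")
print(f"Sustained target implies at least {sustained_efficiency:.0f} GFLOPS/W")
```

In other words, the targets imply at least roughly 50 GFLOPS per watt at peak and 25 GFLOPS per watt sustained across the whole system.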

As a reminder, the AMD Instinct MI300 is an APU that combines Zen 4 x86-64 CPU cores, CDNA3 compute-oriented graphics, large cache structures, and HBM memory used as DRAM on a single package. This is achieved using a multi-chip module design with 2.5D and 3D chiplet integration using Infinity architecture. The system will essentially utilize thousands of these APUs to become one large Linux cluster. It is slated for installation in 2023, with an operating lifespan from 2024 to 2030.

Alleged AMD Instinct MI300 Exascale APU Features Zen4 CPU and CDNA3 GPU

AleksandarK

May 13th, 2022 10:54 Discuss (16 Comments)

Today we got information that AMD's upcoming Instinct MI300 will allegedly be available as an Accelerated Processing Unit (APU). AMD APUs are processors that combine CPU and GPU into a single package. AdoredTV managed to get hold of a slide indicating that the AMD Instinct MI300 accelerator will also come as an APU option that combines Zen4 CPU cores and a CDNA3 GPU accelerator in a single, large package. With technologies like 3D stacking, MCM design, and HBM memory, these Instinct APUs are positioned to be a high-density compute product. At least six HBM dies are going to be placed in a package, with the APU itself being a socketed design.

The leaked slide from AdoredTV indicates that the first tapeout is expected to be complete by the end of the month (presumably this month), with the first silicon hitting AMD's labs in Q3 of 2022. If the silicon turns out functional, we could see these APUs available sometime in the first half of 2023. Below, you can see an illustration of the AMD Instinct MI300 GPU. The APU version will potentially be the same size, with Zen4 and CDNA3 cores spread around the package. As the Instinct MI300 accelerator is supposed to use eight compute tiles, we could see different combinations of CPU/GPU tiles offered. As we await the launch of the next-generation accelerators, we have yet to see what SKUs AMD will bring.

AMD Introduces Instinct MI210 Data Center Accelerator for Exascale-class HPC and AI in a PCIe Form-Factor

AleksandarK

Mar 22nd, 2022 09:03 Discuss (3 Comments)

AMD today announced a new addition to the Instinct MI200 family of accelerators. With this model, officially titled the Instinct MI210 accelerator, AMD aims to bring exascale-class technologies to mainstream HPC and AI customers. Based on the CDNA2 compute architecture built for heavy HPC and AI workloads, the card features 104 compute units (CUs), totaling 6656 Streaming Processors (SPs). With a peak engine clock of 1700 MHz, the card can output 181 TFLOPs of peak FP16 half-precision compute, 22.6 TFLOPs of peak FP32 single-precision compute, and 22.6 TFLOPs of peak FP64 double-precision compute. For single-precision matrix (FP32) compute, the card can deliver a peak of 45.3 TFLOPs. The INT4/INT8 precision settings provide 181 TOPs, while the MI210 can compute the bfloat16 precision format at a peak of 181 TFLOPs.
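The stream processor count and the single-precision vector figure above can be reproduced with simple arithmetic. The sketch below rests on two assumptions that the article does not state: 64 stream processors per compute unit, and one fused multiply-add per stream processor per clock, counted as two floating-point operations. The names are illustrative only.

```python
# Back-of-the-envelope check of the MI210 figures quoted above.
# Assumptions (not stated in the article): 64 stream processors per CU, and
# one fused multiply-add (counted as 2 FLOPs) per stream processor per clock.
COMPUTE_UNITS = 104
SPS_PER_CU = 64               # assumption
FLOPS_PER_SP_PER_CLOCK = 2    # assumption: one FMA = 2 FLOPs
PEAK_CLOCK_HZ = 1700e6        # 1700 MHz peak engine clock


def stream_processors(compute_units: int, sps_per_cu: int = SPS_PER_CU) -> int:
    """Total stream processors across all compute units."""
    return compute_units * sps_per_cu


def peak_tflops(sp_count: int, clock_hz: float,
                flops_per_clock: int = FLOPS_PER_SP_PER_CLOCK) -> float:
    """Theoretical peak vector throughput in TFLOPs."""
    return sp_count * flops_per_clock * clock_hz / 1e12


sp_count = stream_processors(COMPUTE_UNITS)
fp32_vector_tflops = peak_tflops(sp_count, PEAK_CLOCK_HZ)

print(f"Stream processors: {sp_count}")
print(f"Peak FP32 vector throughput: {fp32_vector_tflops:.1f} TFLOPs")
```

Under these assumptions the result matches the quoted 6656 SPs and 22.6 TFLOPs, and the quoted 45.3 TFLOPs matrix figure is twice the vector rate.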

The card uses a 4096-bit memory interface connecting 64 GB of HBM2e to the compute silicon. The total memory bandwidth is 1638.4 GB/s, with the memory modules running at a 1.6 GHz frequency. It is important to note that ECC is supported on the entire chip. AMD offers the Instinct MI210 accelerator as a PCIe solution, based on the PCIe 4.0 standard. The card is rated for a TDP of 300 Watts and is cooled passively. Three Infinity Fabric links are enabled, and the maximum bandwidth of an Infinity Fabric link is 100 GB/s. Pricing is unknown; however, the card is available immediately, as of March 22nd.
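The quoted memory bandwidth follows from the bus width and the memory clock. The sketch below assumes double-data-rate signalling (two transfers per clock), which the article does not state explicitly; the names are illustrative.

```python
# Reproducing the quoted memory bandwidth from bus width and memory clock.
# Assumption (not stated in the article): double data rate, i.e. two
# transfers per clock cycle.
BUS_WIDTH_BITS = 4096
MEMORY_CLOCK_HZ = 1.6e9
TRANSFERS_PER_CLOCK = 2  # assumption: DDR signalling


def memory_bandwidth_gb_s(bus_width_bits: int, clock_hz: float,
                          transfers_per_clock: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / 1e9


bandwidth_gb_s = memory_bandwidth_gb_s(
    BUS_WIDTH_BITS, MEMORY_CLOCK_HZ, TRANSFERS_PER_CLOCK
)
print(f"Peak memory bandwidth: {bandwidth_gb_s:.1f} GB/s")
```

With 512 bytes moved per transfer and 3.2 billion transfers per second, this gives the 1638.4 GB/s figure from the article.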

AMD positions this card directly against the NVIDIA A100 80 GB accelerator in terms of target segment, with an emphasis on half-precision and INT4/INT8-heavy applications.

NVIDIA to Split Graphics and Compute Architecture Naming, "Blackwell" Architecture Spotted

btarunr

Mar 1st, 2022 02:31 Discuss (14 Comments)

The recent NVIDIA data leak has surfaced information on various upcoming graphics parts. Besides "Ada Lovelace" and "Hopper," we come across a new codename, "Blackwell." It turns out that NVIDIA is splitting the graphics and compute architecture naming with the next generation, not unlike what AMD did with its RDNA and CDNA series. The current "Ampere" architecture is used for both compute and graphics, with the streaming multiprocessor being slightly different between the two: the compute "Ampere" has more FP64 and Tensor components, while the graphics "Ampere" does away with these in favor of RT cores and graphics-relevant components.

The graphics architecture to succeed GeForce "Ampere" will be GeForce "Ada Lovelace." GPUs in this series are identified in the leaked code as "AD102," "AD103," "AD104," "AD106," "AD107," and "AD10B," succeeding a similar numbering for parts in the "A" (GeForce Ampere) series. The compute architecture succeeding "Ampere" will be codenamed "Hopper," with parts in the series being codenamed "GH100" and "GH202." Another compute or datacenter architecture is "Blackwell," with parts being codenamed "GB100" and "GB102." By all accounts, NVIDIA is planning to launch the GeForce 40-series "Ada" graphics card lineup in the second half of 2022. The company needs a similar refresh for its compute product lineup, and could debut "Hopper" either toward the end of 2022 or next year. "Blackwell" could follow "Hopper."
