News Posts matching #ROCm


AMD's Answer to AI Advancement: ROCm 7.0 Is Here

In August, AMD will release ROCm 7, its open computing platform for high‑performance computing, machine learning, and scientific applications. This version will support a range of hardware, from Ryzen AI-equipped laptops to Radeon AI Pro desktop cards and server-grade Instinct GPUs, which have just received an update. Before the end of 2025, ROCm 7 will be integrated directly into Linux and Windows, allowing for a seamless installation process with just a few clicks. AMD isn't planning to update ROCm once every few months, either. Instead, developers will receive day-zero fixes and a major update every two weeks, complete with performance enhancements and new features. Additionally, a dedicated Dev Cloud will provide everyone with instant access to the latest AMD hardware for testing and experimentation.

Early benchmarks are encouraging. In one test, an Instinct MI300X running ROCm 7 reached roughly three times the speed recorded with the original ROCm 6 release. Of course, your mileage will vary depending on model choice, quantization, and other factors. This shift follows comments from AMD's Senior Vice President and Chief Software Officer, Andrej Zdravkovic, whom we interviewed last September. He emphasized ROCm's open-source design and the utility of HIPIFY, a tool that converts CUDA code to run on ROCm. That migration path looks all the more attractive now that a roughly 3x performance uplift is available simply by updating the software version. If ROCm 7 lives up to its promise, AMD could finally unlock the potential of its hardware across devices both big and small, and give NVIDIA serious competition in the coming years.
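HIPIFY's core job is largely mechanical: most CUDA runtime calls have one-to-one HIP equivalents, so `cudaMalloc` becomes `hipMalloc` and so on. A minimal Python sketch of that renaming pass follows; the mapping table here is a small illustrative subset, not the real tool's full rule set, which also handles headers, kernel launch syntax, and hundreds of APIs.

```python
# Illustrative subset of the CUDA -> HIP API mapping hipify-style tools apply.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cuda_runtime.h": "hip/hip_runtime.h",
}

def hipify(source: str) -> str:
    """Rewrite CUDA identifiers in `source` to their HIP equivalents."""
    # Longest names first, so cudaMemcpyHostToDevice wins over cudaMemcpy.
    for cuda_name in sorted(CUDA_TO_HIP, key=len, reverse=True):
        source = source.replace(cuda_name, CUDA_TO_HIP[cuda_name])
    return source

snippet = "#include <cuda_runtime.h>\ncudaMalloc(&buf, n);"
print(hipify(snippet))  # includes hip/hip_runtime.h and hipMalloc
```

In practice the converted source is then compiled with HIP's toolchain, which targets ROCm on AMD GPUs.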

Giga Computing Joins AMD Advancing AI 2025 to Share Advanced Cooling AI Solutions for AMD Instinct MI355X and MI350X GPUs

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced participation at AMD Advancing AI 2025 to join conversations with AI thought leaders and to share powerful GIGABYTE servers for AI innovations. This one-day event will be highlighted by the keynote from AMD's Dr. Lisa Su; afterward, attendees will join customer breakout sessions, workshops, and more, including discussions with the Giga Computing team.

At AMD Advancing AI Day, GIGABYTE servers demonstrate powerful solutions for AMD Instinct MI350X and MI355X GPUs. The new server platforms are highly efficient and compute dense, and the GIGABYTE G4L3 series exemplifies this with its support for direct liquid cooling (DLC) technology for the MI355X GPU. In traditional data centers without liquid cooling infrastructure, the GIGABYTE G893 Series provides a reliable air-cooled platform for the MI350X GPU. Together, these platforms showcase GIGABYTE's readiness to meet diverse deployment needs—whether maximizing performance with liquid cooling or ensuring broad compatibility in traditional air-cooled environments. With support for the latest AMD Instinct GPUs, GIGABYTE is driving the next wave of AI innovation.

Compal Optimizes AI Workloads with AMD Instinct MI355X at AMD Advancing AI 2025 and International Supercomputing Conference 2025

As AI computing accelerates toward higher density and greater energy efficiency, Compal Electronics (Compal; Stock Ticker: 2324.TW), a global leader in IT and computing solutions, unveiled its latest high-performance server platform: the SG720-2A/OG720-2A, at both AMD Advancing AI 2025 in the U.S. and the International Supercomputing Conference (ISC) 2025 in Europe. It features the AMD Instinct MI355X GPU architecture and offers both single-phase and two-phase liquid cooling configurations, showcasing Compal's leadership in thermal innovation and system integration. Tailored for next-generation generative AI and large language model (LLM) training, the SG720-2A/OG720-2A delivers exceptional flexibility and scalability for modern data center operations, drawing significant attention across the industry.

With generative AI and LLMs driving increasingly intensive compute demands, enterprises are placing greater emphasis on infrastructure that offers both performance and adaptability. The SG720-2A/OG720-2A emerges as a robust solution, combining high-density GPU integration and flexible liquid cooling options, positioning itself as an ideal platform for next-generation AI training and inference workloads.

AMD Unveils Vision for an Open AI Ecosystem, Detailing New Silicon, Software and Systems at Advancing AI 2025

AMD delivered its comprehensive, end-to-end integrated AI platform vision and introduced its open, scalable rack-scale AI infrastructure built on industry standards at its 2025 Advancing AI event.

AMD and its partners showcased:
  • How they are building the open AI ecosystem with the new AMD Instinct MI350 Series accelerators
  • The continued growth of the AMD ROCm ecosystem
  • The company's powerful, new, open rack-scale designs and roadmap that bring leadership rack-scale AI performance beyond 2027

Latest AMD Linux Radeon Driver Grants RX 9060 XT & AI PRO R9700 SKU Support

AMD's "Radeon Software for Linux 25.10.1" release notes mention the introduction of support for three important ASIC SKUs: RX 9060 XT, AI PRO R9700, and RX 9070 GRE. Two of these models are still awaiting release; the TechPowerUp team spent time with demonstration samples at the recently concluded Computex 2025 trade show. Coincidentally, the special v25.10.1 update became available on the same day as Team Red's big (May 21) presentation. During that day's proceedings, the company committed itself to providing ROCm support for freshly unveiled graphics products.

Interestingly, it has taken a number of weeks to get the China market exclusive Radeon RX 9070 GRE 12 GB card up and running under Linux environments. GPU industry watchers are still wondering whether this mid-range option will trickle out to global markets, akin to the staggered rollout of the RDNA 3 generation's Radeon RX 7900 GRE (around early 2024). Team Red's open-source software team has readied support almost two weeks ahead of the launch of Radeon RX 9060 XT 16 GB and 8 GB models. The workstation-grade Radeon AI PRO R9700 32 GB model is expected to arrive at some point in July.

AMD Updates ROCm to Support Ryzen AI Max and Radeon RX 9000 Series

AMD announced its Radeon Open Compute (ROCm) platform with hardware acceleration support for the Ryzen AI Max 300 "Strix Halo" client processors and the Radeon RX 9000 series gaming GPUs. For the Ryzen AI Max 300 "Strix Halo," this would unlock the compute power of the 40 RDNA 3.5 compute units, with their 80 AI accelerators and 2,560 stream processors, in addition to the AI-specific ISA of the up to 16 "Zen 5" CPU cores, including their full-fat 512-bit FPU for executing AVX-512 instructions. For the Radeon RX 9000 series, this would mean putting those up to 64 RDNA 4 compute units, with up to 128 AI accelerators and up to 4,096 stream processors, to use.
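The stream processor and AI accelerator counts quoted above follow directly from RDNA's per-CU layout: 64 stream processors and 2 AI accelerators per compute unit. A quick sanity check, assuming those per-CU ratios:

```python
SP_PER_CU = 64   # stream processors per RDNA compute unit
AIA_PER_CU = 2   # AI accelerators per RDNA 3.5 / RDNA 4 compute unit

def shader_counts(compute_units: int) -> tuple[int, int]:
    """Return (stream processors, AI accelerators) for a given CU count."""
    return compute_units * SP_PER_CU, compute_units * AIA_PER_CU

# Ryzen AI Max "Strix Halo": 40 CUs -> 2,560 SPs and 80 AI accelerators
print(shader_counts(40))  # (2560, 80)
# Top Radeon RX 9000 part: 64 CUs -> 4,096 SPs and 128 AI accelerators
print(shader_counts(64))  # (4096, 128)
```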

AMD also announced that it has updated the ROCm product stack with support for various major distributions of Linux, including openSUSE (available now), Ubuntu, and Red Hat EPEL, with the latter two getting ROCm support in the second half of 2025. Lastly, ROCm gets full Windows support, including PyTorch and ONNX-EP. A preview of the PyTorch support can be expected in Q3 2025, while a preview for ONNX-EP could arrive in July 2025.

AMD & HUMAIN Reveal Formation of $10 Billion Strategic Collab, Aimed at Advancing Global AI

AMD and HUMAIN, Saudi Arabia's new AI enterprise, today announced a landmark agreement to build the world's most open, scalable, resilient, and cost-efficient AI infrastructure, which will power the future of global intelligence through a network of AMD-based AI computing centers stretching from the Kingdom of Saudi Arabia to the United States. As part of the agreement, the parties will invest up to $10 billion to deploy 500 megawatts of AI compute capacity over the next five years. The AI superstructure built by AMD and HUMAIN will be open by design, accessible at scale, and optimized to power AI workloads across enterprise, start-up, and sovereign markets. HUMAIN will oversee end-to-end delivery, including hyperscale data centers, sustainable power systems, and global fiber interconnects, and AMD will provide the full spectrum of the AMD AI compute portfolio and the AMD ROCm open software ecosystem.

"At AMD, we have a bold vision to enable the future of AI everywhere—bringing open, high-performance computing to every developer, AI start-up and enterprise around the world," said Dr. Lisa Su, Chair and CEO, AMD. "Our investment with HUMAIN is a significant milestone in advancing global AI infrastructure. Together, we are building a globally significant AI platform that delivers performance, openness and reach at unprecedented levels." With initial deployments already underway across key global regions, the collaboration is on track to activate multi-exaflop capacity by early 2026, supported by next-gen AI silicon, modular data center zones, and a developer-enablement focused software platform stack built around open standards and interoperability.

AMD Reports First Quarter 2025 Financial Results

AMD today announced financial results for the first quarter of 2025. First quarter revenue was $7.4 billion, gross margin was 50%, operating income was $806 million, net income was $709 million and diluted earnings per share was $0.44. On a non-GAAP(*) basis, gross margin was 54%, operating income was $1.8 billion, net income was $1.6 billion and diluted earnings per share was $0.96.

"We delivered an outstanding start to 2025 as year-over-year growth accelerated for the fourth consecutive quarter driven by strength in our core businesses and expanding data center and AI momentum," said Dr. Lisa Su, AMD chair and CEO. "Despite the dynamic macro and regulatory environment, our first quarter results and second quarter outlook highlight the strength of our differentiated product portfolio and consistent execution positioning us well for strong growth in 2025."

AMD Readies Radeon PRO W9000 Series Powered by RDNA 4

AMD is readying a new line of professional graphics cards based on its latest RDNA 4 graphics architecture. The company has assigned the silicon variant "Navi 48 XTW" to power its next flagship pro-vis product, which will likely be branded under the Radeon PRO W9000 series. According to the source of this leak, the card comes with 32 GB of memory, which is probably ECC GDDR6, across the chip's 256-bit wide memory bus. The product should offer the same core configuration as the Radeon RX 9070 XT gaming GPU, with 64 compute units worth 4,096 stream processors, 128 AI accelerators, 64 RT accelerators, 256 TMUs, and 128 ROPs.

Besides professional visualization, AMD could target the AI acceleration crowd. The company is hosting the "Advancing AI" press event in June, where it is widely expected to announce its next-generation AI GPUs and updates to ROCm. It could also use the occasion to unveil the Radeon PRO W9000 series product, promoting them to the AI acceleration crowd.

AMD Launches ROCm 6.4 with Technical Upgrades, Still no Support for RDNA 4

AMD officially released ROCm 6.4, its latest open‑source GPU compute stack, bringing several under‑the‑hood improvements while still lacking official RDNA 4 support. The update improves compatibility between ROCm's user‑space libraries and the AMDKFD kernel driver, making it easier to run across a wider range of Linux kernels. AMD has also expanded its internal testing to cover more combinations of user and kernel versions, which should reduce integration headaches for HPC and AI workloads. On the framework side, ROCm 6.4 now supports PyTorch 2.5 and 2.6 out of the box, so developers can use the latest deep‑learning features without building from source. The Megatron‑LM integration adds three new fused kernels, Attention (QKV), Layer Norm, and RoPE, to speed up transformer model training by combining multiple operations into single GPU passes. Video decoding gets a boost, too, with VP9 support in both rocDecode and rocPyDecode, plus a new bitstream reader module to streamline media pipelines.
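The payoff of kernel fusion is that chaining elementwise steps as separate passes rereads the data each time, while a fused pass applies them in one sweep. A toy CPU-side sketch of the concept (purely illustrative, not the actual Megatron-LM kernels):

```python
def unfused(xs):
    # Two separate passes: each materializes an intermediate list,
    # analogous to launching two GPU kernels with a memory round trip.
    scaled = [x * 2.0 for x in xs]
    shifted = [x + 1.0 for x in scaled]
    return shifted

def fused(xs):
    # One pass applies both operations per element, analogous to a
    # single fused kernel that skips the intermediate memory traffic.
    return [x * 2.0 + 1.0 for x in xs]

assert unfused([1.0, 2.0]) == fused([1.0, 2.0]) == [3.0, 5.0]
```

On a GPU, the savings come from avoiding extra kernel launches and trips through device memory between the attention, normalization, and positional-embedding steps.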

Oracle Linux 9 is now officially supported, and the Radeon PRO W7800 48 GB workstation card has been validated under ROCm. AMD also enabled CPX mode with NPS4 memory configurations, catering to advanced memory bandwidth scenarios on Instinct accelerators. Despite these updates, ROCm 6.4 still does not officially support RDNA 4 GPUs, such as the RX 9070 series. While community members report that the new release can run on those cards unofficially, the lack of formal enablement means RDNA 4's doubled FP16 throughput, eight-times INT4 sparsity acceleration, and FP8 capabilities remain largely untapped in ROCm workflows. On Linux, consumer Radeon support is limited to just a few models, even though Windows coverage for the RDNA 2 and 3 families has expanded since 2022. With AMD's "Advancing AI" event coming in June, many developers are hoping for an announcement about RDNA 4 integration. Until then, those who need guaranteed, day‑one GPU support may continue to look at alternative ecosystems.

AMD Instinct GPUs are Ready to Take on Today's Most Demanding AI Models

Customers evaluating AI infrastructure today rely on a combination of industry-standard benchmarks and real-world model performance metrics—such as those from Llama 3.1 405B, DeepSeek-R1, and other leading open-source models—to guide their GPU purchase decisions. At AMD, we believe that delivering value across both dimensions is essential to driving broader AI adoption and real-world deployment at scale. That's why we take a holistic approach—optimizing performance for rigorous industry benchmarks like MLPerf while also enabling Day 0 support and rapid tuning for the models most widely used in production by our customers.

This strategy helps ensure AMD Instinct GPUs deliver not only strong, standardized performance, but also high-throughput, scalable AI inferencing across the latest generative and language models used by customers. We will explore how AMD's continued investment in benchmarking, open model enablement, software and ecosystem tools helps unlock greater value for customers—from MLPerf Inference 5.0 results to Llama 3.1 405B and DeepSeek-R1 performance, ROCm software advances, and beyond.

AMD Completes Acquisition of ZT Systems

AMD today announced the completion of its acquisition of ZT Systems, a leading provider of AI and general-purpose compute infrastructure for the world's largest hyperscale providers. The acquisition will enable a new class of end-to-end AI solutions based on the combination of AMD CPU, GPU and networking silicon, open-source AMD ROCm software and rack-scale systems capabilities. It will also accelerate the design and deployment of AMD-powered AI infrastructure at scale optimized for the cloud.

AMD expects the transaction to be accretive on a non-GAAP basis by the end of 2025. The world-class design teams will join the AMD Data Center Solutions business unit led by AMD Executive Vice President Forrest Norrod. AMD is actively engaged with multiple potential strategic partners to acquire ZT Systems' industry-leading U.S.-based data center infrastructure manufacturing business in 2025.

AMD Instinct MI400 to Include new Dedicated Multimedia IO Die

AMD's upcoming Instinct MI400 accelerator series, scheduled for 2026 introduction, is set to incorporate a new Multimedia IO Die (MID) architecture alongside significant compute density improvements. According to recent patches discovered in AMD-GFX mailing lists, the accelerator will feature a dual Active Interposer Die (AID) design, with each AID housing four Accelerated Compute Dies (XCDs)—doubling the XCD count per AID compared to the current MI300 series. Introducing dedicated Multimedia IO Dies is a new entry in AMD's accelerator design philosophy. Documentation reveals support for up to two MIDs, with each likely paired to an AID, suggesting a more specialized approach to multimedia processing and interface management.

Specifications from the Register Remapping Table (RRMT) implementation indicate sophisticated die-to-die communication pathways, with support for local and remote transactions across XCDs, AIDs, and the new MIDs. The system enables granular control over eight potential XCD configurations (XCD0 through XCD7), suggesting that AMD can scale compute up and down across SKUs. While AMD has yet to release detailed specifications for the MI400 series, separating multimedia functions into dedicated dies could optimize performance and power efficiency. As the 2026 launch window approaches, AMD will spend the remaining time refining the software stack and ROCm support for its next-generation accelerator based on the UDNA architecture. Since designing an accelerator is a multi-year effort from the physical implementation standpoint, we expect the Instinct MI400 design to be finalized by now. All that is left is silicon bring-up, software optimization, and mass production, likely at TSMC's facilities.
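Under the reported topology (two AIDs hosting four XCDs each, indexed XCD0 through XCD7), classifying a transaction as local or remote reduces to checking which AID hosts each XCD. A hedged sketch, assuming XCDs are numbered contiguously per AID (the actual RRMT mapping is not public):

```python
XCDS_PER_AID = 4  # reported MI400 layout: each Active Interposer Die hosts 4 XCDs

def host_aid(xcd: int) -> int:
    """AID index hosting a given XCD, assuming contiguous numbering."""
    return xcd // XCDS_PER_AID

def is_local(src_xcd: int, dst_xcd: int) -> bool:
    """True if both XCDs sit on the same AID (a local transaction)."""
    return host_aid(src_xcd) == host_aid(dst_xcd)

print([host_aid(x) for x in range(8)])  # [0, 0, 0, 0, 1, 1, 1, 1]
print(is_local(1, 3), is_local(3, 4))   # True False
```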

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

The battle for AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the gap software-wise with its competitor, NVIDIA. According to the latest report from SemiAnalysis, a research and consultancy firm, the firm ran a five-month experiment using the Instinct MI300X for training and benchmark runs. And the findings were surprising: even with better hardware on paper, AMD's software stack, including ROCm, massively degraded the accelerator's achievable performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

AMD Releases ROCm 6.3 with SGLang, Fortran Compiler, Multi-Node FFT, Vision Libraries, and More

AMD has released the new ROCm 6.3 version which introduces several new features and optimizations, including SGLang integration for accelerated AI inferencing, a re-engineered FlashAttention-2 for optimized AI training and inference, the introduction of multi-node Fast Fourier Transform (FFT), new Fortran compiler, and enhanced computer vision libraries like rocDecode, rocJPEG, and rocAL.

According to AMD, SGLang, a runtime now supported by ROCm 6.3, is purpose-built for optimizing inference on models like LLMs and VLMs on AMD Instinct GPUs, and promises 6x higher throughput and much easier usage thanks to Python integration and pre-configured ROCm Docker containers. In addition, ROCm 6.3 brings further transformer optimizations with FlashAttention-2, which should deliver significant improvements in the forward and backward passes compared to FlashAttention-1; a whole new AMD Fortran compiler with direct GPU offloading, backward compatibility, and integration with HIP kernels and ROCm libraries; new multi-node FFT support in rocFFT, which simplifies multi-node scaling and improves scalability; and enhanced computer vision libraries—rocDecode, rocJPEG, and rocAL—for AV1 codec support, GPU-accelerated JPEG decoding, and better audio augmentation.

AMD and Fujitsu to Begin Strategic Partnership to Create Computing Platforms for AI and High-Performance Computing (HPC)

AMD and Fujitsu Limited today announced that they have signed a memorandum of understanding (MOU) to form a strategic partnership to create computing platforms for AI and high-performance computing (HPC). The partnership, encompassing aspects from technology development to commercialization, will seek to facilitate the creation of open source and energy efficient platforms comprised of advanced processors with superior power performance and highly flexible AI/HPC software and aims to accelerate open-source AI and/or HPC initiatives.

Due to the rapid spread of AI, including generative AI, cloud service providers and end-users are seeking optimized architectures at various price and power-per-performance configurations. From end to end, AMD supports an open ecosystem and strongly believes in giving customers choice. Fujitsu has worked to develop FUJITSU-MONAKA, a next-generation Arm-based processor that aims to achieve both high performance and low power consumption. With FUJITSU-MONAKA together with AMD Instinct accelerators, customers have an additional choice for large-scale AI workload processing whilst attempting to reduce the data center total cost of ownership.

AMD Introduces the Radeon PRO V710 to Microsoft Azure

AMD today introduced the Radeon PRO V710, the newest member of AMD's family of visual cloud GPUs. Available today in private preview on Microsoft Azure, the Radeon PRO V710 brings new capabilities to the public cloud. The AMD Radeon PRO V710's 54 Compute Units, along with 28 GB of VRAM, 448 GB/s memory transfer rate, and 54 MB of L3 AMD Infinity Cache technology, support small to medium ML inference workloads and small model training using open-source AMD ROCm software.

With support for hardware virtualization implemented in compliance with the PCI Express SR-IOV standard, instances based on the Radeon PRO V710 can provide robust isolation between multiple virtual machines running on the same physical GPU and between the host and guest environments. The efficient RDNA 3 architecture provides excellent performance per watt, enabling a single slot, passively cooled form factor compliant with the PCIe CEM spec.

AMD Instinct MI300X Accelerators Available on Oracle Cloud Infrastructure

AMD today announced that Oracle Cloud Infrastructure (OCI) has chosen AMD Instinct MI300X accelerators with ROCm open software to power its newest OCI Compute Supercluster instance called BM.GPU.MI300X.8. For AI models that can comprise hundreds of billions of parameters, the OCI Supercluster with AMD MI300X supports up to 16,384 GPUs in a single cluster by harnessing the same ultrafast network fabric technology used by other accelerators on OCI. Designed to run demanding AI workloads including large language model (LLM) inference and training that requires high throughput with leading memory capacity and bandwidth, these OCI bare metal instances have already been adopted by companies including Fireworks AI.

"AMD Instinct MI300X and ROCm open software continue to gain momentum as trusted solutions for powering the most critical OCI AI workloads," said Andrew Dieckmann, corporate vice president and general manager, Data Center GPU Business, AMD. "As these solutions expand further into growing AI-intensive markets, the combination will benefit OCI customers with high performance, efficiency, and greater system design flexibility."

AMD's New Strix Halo "Zen 5" Mobile Chips to Feature 40 iGPU CUs

The upcoming Strix Halo processors from AMD now have a new name - Ryzen AI Max - and come with big promises of impressive power. This rumor, first reported by VideoCardz and originating from Weibo leaker Golden Pig Upgrade, reveals key details about the first three processors in this lineup, along with their specifications.

The leaker claims AMD might roll out a new naming system for these processors, branding them as part of the Ryzen AI Max series. These chips will run on the anticipated Strix Halo APU. This series includes three models, with the top-end version boasting up to 16 Zen 5 cores and 40 Compute Units (CUs) for graphics. This setup is expected for the best model, contrary to earlier rumors that AMD would drop such a variant. In fact, word has it that at least two of the models in this lineup will come with 40 RDNA 3.5 Compute Units. The leaker also hints that Strix Halo will handle up to 96 GB of video memory, suggesting AMD aims to make this processor work with its ROCm open compute platform.

Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, who is the senior vice president and Chief Software Officer. Previously, we reported on AMD's transition to become a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan spans a three- to five-year timeframe to improve its software ecosystem, accelerating hardware development to launch new products more frequently and to react to changes in software demand. AMD found that opening new design centers in Serbia would be very advantageous to these expansion efforts.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is actually an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering from Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.
Here is the full interview:

AMD Opens New Engineering Design Center in Serbia

Today, AMD (NASDAQ: AMD) opened a new engineering design center in Serbia, with offices in Belgrade and Nis, strengthening its presence in the Balkans region. The new design center will employ highly skilled software engineers focused on the development of software technologies optimized for AMD leadership compute platforms, including the AMD ROCm software stack for AMD Instinct data center accelerators and AMD Radeon graphics cards. The center was established through an agreement with HTEC, a global technology services company.

"Software plays a critical role in unlocking the capabilities of our leadership AMD hardware. Our new design center will be instrumental in enabling both the design and deployment of future generations of AMD Instinct and Radeon accelerators to help make end-to-end AI solutions more accessible to customers around the world," said Andrej Zdravkovic, senior vice president and chief software officer at AMD. "Our investments in Serbia are a testament to the Balkan region's strong engineering talent, and we are excited to collaborate with HTEC, local universities and the vibrant ecosystem in Belgrade and Nis as we deepen our presence in the region over the coming years."

AMD MI300X Accelerators are Competitive with NVIDIA H100, Crunch MLPerf Inference v4.1

The MLCommons consortium on Wednesday posted MLPerf Inference v4.1 benchmark results for popular AI inferencing accelerators available in the market, across brands that include NVIDIA, AMD, and Intel. AMD's Instinct MI300X accelerators emerged competitive to NVIDIA's "Hopper" H100 series AI GPUs. AMD also used the opportunity to showcase the kind of AI inferencing performance uplifts customers can expect from its next-generation EPYC "Turin" server processors powering these MI300X machines. "Turin" features "Zen 5" CPU cores, sporting a 512-bit FPU datapath, and improved performance in AI-relevant 512-bit SIMD instruction-sets, such as AVX-512, and VNNI. The MI300X, on the other hand, banks on the strengths of its memory sub-system, FP8 data format support, and efficient KV cache management.

The MLPerf Inference v4.1 benchmark focused on the 70 billion-parameter LLaMA2-70B model. AMD's submissions included machines featuring the Instinct MI300X, powered by the current EPYC "Genoa" (Zen 4) and next-gen EPYC "Turin" (Zen 5) processors. The GPUs are backed by AMD's ROCm open-source software stack. The benchmark evaluated inference performance using 24,576 Q&A samples from the OpenORCA dataset, with each sample containing up to 1,024 input and output tokens. Two scenarios were assessed: the offline scenario, focusing on batch processing to maximize throughput in tokens per second, and the server scenario, which simulates real-time queries with strict latency limits (TTFT ≤ 2 seconds, TPOT ≤ 200 ms). This lets you see the chip's mettle in both high-throughput and low-latency queries.
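In the server scenario, a response only counts if it meets both latency bounds quoted above: time to first token (TTFT) within 2 seconds and time per output token (TPOT) within 200 ms. A small sketch of that compliance check, using the stated limits and an illustrative simplification for TPOT:

```python
TTFT_LIMIT_S = 2.0    # time-to-first-token bound (server scenario)
TPOT_LIMIT_S = 0.200  # time-per-output-token bound (server scenario)

def meets_server_slo(ttft_s: float, output_tokens: int, total_s: float) -> bool:
    """Check one query against the server-scenario latency limits.

    TPOT is approximated here as the time after the first token divided by
    the remaining output tokens (a simplification for illustration).
    """
    if ttft_s > TTFT_LIMIT_S:
        return False
    if output_tokens <= 1:
        return True
    tpot_s = (total_s - ttft_s) / (output_tokens - 1)
    return tpot_s <= TPOT_LIMIT_S

print(meets_server_slo(1.5, 101, 16.5))  # True: TPOT works out to 150 ms
print(meets_server_slo(2.5, 101, 12.0))  # False: first token arrived too late
```

The offline scenario drops these per-query bounds entirely, which is why its tokens-per-second figures are typically higher.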

AMD is Becoming a Software Company. Here's the Plan

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable, to share their vision for the future of the company and to get our feedback. On site were prominent AMD leaders, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making big changes to how it approaches technology, shifting its focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.

New AMD ROCm 6.1 Software for Radeon Release Offers More Choices to AI Developers

AMD has unveiled the latest release of its open software, AMD ROCm 6.1.3, marking the next step in its strategy to make ROCm software broadly available across its GPU portfolio, including AMD Radeon desktop GPUs. The new release gives developers broader support for Radeon GPUs to run ROCm AI workloads. "The new AMD ROCm release extends functional parity from data center to desktops, enabling AI research and development on readily available and accessible platforms," said Andrej Zdravkovic, senior vice president at AMD.

Key feature enhancements in this release focus on improving compatibility, accessibility, and scalability, and include:
  • Multi-GPU support to enable building scalable AI desktops for multi-serving, multi-user solutions.
  • Beta-level support for Windows Subsystem for Linux, allowing these solutions to work with ROCm on a Windows OS-based system.
  • TensorFlow Framework support offering more choice for AI development.

AMD Introduces New Radeon PRO W7900 Dual Slot at Computex 2024

In addition to the new Zen 5 Ryzen 9000 series desktop CPUs and Ryzen AI 300 series mobile CPUs, as well as the new Ryzen 5000XT series AM4 socket desktop CPUs and updates to the AMD Instinct AI GPU roadmap, AMD rather quietly announced the new Radeon PRO W7900 Dual Slot workstation graphics card at Computex 2024. While not a completely new product, as it is just a model update of the currently available flagship Radeon PRO W7900 workstation graphics card, it is still a rather important update, since AMD managed to squeeze it into a dual-slot design, which gives it support for multi-GPU setups.

As mentioned, the AMD Radeon PRO W7900 Dual Slot still uses the same Navi 31 GPU with 96 Compute Units (CUs), 96 Ray Accelerators, 192 AI Accelerators, and 6,144 Stream Processors, as well as 48 GB of GDDR6 ECC memory on a 384-bit memory interface, giving it a maximum memory bandwidth of 864 GB/s. It still needs two 8-pin PCIe power connectors and has a Total Board Power (TBP) of 295 W. The card still comes with three DisplayPort 2.1 outputs and one Enhanced Mini DisplayPort 1.2 output. What makes the new Radeon PRO W7900 Dual Slot special is the fact that AMD managed to get it down to a dual-slot design, even with the same blower-fan cooler. Unfortunately, it is not clear if the fan or its profiles are different, but the slimmer design does make it suitable for multi-GPU configurations.
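The 864 GB/s figure is consistent with the card's 384-bit bus: peak bandwidth is the bus width in bytes times the effective per-pin data rate, which works out to 18 Gbps GDDR6 here. A quick check, assuming that standard formula:

```python
def peak_bandwidth_gbps(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: (bus width / 8 bytes) * per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps

# Radeon PRO W7900 Dual Slot: 384-bit GDDR6 at 18 Gbps effective
print(peak_bandwidth_gbps(384, 18.0))  # 864.0 GB/s
```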
