News Posts matching #MI300

Return to Keyword Browsing

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

The battle of AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the gap software-wise with its competitor, NVIDIA. According to the latest report from SemiAnalysis, a research and consultancy firm, they have run a five-month experiment using Instinct MI300X for training and benchmark runs. And the findings were surprising: even with better hardware, AMD's software stack, including ROCm, has massively degraded AMD's performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

AMD Custom Makes CPUs for Azure: 88 "Zen 4" Cores and HBM3 Memory

Microsoft has announced its new Azure HBv5 virtual machines, featuring unique custom hardware made by AMD. CEO Satya Nadella made the announcement during Microsoft Ignite, introducing a custom-designed AMD processor solution that achieves remarkable performance metrics. The new HBv5 virtual machines deliver an extraordinary 6.9 TB/s of memory bandwidth, utilizing four specialized AMD processors equipped with HBM3 technology. This represents an eightfold improvement over existing cloud alternatives and a staggering 20-fold increase compared to previous Azure HBv3 configurations. Each HBv5 virtual machine boasts impressive specifications, including up to 352 AMD EPYC "Zen4" CPU cores capable of reaching 4 GHz peak frequencies. The system provides users with 400-450 GB of HBM3 RAM and features doubled Infinity Fabric bandwidth compared to any previous AMD EPYC server platform. Given that each VM had four CPUs, this yields 88 Zen 4 cores per CPU socket, with 9 GB of memory per core.

The architecture includes 800 Gb/s of NVIDIA Quantum-2 InfiniBand connectivity and 14 TB of local NVMe SSD storage. The development marks a strategic shift in addressing memory performance limitations, which Microsoft identifies as a critical bottleneck in HPC applications. This custom design particularly benefits sectors requiring intensive computational resources, including automotive design, aerospace simulation, weather modeling, and energy research. While the CPU appears custom-designed for Microsoft's needs, it bears similarities to previously rumored AMD processors, suggesting a possible connection to the speculated MI300C chip architecture. The system's design choices, including disabled SMT and single-tenant configuration, clearly focus on optimizing performance for specific HPC workloads. If readers can recall, Intel also made customized Xeons for AWS and their needs, which is normal in the hyperscaler space, given they drive most of the revenue.

TOP500: El Capitan Achieves Top Spot, Frontier and Aurora Follow Behind

The 64th edition of the TOP500 reveals that El Capitan has achieved the top spot and is officially the third system to reach exascale computing after Frontier and Aurora. Both systems have since moved down to No. 2 and No. 3 spots, respectively. Additionally, new systems have found their way onto the Top 10.

The new El Capitan system at the Lawrence Livermore National Laboratory in California, U.S.A., has debuted as the most powerful system on the list with an HPL score of 1.742 EFlop/s. It has 11,039,616 combined CPU and GPU cores and is based on AMD 4th generation EPYC processors with 24 cores at 1.8 GHz and AMD Instinct MI300A accelerators. El Capitan relies on a Cray Slingshot 11 network for data transfer and achieves an energy efficiency of 58.89 GigaFLOPS/watt. This power efficiency rating helped El Capitan achieve No. 18 on the GREEN500 list as well.

AMD Powers El Capitan: The World's Fastest Supercomputer

Today, AMD showcased its ongoing high performance computing (HPC) leadership at Supercomputing 2024 by powering the world's fastest supercomputer for the sixth straight Top 500 list.

The El Capitan supercomputer, housed at Lawrence Livermore National Laboratory (LLNL), powered by AMD Instinct MI300A APUs and built by Hewlett Packard Enterprise (HPE), is now the fastest supercomputer in the world with a High-Performance Linpack (HPL) score of 1.742 exaflops based on the latest Top 500 list. Both El Capitan and the Frontier system at Oak Ridge National Lab claimed numbers 18 and 22, respectively, on the Green 500 list, showcasing the impressive capabilities of the AMD EPYC processors and AMD Instinct GPUs to drive leadership performance and energy efficiency for HPC workloads.

Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the prime members of the OCP project, showed its NVIDIA "Blackwell" GB200 systems for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs featuring one-third of the rack space for computing and two-thirds for cooling. A few days later, Google showed off its smaller GB200 system, and today, Meta is showing off its GB200 system—the smallest of the bunch. To train a dense transformer large language model with 405B parameters and a context window of up to 128k tokens, like the Llama 3.1 405B, Meta must redesign its data center infrastructure to run a distributed training job on two 24,000 GPU clusters. That is 48,000 GPUs used for training a single AI model.

Called "Catalina," it is built on the NVIDIA Blackwell platform, emphasizing modularity and adaptability while incorporating the latest NVIDIA GB200 Grace Blackwell Superchip. To address the escalating power requirements of GPUs, Catalina introduces the Orv3, a high-power rack capable of delivering up to 140kW. The comprehensive liquid-cooled setup encompasses a power shelf supporting various components, including a compute tray, switch tray, the Orv3 HPR, Wedge 400 fabric switch with 12.8 Tbps switching capacity, management switch, battery backup, and a rack management controller. Interestingly, Meta also upgraded its "Grand Teton" system for internal usage, such as deep learning recommendation models (DLRMs) and content understanding with AMD Instinct MI300X. Those are used to inference internal models, and MI300X appears to provide the best performance per Dollar for inference. According to Meta, the computational demand stemming from AI will continue to increase exponentially, so more NVIDIA and AMD GPUs is needed, and we can't wait to see what the company builds.

Dell Technologies Expands PowerEdge Server Series with 5th Generation AMD EPYC Processors

Dell Technologies (NYSE: DELL) expands the world's broadest generative AI (GenAI) solutions portfolio with Dell AI Factory additions tailored for AMD environments. These solutions offer enterprises enhanced AI capabilities, including greater scalability and flexibility, to stay competitive in the evolving technology landscape.

"By integrating AMD technology into the latest Dell servers, AI solutions and services through the Dell AI Factory, we're providing the performance and efficiencies enterprises need today and in the future," said Arthur Lewis, president, Infrastructure Solutions Group, Dell Technologies. "Together with AMD, we are setting new standards in AI performance, giving enterprises powerful and cost-effective solutions essential for modern data-driven environments."

AMD to Unify Gaming "RDNA" and Data Center "CDNA" into "UDNA": Singular GPU Architecture Similar to NVIDIA's CUDA

According to new information from Tom's Hardware, AMD has announced plans to unify its consumer-focused gaming RDNA and data center CDNA graphics architectures into a single, unified design called "UDNA." The announcement was made by AMD's Jack Huynh, Senior Vice President and General Manager of the Computing and Graphics Business Group, at IFA 2024 in Berlin. The goal of the new UDNA architecture is to provide a single focus point for developers so that each optimized application can run on consumer-grade GPU like Radeon RX 7900XTX as well as high-end data center GPU like Instinct MI300. This will create a unification similar to NVIDIA's CUDA, which enables CUDA-focused developers to run applications on everything ranging from laptops to data centers.
Jack HuynhSo, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It's forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving.

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution, designed to meet the evolving demands of AI-driven applications from edge, inference and generative AI the new, unparalleled wave of AI supercomputing. ASUS has proven its expertise lies in striking the perfect balance between hardware and software, including infrastructure and cluster architecture design, server installation, testing, onboarding, remote management and cloud services - positioning the ASUS brand and AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offer comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and 5th Gen NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.

AMD Adds RDNA 4 Generation Navi 44 and MI300X1 GPUs to ROCm Software

AMD has quietly added some interesting codenames to its ROCm hardware support list. The biggest surprise is the appearance of "RDNA 4" and "Navi 44" codenames, hinting at a potential successor to the current RDNA 3 GPU architecture powering AMD's Radeon RX 7000 series graphics cards. The upcoming Radeon RX 8000 series could see Navi 44 SKU with a codename "gfx1200". While details are scarce, the inclusion of RDNA 4 and Navi 44 in the ROCm list suggests AMD is working on a new GPU microarchitecture that could bring significant performance and efficiency gains. While RDNA 4 may be destined for future Radeon gaming GPUs, in the data center GPU compute market, AMD is preparing a CDNA 4 based successors to the MI300 series. However, it appears that we haven't seen all the MI300 variants first. Equally intriguing is the "MI300X1" codename, which appears to reference an upcoming AI-focused accelerator from AMD.

While we wait for more information, we can't decipher whether the Navi 44 GPU SKU is for the high-end or low-end segment. If previous generations are for reference, then the Navi 44 SKU would target the low end of the GPU performance spectrum. The previous generation RDNA 3 had Navi 33 as an entry-level model, whereas the RDNA 2 had a Navi 24 SKU for entry-level GPUs. We have reported on RDNA 4 merely being a "bug correction" generation to fix the perf/Watt curve and offer better efficiency overall. What happens finally, we have to wait and see. AMD could announce more details in its upcoming Computex keynote.

Demand for NVIDIA's Blackwell Platform Expected to Boost TSMC's CoWoS Total Capacity by Over 150% in 2024

NVIDIA's next-gen Blackwell platform, which includes B-series GPUs and integrates NVIDIA's own Grace Arm CPU in models such as the GB200, represents a significant development. TrendForce points out that the GB200 and its predecessor, the GH200, both feature a combined CPU+GPU solution, primarily equipped with the NVIDIA Grace CPU and H200 GPU. However, the GH200 accounted for only approximately 5% of NVIDIA's high-end GPU shipments. The supply chain has high expectations for the GB200, with projections suggesting that its shipments could exceed millions of units by 2025, potentially making up nearly 40 to 50% of NVIDIA's high-end GPU market.

Although NVIDIA plans to launch products such as the GB200 and B100 in the second half of this year, upstream wafer packaging will need to adopt more complex and high-precision CoWoS-L technology, making the validation and testing process time-consuming. Additionally, more time will be required to optimize the B-series for AI server systems in aspects such as network communication and cooling performance. It is anticipated that the GB200 and B100 products will not see significant production volumes until 4Q24 or 1Q25.

Unannounced AMD Instinct MI388X Accelerator Pops Up in SEC Filing

AMD's Instinct family has welcomed a new addition—the MI388X AI accelerator—as discovered in a lengthy regulatory 10K filing (submitted to the SEC). The document reveals that the unannounced SKU—along with the MI250, MI300X and MI300A integrated circuits—cannot be sold to Chinese customers due to updated US trade regulations (new requirements were issued around October 2023). Versal VC2802 and VE2802 FPGA products are also mentioned in the same section. Earlier this month, AMD's Chinese market-specific Instinct MI309 package was deemed to be too powerful for purpose by the US Department of Commerce.

AMD has not published anything about the Instinct MI388X's official specification, and technical details have not emerged via leaks. The "X" tag likely implies that it has been designed for AI and HPC applications, akin to the recently launched MI300X accelerator. The designation of a higher model number could (naturally) point to a potentially more potent spec sheet, although Tom's Hardware posits that MI388X is a semi-custom spinoff of an existing model.

HBM3 Initially Exclusively Supplied by SK Hynix, Samsung Rallies Fast After AMD Validation

TrendForce highlights the current landscape of the HBM market, which as of early 2024, is primarily focused on HBM3. NVIDIA's upcoming B100 or H200 models will incorporate advanced HBM3e, signaling the next step in memory technology. The challenge, however, is the supply bottleneck caused by both CoWoS packaging constraints and the inherently long production cycle of HBM—extending the timeline from wafer initiation to the final product beyond two quarters.

The current HBM3 supply for NVIDIA's H100 solution is primarily met by SK hynix, leading to a supply shortfall in meeting burgeoning AI market demands. Samsung's entry into NVIDIA's supply chain with its 1Znm HBM3 products in late 2023, though initially minor, signifies its breakthrough in this segment.

AMD Stalls on Instinct MI309 China AI Chip Launch Amid US Export Hurdles

According to the latest report from Bloomberg, AMD has hit a roadblock in offering its top-of-the-line AI accelerator in the Chinese market. The newest AI chip is called Instinct MI309, a lower-performance Instinct MI300 variant tailored to meet the latest US export rules for selling advanced chips to China-based entities. However, the Instinct MI309 still appears too powerful to gain unconditional approval from the US Department of Commerce, leaving AMD in need of an export license. Originally, the US Department of Commerce made a rule: Total Processing Performance (TPP) score should not exceed 4800, effectively capping AI performance at 600 FP8 TFLOPS. This rule ensures that processors with slightly lower performance may still be sold to Chinese customers, provided their performance density (PD) is sufficiently low.

However, AMD's latest creation, Instinct MI309, is everything but slow. Based on the powerful Instinct MI300, AMD has not managed to bring it down to acceptable levels to acquire a US export license from the Department of Commerce. It is still unknown which Chinese customer was trying to acquire AMD's Instinct MI309; however, it could be one of the Chinese AI labs trying to get ahold of more training hardware for their domestic models. NVIDIA has employed a similar tactic, selling A800 and H800 chips to China, until the US also ended the export of these chips to China. AI labs located in China can only use domestic hardware, including accelerators from Alibaba, Huawei, and Baidu. Cloud services hosting GPUs in US can still be accessed by Chinese companies, but that is currently under US regulators watchlist.

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for significant rivals like AMD and Intel. If both companies can offer a product that could beat NVIDIA's current H100 and provide a suitable software stack, customers would be willing to jump to their offerings and not wait many months for the anticipated high lead times. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen if AI companies will begin the adoption of non-NVIDIA hardware or if they will remain a loyal customer and agree to the higher lead times of the new Blackwell generation. However, capacity constrain should only be a problem at launch, where the availability should improve from quarter to quarter. As TSMC improves CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of the 3 nm wafers will likely improve over time as the company moves its priority from H100 to B100.

TSMC Plans to Put a Trillion Transistors on a Single Package by 2030

During the recent IEDM conference, TSMC previewed its process roadmap for delivering next-generation chip packages packing over one trillion transistors by 2030. This aligns with similar long-term visions from Intel. Such enormous transistor counts will come through advanced 3D packaging of multiple chipsets. But TSMC also aims to push monolithic chip complexity higher, ultimately enabling 200 billion transistor designs on a single die. This requires steady enhancement of TSMC's planned N2, N2P, N1.4, and N1 nodes, which are slated to arrive between now and the end of the decade. While multi-chipset architectures are currently gaining favor, TSMC asserts both packaging density and raw transistor density must scale up in tandem. Some perspective on the magnitude of TSMC's goals include NVIDIA's 80 billion transistor GH100 GPU—among today's largest chips, excluding wafer-scale designs from Cerebras.

Yet TSMC's roadmap calls for more than doubling that, first with over 100 billion transistor monolithic designs, then eventually 200 billion. Of course, yields become more challenging as die sizes grow, which is where advanced packaging of smaller chiplets becomes crucial. Multi-chip module offerings like AMD's MI300X and Intel's Ponte Vecchio already integrate dozens of tiles, with PVC having 47 tiles. TSMC envisions this expansion to chip packages housing more than a trillion transistors via its CoWoS, InFO, 3D stacking, and many other technologies. While the scaling cadence has recently slowed, TSMC remains confident in achieving both packaging and process breakthroughs to meet future density demands. The foundry's continuous investment ensures progress in unlocking next-generation semiconductor capabilities. But physics ultimately dictates timelines, no matter how aggressive the roadmap.

China Continues to Enhance AI Chip Self-Sufficiency, but High-End AI Chip Development Remains Constrained

Huawei's subsidiary HiSilicon has made significant strides in the independent R&D of AI chips, launching the next-gen Ascend 910B. These chips are utilized not only in Huawei's public cloud infrastructure but also sold to other Chinese companies. This year, Baidu ordered over a thousand Ascend 910B chips from Huawei to build approximately 200 AI servers. Additionally, in August, Chinese company iFlytek, in partnership with Huawei, released the "Gemini Star Program," a hardware and software integrated device for exclusive enterprise LLMs, equipped with the Ascend 910B AI acceleration chip, according to TrendForce's research.

TrendForce conjectures that the next-generation Ascend 910B chip is likely manufactured using SMIC's N+2 process. However, the production faces two potential risks. Firstly, as Huawei recently focused on expanding its smartphone business, the N+2 process capacity at SMIC is almost entirely allocated to Huawei's smartphone products, potentially limiting future capacity for AI chips. Secondly, SMIC remains on the Entity List, possibly restricting access to advanced process equipment.

Supermicro Extends AI and GPU Rack Scale Solutions with Support for AMD Instinct MI300 Series Accelerators

Supermicro, Inc., a Total IT Solution Manufacturer for AI, Cloud, Storage, and 5G/Edge, is announcing three new additions to its AMD-based H13 generation of GPU Servers, optimized to deliver leading-edge performance and efficiency, powered by the new AMD Instinct MI300 Series accelerators. Supermicro's powerful rack scale solutions with 8-GPU servers with the AMD Instinct MI300X OAM configuration are ideal for large model training.

The new 2U liquid-cooled and 4U air-cooled servers with the AMD Instinct MI300A Accelerated Processing Units (APUs) accelerators are available and improve data center efficiencies and power the fast-growing complex demands in AI, LLM, and HPC. The new systems contain quad APUs for scalable applications. Supermicro can deliver complete liquid-cooled racks for large-scale environments with up to 1,728 TFlops of FP64 performance per rack. Supermicro worldwide manufacturing facilities streamline the delivery of these new servers for AI and HPC convergence.

AMD Showcases Growing Momentum for AMD Powered AI Solutions from the Data Center to PCs

Today at the "Advancing AI" event, AMD was joined by industry leaders including Microsoft, Meta, Oracle, Dell Technologies, HPE, Lenovo, Supermicro, Arista, Broadcom and Cisco to showcase how these companies are working with AMD to deliver advanced AI solutions spanning from cloud to enterprise and PCs. AMD launched multiple new products at the event, including the AMD Instinct MI300 Series data center AI accelerators, ROCm 6 open software stack with significant optimizations and new features supporting Large Language Models (LLMs) and Ryzen 8040 Series processors with Ryzen AI.

"AI is the future of computing and AMD is uniquely positioned to power the end-to-end infrastructure that will define this AI era, from massive cloud installations to enterprise clusters and AI-enabled intelligent embedded devices and PCs," said AMD Chair and CEO Dr. Lisa Su. "We are seeing very strong demand for our new Instinct MI300 GPUs, which are the highest-performance accelerators in the world for generative AI. We are also building significant momentum for our data center AI solutions with the largest cloud companies, the industry's top server providers, and the most innovative AI startups ꟷ who we are working closely with to rapidly bring Instinct MI300 solutions to market that will dramatically accelerate the pace of innovation across the entire AI ecosystem."

AMD Delivers Leadership Portfolio of Data Center AI Solutions with AMD Instinct MI300 Series

Today, AMD announced the availability of the AMD Instinct MI300X accelerators - with industry leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inferencing - as well as the AMD Instinct MI300A accelerated processing unit (APU) - combining the latest AMD CDNA 3 architecture and "Zen 4" CPUs to deliver breakthrough performance for HPC and AI workloads.

"AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large scale cloud and enterprise deployments," said Victor Peng, president, AMD. "By leveraging our leadership hardware, software and open ecosystem approach, cloud providers, OEMs and ODMs are bringing to market technologies that empower enterprises to adopt and deploy AI-powered solutions."

GIGABYTE Unveils Next-gen HPC & AI Servers with AMD Instinct MI300 Series Accelerators

GIGABYTE Technology: Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, and IT infrastructure, today announced the GIGABYTE G383-R80 for the AMD Instinct MI300A APU and two GIGABYTE G593 series servers for the AMD Instinct MI300X GPU and AMD EPYC 9004 Series processor. As a testament to the performance of AMD Instinct MI300 Series family of products, the El Capitan supercomputer at Lawrence Livermore National Laboratory uses the MI300A APU to power exascale computing. And these new GIGABYTE servers are the ideal platform to propel discoveries in HPC & AI at exascale.⁠

Marrying of a CPU & GPU: G383-R80
For incredible advancements in HPC there is the GIGABYTE G383-R80 that houses four LGA6096 sockets for MI300A APUs. This chip integrates a CPU that has twenty-four AMD Zen 4 cores with a powerful GPU built with AMD CDNA 3 GPU cores. And the chiplet design shares 128 GB of unified HBM3 memory for impressive performance for large AI models. The G383 server has lots of expansion slots for networking, storage, or other accelerators, with a total of twelve PCIe Gen 5 slots. And in the front of the chassis are eight 2.5" Gen 5 NVMe bays to handle heavy workloads such as real-time big data analytics and latency-sensitive workloads in finance and telecom. ⁠

Manufacturers Anticipate Completion of NVIDIA's HBM3e Verification by 1Q24; HBM4 Expected to Launch in 2026

TrendForce's latest research into the HBM market indicates that NVIDIA plans to diversify its HBM suppliers for more robust and efficient supply chain management. Samsung's HBM3 (24 GB) is anticipated to complete verification with NVIDIA by December this year. The progress of HBM3e, as outlined in the timeline below, shows that Micron provided its 8hi (24 GB) samples to NVIDIA by the end of July, SK hynix in mid-August, and Samsung in early October.

Given the intricacy of the HBM verification process—estimated to take two quarters—TrendForce expects that some manufacturers might learn preliminary HBM3e results by the end of 2023. However, it's generally anticipated that major manufacturers will have definite results by 1Q24. Notably, the outcomes will influence NVIDIA's procurement decisions for 2024, as final evaluations are still underway.

Dell Allegedly Prohibits Sales of High-End Radeon and Instinct MI GPUs in China

AMD's lineup of Radeon and Instinct GPUs, including the flagship RX 7900 XTX/XT, the professional-grade PRO W7900, and the upcoming Instinct MI300, are facing sales prohibitions in China, according to an alleged sales advisory guide from Dell. This restriction mirrors the earlier ban on NVIDIA's RTX 4090, underscoring the increasing export limitations U.S.-based companies face for high-end semiconductor products that could be repurposed for military and strategic applications. Notably, Dell's report lists several AMD Instinct accelerators, which are integral to data center infrastructure, and Radeon GPUs, which are widely used in PCs, indicating the broad impact of the advisory.

The ban includes discrete GPUs like AMD's Radeon RX 7900 XTX and 7900 XT, which, despite their data-center potential, may still be sold under specific "NEC" eligibility. This status allows for continued sales in restricted regions like sales of NVIDIA's RTX 4090. However, the process to secure NEC eligibility is lengthy, potentially leading to supply shortages and increased GPU prices—a trend already observed with the RX 7900 XTX in China, where it's become a high-end alternative in light of the RTX 4090's scarcity and inflated pricing. The Dell sales advisory also lists that sales of the aforementioned products are banned in 22 countries, including Russia, Iran, Iraq, and others listed below.

AMD Brings New AI and Compute Capabilities to Microsoft Customers

Today at Microsoft Ignite, AMD and Microsoft featured how AMD products, including the upcoming AMD Instinct MI300X accelerator, AMD EPYC CPUs and AMD Ryzen CPUs with AI engines, are enabling new services and compute capabilities across cloud and generative AI, Confidential Computing, Cloud Computing and smarter, more intelligent PCs.

"AMD is fostering AI everywhere - from the cloud, to the enterprise and end point devices - all powered by our CPUs, GPUs, accelerators and AI engines," said Vamsi Boppana, Senior Vice President, AI, AMD. "Together with Microsoft and a rapidly growing ecosystem of software and hardware partners, AMD is accelerating innovation to bring the benefits of AI to a broad portfolio of compute engines, with expanding software capabilities."

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a duo of custom-designed silicon made to accelerate AI and excel in cloud workloads. First of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that features a 64-bit Armv9 instruction set, implemented in a cloud-native design that is set to become a part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims that the performance target is up to 40% when compared to the current generation of Arm servers running on Azure cloud. The SoC has used Arm's Neoverse CSS platform customized for Microsoft, with presumably Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's project Athena now has the name of Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8-bit bit, for maximum performance. Currently tested on GPT 3.5 Turbo, we have yet to see performance figures and comparisons with competing hardware from NVIDIA, like H100/H200 and AMD, with MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator, which uses a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.

IT Leaders Optimistic about Ways AI will Transform their Business and are Ramping up Investments

Today, AMD released the findings from a new survey of global IT leaders which found that 3 in 4 IT leaders are optimistic about the potential benefits of AI—from increased employee efficiency to automated cybersecurity solutions—and more than 2 in 3 are increasing investments in AI technologies. However, while AI presents clear opportunities for organizations to become more productive, efficient, and secure, IT leaders expressed uncertainty on their AI adoption timeliness due to their lack of implementation roadmaps and the overall readiness of their existing hardware and technology stack.

AMD commissioned the survey of 2,500 IT leaders across the United States, United Kingdom, Germany, France, and Japan to understand how AI technologies are re-shaping the workplace, how IT leaders are planning their AI technology and related Client hardware roadmaps, and what their biggest challenges are for adoption. Despite some hesitations around security and a perception that training the workforce would be burdensome, it became clear that organizations that have already implemented AI solutions are seeing a positive impact and organizations that delay risk being left behind. Of the organizations prioritizing AI deployments, 90% report already seeing increased workplace efficiency.
Return to Keyword Browsing
Jan 20th, 2025 22:29 EST change timezone

New Forum Posts

Popular Reviews

Controversial News Posts