News Posts matching #LLM


Western Digital Enterprise SSDs Certified to Support NVIDIA GB200 NVL72 System for Compute-Intensive AI Environments

Western Digital Corp. today announced that its PCIe Gen 5 DC SN861 E.1S enterprise-class NVMe SSDs have been certified to support the NVIDIA GB200 NVL72 rack-scale system.

The rapid rise of AI, ML, and large language models (LLMs) confronts companies with two opposing forces: data generation and consumption are accelerating, while organizations face mounting pressure to derive value from that data quickly. Performance, scalability, and efficiency are essential for AI technology stacks as storage demands rise. Certified as compatible with the GB200 NVL72 system, Western Digital's enterprise SSD addresses the AI market's growing need for high-speed accelerated computing combined with low latency to serve compute-intensive AI environments.

Lenovo Accelerates Business Transformation with New ThinkSystem Servers Engineered for Optimal AI and Powered by AMD

Today, Lenovo announced its industry-leading ThinkSystem infrastructure solutions powered by AMD EPYC 9005 Series processors, as well as AMD Instinct MI325X accelerators. Backed by 225 of AMD's world-record performance benchmarks, the Lenovo ThinkSystem servers deliver an unparalleled combination of AMD technology-based performance and efficiency to tackle today's most demanding edge-to-cloud workloads, including AI training, inferencing and modeling.

"Lenovo is helping organizations of all sizes and across various industries achieve AI-powered business transformations," said Vlad Rozanovich, Senior Vice President, Lenovo Infrastructure Solutions Group. "Not only do we deliver unmatched performance, we offer the right mix of solutions to change the economics of AI and give customers faster time-to-value and improved total value of ownership."

Supermicro Currently Shipping Over 100,000 GPUs Per Quarter in its Complete Rack Scale Liquid Cooled Servers

Supermicro, Inc., a Total IT Solution Provider for Cloud, AI/ML, Storage, and 5G/Edge, is announcing a complete liquid cooling solution that includes powerful Coolant Distribution Units (CDUs), cold plates, Coolant Distribution Manifolds (CDMs), cooling towers, and end-to-end management software. This complete solution reduces ongoing power costs as well as day-0 hardware acquisition and data center cooling infrastructure costs. The entire end-to-end, data center-scale liquid cooling solution is available directly from Supermicro.

"Supermicro continues to innovate, delivering full data center plug-and-play rack scale liquid cooling solutions," said Charles Liang, CEO and president of Supermicro. "Our complete liquid cooling solutions, including SuperCloud Composer for the entire life-cycle management of all components, are now cooling massive, state-of-the-art AI factories, reducing costs and improving performance. The combination of Supermicro deployment experience and delivering innovative technology is resulting in data center operators coming to Supermicro to meet their technical and financial goals for both the construction of greenfield sites and the modernization of existing data centers. Since Supermicro supplies all the components, the time to deployment and online are measured in weeks, not months."

Intel Updates "AI Playground" Application for Local AI Models with "Lunar Lake" Support

Intel has announced the release of an updated version of its AI Playground application, now optimized for the new Intel Core Ultra 200V "Lunar Lake" series of processors. This latest iteration, version 1.21b, brings a host of new features and improvements designed to make AI more accessible to users of Intel's AI-enabled PCs. AI Playground, first launched earlier this year, offers a user-friendly interface for various AI functions, including image generation, enhancement, and natural language processing. The new version introduces several key enhancements. These include a fresh, exclusive theme for 200V series processor users, an expanded LLM picker now featuring Phi3, Qwen2, and Mistral models, and a conversation manager for saving and revisiting chat discussions. Additionally, users will find adjustable font sizes for improved readability and a simplified aspect ratio tool for image creation and enhancement.

One of the most significant aspects of AI Playground is its ability to run entirely locally on the user's machine. This approach ensures that all computations, prompts, and outputs remain on the device, addressing privacy concerns often associated with cloud-based AI services. The application is optimized to take advantage of the Xe Cores and XMX AI engines found in the Intel Core Ultra 200V series processors, allowing even lightweight devices to perform complex AI tasks efficiently. Intel has also improved the installation process, addressing potential conflicts and providing better error handling. The company encourages user engagement through its Intel Insiders Discord channel, fostering a community around AI Playground's development and use. Although the models users can run locally are smaller, usually up to 7 billion parameters with 8-bit or 4-bit quantization, having a centralized application that runs them locally is a significant step toward embedding AI in all aspects of personal computing.
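That 7-billion-parameter ceiling follows directly from memory arithmetic. As a back-of-envelope sketch (not Intel's sizing method; the 20% overhead factor for KV cache and activations is an assumption), the footprint of a quantized model can be estimated from parameter count and bit width:

```python
# Back-of-envelope VRAM estimate for a quantized LLM (illustrative only).
def model_vram_gb(params_billions: float, bits_per_weight: int,
                  overhead: float = 1.2) -> float:
    """Approximate footprint in GB: weights plus ~20% for KV cache and
    activations (the overhead factor is an assumption, not a measurement)."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: ~{model_vram_gb(7, bits):.1f} GB")
# ~16.8 GB at 16-bit, ~8.4 GB at 8-bit, ~4.2 GB at 4-bit: 8-bit and 4-bit
# quantization are what let 7B-class models fit on lightweight devices.
```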

Advantech Launches AIR-310, Ultra-Low-Profile Scalable AI Inference System

Advantech, a leading provider of edge computing solutions, introduces the AIR-310, a compact edge AI inference system featuring an MXM GPU card. Powered by 12th/13th/14th Gen Intel Core 65 W desktop processors, the AIR-310 delivers up to 12.99 TFLOPS of scalable AI performance via the NVIDIA Quadro 2000A GPU card in a 1.5U chassis (215 x 225 x 55 mm). Despite its compact size, it offers versatile connectivity with three LAN ports and four USB 3.0 ports, enabling seamless integration of sensors and cameras for vision AI applications.

The system includes smart fan management, operates in temperatures from 0 to 50°C (32 to 122°F), and is shock-resistant, capable of withstanding 3G vibration and 30G shock. Bundled with Intel Arc A370 and NVIDIA A2000 GPUs, it is certified to IEC 61000-6-2, IEC 61000-6-4, and CB/UL standards, ensuring stable 24/7 operation in harsh environments, including space-constrained or mobile equipment. The AIR-310 supports Windows 11, Linux Ubuntu 24.04, and the Edge AI SDK, enabling accelerated inference deployment for applications such as factory inspections, real-time video surveillance, GenAI/LLM, and medical imaging.

AMD Instinct MI300X Accelerators Available on Oracle Cloud Infrastructure

AMD today announced that Oracle Cloud Infrastructure (OCI) has chosen AMD Instinct MI300X accelerators with ROCm open software to power its newest OCI Compute Supercluster instance called BM.GPU.MI300X.8. For AI models that can comprise hundreds of billions of parameters, the OCI Supercluster with AMD MI300X supports up to 16,384 GPUs in a single cluster by harnessing the same ultrafast network fabric technology used by other accelerators on OCI. Designed to run demanding AI workloads, including large language model (LLM) inference and training that require high throughput along with leading memory capacity and bandwidth, these OCI bare metal instances have already been adopted by companies including Fireworks AI.

"AMD Instinct MI300X and ROCm open software continue to gain momentum as trusted solutions for powering the most critical OCI AI workloads," said Andrew Dieckmann, corporate vice president and general manager, Data Center GPU Business, AMD. "As these solutions expand further into growing AI-intensive markets, the combination will benefit OCI customers with high performance, efficiency, and greater system design flexibility."

SK hynix Presents Upgraded AiMX Solution at AI Hardware and Edge AI Summit 2024

SK hynix unveiled an enhanced Accelerator-in-Memory based Accelerator (AiMX) card at the AI Hardware & Edge AI Summit 2024 held September 9-12 in San Jose, California. Organized annually by Kisaco Research, the summit brings together representatives from the AI and machine learning ecosystem to share industry breakthroughs and developments. This year's event focused on exploring cost and energy efficiency across the entire technology stack. Marking its fourth appearance at the summit, SK hynix highlighted how its AiM products can boost AI performance across data centers and edge devices.

Booth Highlights: Meet the Upgraded AiMX
In the AI era, high-performance memory products are vital for the smooth operation of LLMs. However, as these LLMs are trained on increasingly larger datasets and continue to expand, there is a growing need for more efficient solutions. SK hynix addresses this demand with its processing-in-memory (PIM) product AiMX, an AI accelerator card that combines multiple GDDR6-AiMs to provide high bandwidth and outstanding energy efficiency. At the AI Hardware & Edge AI Summit 2024, SK hynix presented its updated 32 GB AiMX prototype, which offers double the capacity of the original card featured at last year's event. To highlight the new AiMX's advanced processing capabilities in a multi-batch environment, SK hynix held a demonstration of the prototype card with the Llama 3 70B model, an open-source LLM. In particular, the demonstration underlined AiMX's ability to serve as a highly effective attention accelerator in data centers.
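The announcement doesn't spell out why attention specifically benefits from processing-in-memory, but a rough KV-cache calculation makes the motivation clear. The sketch below uses the published Llama 3 70B architecture (80 layers, 8 grouped-query KV heads, head dimension 128) with batch and context sizes chosen for illustration, not SK hynix figures:

```python
# Rough per-token KV-cache size for Llama 3 70B at FP16 (illustrative).
layers, kv_heads, head_dim, bytes_per_val = 80, 8, 128, 2
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_val  # K and V
print(f"KV cache per token: ~{kv_bytes_per_token / 1024:.0f} KB")  # ~320 KB

# Assumed serving scenario: 4,096-token contexts, 32 concurrent requests.
# Every decode step must stream the entire cache past the compute units:
context, batch = 4096, 32
print(f"Cache read per step: ~{kv_bytes_per_token * context * batch / 1e9:.0f} GB")
# ~43 GB of traffic per generated token across the batch, with very little
# arithmetic per byte - exactly the memory-bound pattern a PIM device targets.
```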

SambaNova Launches Fastest AI Platform Based on Its SN40L Chip

SambaNova Systems, provider of the fastest and most efficient chips and AI models, announced SambaNova Cloud, the world's fastest AI inference service enabled by the speed of its SN40L AI chip. Developers can log on for free via an API today — no waiting list — and create their own generative AI applications using both the largest and most capable model, Llama 3.1 405B, and the lightning-fast Llama 3.1 70B. SambaNova Cloud runs Llama 3.1 70B at 461 tokens per second (t/s) and 405B at 132 t/s at full precision.
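Those decode rates imply enormous weight-streaming bandwidth at 16-bit precision. As a simplified sanity check (batch size 1, KV-cache traffic ignored, every weight read once per token - assumptions of ours, not SambaNova's methodology):

```python
# Minimum bandwidth implied by a single-stream decode rate at 16-bit
# precision (simplified: each token reads every weight exactly once).
def required_tbps(params_billions: float, tokens_per_s: float,
                  bytes_per_param: int = 2) -> float:
    return params_billions * 1e9 * bytes_per_param * tokens_per_s / 1e12

print(f"Llama 3.1 70B @ 461 t/s:  ~{required_tbps(70, 461):.0f} TB/s")
print(f"Llama 3.1 405B @ 132 t/s: ~{required_tbps(405, 132):.0f} TB/s")
# ~65 TB/s and ~107 TB/s respectively - far beyond a single accelerator's
# memory system, which is why such rates demand aggregating the bandwidth
# of many chips (or large on-chip SRAM).
```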

"SambaNova Cloud is the fastest API service for developers. We deliver world record speed and in full 16-bit precision - all enabled by the world's fastest AI chip," said Rodrigo Liang, CEO of SambaNova Systems. "SambaNova Cloud is bringing the most accurate open source models to the vast developer community at speeds they have never experienced before."

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
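The two parameter figures are consistent with Mixtral's standard top-2 routing over 8 experts, and the split between shared and per-expert weights can be recovered with simple algebra. A sketch (the decomposition is approximate and ours, not an official breakdown):

```python
# Decompose Mixtral 8x7B's parameter counts (illustrative algebra).
# total  = shared + 8 * expert  (all experts are stored)
# active = shared + 2 * expert  (top-2 routing per token)
total_b, active_b, n_experts, top_k = 46.7, 12.9, 8, 2

expert_b = (total_b - active_b) / (n_experts - top_k)  # per-expert FFN params
shared_b = total_b - n_experts * expert_b              # attention, embeddings
print(f"per-expert: ~{expert_b:.1f}B, shared: ~{shared_b:.1f}B")
# Roughly 5.6B per expert and 1.6B shared: each token touches only ~28% of
# the stored weights, which is where the speedup over a dense model of
# similar total size comes from.
```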

Cerebras Launches the World's Fastest AI Inference

Today, Cerebras Systems, the pioneer in high performance AI compute, announced Cerebras Inference, the fastest AI inference solution in the world. Delivering 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, Cerebras Inference is 20 times faster than NVIDIA GPU-based solutions in hyperscale clouds. Starting at just 10c per million tokens, Cerebras Inference is priced at a fraction of GPU solutions, providing 100x higher price-performance for AI workloads.

Unlike alternative approaches that compromise accuracy for performance, Cerebras offers the fastest performance while maintaining state-of-the-art accuracy by staying in the 16-bit domain for the entire inference run. Cerebras Inference is priced at a fraction of GPU-based competitors, with pay-as-you-go pricing of 10 cents per million tokens for Llama 3.1 8B and 60 cents per million tokens for Llama 3.1 70B.
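To put those rates in concrete terms, here is a minimal cost sketch; the monthly token volume is an assumed workload chosen for illustration, not a Cerebras figure:

```python
# Monthly bill for a hypothetical workload at the quoted pay-as-you-go rates.
PRICE_PER_M_TOKENS = {"Llama 3.1 8B": 0.10, "Llama 3.1 70B": 0.60}  # USD

monthly_tokens = 5e9  # assumption: 5 billion tokens/month (illustrative)
for model, price in PRICE_PER_M_TOKENS.items():
    print(f"{model}: ${monthly_tokens / 1e6 * price:,.0f}/month")
# 5B tokens/month works out to $500/month on 8B and $3,000/month on 70B.
```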

FuriosaAI Unveils RNGD Power-Efficient AI Processor at Hot Chips 2024

Today at Hot Chips 2024, FuriosaAI is pulling back the curtain on RNGD (pronounced "Renegade"), our new AI accelerator designed for high-performance, highly efficient large language model (LLM) and multimodal model inference in data centers. As part of his Hot Chips presentation, Furiosa co-founder and CEO June Paik is sharing technical details and providing the first hands-on look at the fully functioning RNGD card.

With a TDP of 150 watts, a novel chip architecture, and advanced memory technology like HBM3, RNGD is optimized for inference with demanding LLMs and multimodal models. It's built to deliver high performance, power efficiency, and programmability all in a single product - a trifecta that the industry has struggled to achieve in GPUs and other AI chips.

AMD Completes Acquisition of Silo AI

AMD today announced the completion of its acquisition of Silo AI, the largest private AI lab in Europe. The all-cash transaction valued at approximately $665 million furthers the company's commitment to deliver end-to-end AI solutions based on open standards and in strong partnership with the global AI ecosystem. Silo AI brings a team of world-class AI scientists and engineers to AMD experienced in developing cutting-edge AI models, platforms and solutions for large enterprise customers including Allianz, Philips, Rolls-Royce and Unilever. Their expertise spans diverse markets and they have created state-of-the-art open source multilingual Large Language Models (LLMs) including Poro and Viking on AMD platforms. The Silo AI team will join the AMD Artificial Intelligence Group (AIG), led by AMD Senior Vice President Vamsi Boppana.

"AI is our number one strategic priority, and we continue to invest in both the talent and software capabilities to support our growing customer deployments and roadmaps," said Vamsi Boppana, AMD senior vice president, AIG. "The Silo AI team has developed state-of-the-art language models that have been trained at scale on AMD Instinct accelerators and they have broad experience developing and integrating AI models to solve critical problems for end customers. We expect their expertise and software capabilities will directly improve the experience for customers in delivering the best performing AI solutions on AMD platforms."

AI SSD Procurement Capacity Estimated to Exceed 45 EB in 2024; NAND Flash Suppliers Accelerate Process Upgrades

TrendForce's latest report on enterprise SSDs reveals that a surge in demand for AI has led AI server customers to significantly increase their orders for enterprise SSDs over the past two quarters. Upstream suppliers have been accelerating process upgrades and planning for 2YY-layer products—slated to enter mass production in 2025—in order to meet the growing demand for SSDs in AI applications.

TrendForce observes that increased orders for enterprise SSDs from AI server customers have resulted in contract prices for this category rising by over 80% from 4Q23 to 3Q24. SSDs play a crucial role in AI development. In AI model training, SSDs primarily store model parameters, including evolving weights and biases.
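A rough sense of scale helps explain the volume: a training checkpoint holds optimizer state as well as weights. Under a common mixed-precision Adam rule of thumb of ~16 bytes per parameter (FP16 weights and gradients plus FP32 master weights and two FP32 moment estimates - a convention, not a TrendForce figure), checkpoints for frontier models run to terabytes each:

```python
# Rough checkpoint size for mixed-precision training with Adam (illustrative).
# Rule of thumb: ~16 bytes/parameter (fp16 weights + fp16 grads
# + fp32 master weights + two fp32 Adam moment estimates).
def checkpoint_tb(params_billions: float, bytes_per_param: int = 16) -> float:
    return params_billions * 1e9 * bytes_per_param / 1e12

for p in (70, 175, 405):
    print(f"{p}B parameters: ~{checkpoint_tb(p):.1f} TB per checkpoint")
# ~1.1 TB, ~2.8 TB, and ~6.5 TB. Frequent checkpointing across many training
# runs is one driver of bulk enterprise SSD procurement.
```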

Intel Announces Arc A760A Automotive-grade GPU

In a strategic move to empower automakers with groundbreaking opportunities, Intel unveiled its first discrete graphics processing unit (dGPU), the Intel Arc Graphics for Automotive, at its AI Cockpit Innovation Experience event. To advance automotive AI, the product will be commercially deployed in vehicles as soon as 2025, accelerating automobile technology and unlocking a new era of AI-driven cockpit experiences and enhanced personalization for manufacturers and drivers alike.

Intel's entry into automotive discrete GPUs addresses growing demand for compute power in increasingly sophisticated vehicle cockpits. By adding Intel Arc Graphics for Automotive to its existing portfolio of AI-enhanced software-defined vehicle (SDV) system-on-chips (SoCs), Intel offers automakers an open, flexible, and scalable platform solution that brings next-level, high-fidelity experiences to the vehicle.

Intel Releases AI Playground, a Unified Generative AI and Chat App for Intel Arc GPUs

Intel on Monday rolled out the first public release of AI Playground, an AI productivity suite the company showcased in its 2024 Computex booth. AI Playground is a well-packaged suite of generative AI applications and a chatbot, designed to leverage Intel Arc discrete GPUs with at least 8 GB of video memory. All utilities in the suite are built on the OpenVINO framework and take advantage of the XMX cores of Arc A-series discrete GPUs. Currently, only three GPU models from the lineup come with 8 GB or more of video memory - the A770, A750, and A580 - along with their mobile variants. The company is working on a variant of the suite that can work on Intel Core Ultra-H series processors, where it uses a combination of the NPU and the iGPU for acceleration. AI Playground is open source. Intel put effort into making the suite as user-friendly as possible, giving it a packaged installer that handles installation of all software dependencies.

Intel AI Playground's tools include an image generative AI that can turn prompts into standard or HD images, based on Stable Diffusion backed by the DreamShaper 8 and Juggernaut XL models. It also supports Phi3, LCM LoRA, and LCM LoRA SDXL. All of these have been optimized for acceleration on Arc "Alchemist" GPUs. The suite also includes an AI image enhancement utility that can be used for upscaling along with detail reconstruction, styling, inpainting and outpainting, and certain kinds of image manipulation. The third major tool is a text AI chatbot that supports popular LLMs.

DOWNLOAD: Intel AI Playground

Tenstorrent Launches Next Generation Wormhole-based Developer Kits and Workstations

Tenstorrent is launching its next-generation Wormhole chip in PCIe cards and workstations designed for developers interested in scalable multi-chip development using Tenstorrent's powerful open-source software stacks.

These Wormhole-based cards and systems are now available for immediate order on tenstorrent.com:
  • Wormhole n150, powered by a single processor
  • Wormhole n300, powered by two processors
  • TT-LoudBox, a developer workstation powered by four Wormhole n300s (eight processors)

Gigabyte AI TOP Utility Reinventing Your Local AI Fine-tuning

GIGABYTE TECHNOLOGY Co. Ltd, a leading manufacturer of motherboards, graphics cards, and hardware solutions, released its exclusive AI TOP Utility. With redesigned workflows, a user-friendly interface, and real-time progress monitoring, AI TOP Utility reinvents local AI model training and fine-tuning. It features a variety of groundbreaking technologies that beginners and experts alike can easily adopt for most common open-source LLMs, anywhere - even on a desktop.

GIGABYTE AI TOP is an all-round solution for local AI model fine-tuning. Running training and fine-tuning locally on sensitive data can provide greater privacy and security, along with maximum flexibility and real-time adjustment. By combining GIGABYTE AI TOP hardware with the AI TOP Utility, the common constraint of insufficient GPU VRAM during local fine-tuning can be addressed. With GIGABYTE AI TOP series motherboards, PSUs, and SSDs, as well as a GIGABYTE graphics card lineup covering the NVIDIA GeForce RTX 40 Series, AMD Radeon RX 7900 Series, and Radeon PRO W7900 and W7800 series, open-source LLM fine-tuning can now scale up to 236B parameters and beyond.
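GIGABYTE doesn't detail the mechanism in this announcement, but simple memory arithmetic shows why some form of offloading across VRAM, system DRAM, and SSDs is needed at this scale; the byte counts below are standard rules of thumb and the 48 GB card is an assumed example, not GIGABYTE's published method:

```python
# Why local fine-tuning of a 236B-parameter model needs offload (illustrative).
params_b = 236      # model size in billions of parameters
vram_gb = 48        # assumption: a 48 GB card such as a Radeon PRO W7900

weights_gb = params_b * 2       # fp16 weights, ~2 bytes per parameter
optimizer_gb = params_b * 12    # fp32 master weights + Adam moments, ~12 B/param

print(f"fp16 weights alone: ~{weights_gb} GB vs {vram_gb} GB of VRAM")
print(f"optimizer state adds: ~{optimizer_gb} GB")
# ~472 GB of weights and ~2.8 TB more of optimizer state dwarf any single
# GPU, so the working set must be tiered across VRAM, system DRAM, and fast
# SSDs - the role the AI TOP motherboard/memory/SSD combination plays.
```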

HP is Betting on AI for their Notebooks and Desktops

HP Inc. today introduced two new innovations—the world's highest performance AI PC and the first integration of a trust framework into an AI model development platform. Both announcements expand HP's efforts to make AI real for companies and people with new and transformative AI experiences across the company's PCs, software, and partner ecosystem.

HP is empowering everyone, from corporate knowledge workers to freelancers and students, to unlock the power of AI. Users can connect with anyone in the world with real-time translation across 40 languages, become master presenters with their personal communication coach, and quickly create videos like a pro.

AMD to Acquire Silo AI to Expand Enterprise AI Solutions Globally

AMD today announced the signing of a definitive agreement to acquire Silo AI, the largest private AI lab in Europe, in an all-cash transaction valued at approximately $665 million. The agreement represents another significant step in the company's strategy to deliver end-to-end AI solutions based on open standards and in strong partnership with the global AI ecosystem. The Silo AI team consists of world-class AI scientists and engineers with extensive experience developing tailored AI models, platforms and solutions for leading enterprises spanning cloud, embedded and endpoint computing markets.

Silo AI CEO and co-founder Peter Sarlin will continue to lead the Silo AI team as part of the AMD Artificial Intelligence Group, reporting to AMD senior vice president Vamsi Boppana. The acquisition is expected to close in the second half of 2024.

Gigabyte Launches AMD Radeon PRO W7000 Series Graphics Cards

GIGABYTE TECHNOLOGY Co. Ltd, a leading manufacturer of premium gaming hardware, today launched the AMD Radeon PRO W7000 series workstation graphics cards, including the flagship GIGABYTE Radeon PRO W7900 Dual Slot AI TOP 48G as well as the GIGABYTE Radeon PRO W7800 AI TOP 32G. Powered by the AMD RDNA 3 architecture, these graphics cards offer a massive 48 GB and 32 GB of GDDR6 memory, respectively, delivering cutting-edge performance and exceptional experiences for workstation professionals, creators, and AI developers.

GIGABYTE stands as the AMD professional graphics partner in the market, with a proven ability to design and manufacture the entire Radeon PRO series. Our dedication to quality products, unwavering business commitment, and comprehensive customer service empower us to deliver professional-grade GPU solutions, expanding users' choices in workstation and AI computing.

NVIDIA MLPerf Training Results Showcase Unprecedented Performance and Elasticity

The full-stack NVIDIA accelerated computing platform has once again demonstrated exceptional performance in the latest MLPerf Training v4.0 benchmarks. NVIDIA more than tripled the performance on the large language model (LLM) benchmark, based on GPT-3 175B, compared to the record-setting NVIDIA submission made last year. Using Eos, an AI supercomputer featuring 11,616 NVIDIA H100 Tensor Core GPUs connected with NVIDIA Quantum-2 InfiniBand networking, NVIDIA achieved this remarkable feat through larger scale - more than triple that of the 3,584 H100 GPU submission a year ago - and extensive full-stack engineering.

Thanks to the scalability of the NVIDIA AI platform, Eos can now train massive AI models like GPT-3 175B even faster, and this great AI performance translates into significant business opportunities. For example, in NVIDIA's recent earnings call, we described how LLM service providers can turn a single dollar invested into seven dollars in just four years running the Llama 3 70B model on NVIDIA HGX H200 servers. This return assumes an LLM service provider serving Llama 3 70B at $0.60/M tokens, with an HGX H200 server throughput of 24,000 tokens/second.
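The dollar figures in that claim can be sanity-checked with straightforward arithmetic from the two stated numbers; the continuous-utilization assumption below is ours, for illustration:

```python
# Sanity-check the $1-in/$7-out claim from the two stated figures.
tokens_per_s = 24_000      # stated HGX H200 server throughput
price_per_m = 0.60         # stated serving price, USD per million tokens
years = 4
utilization = 1.0          # assumption: continuous serving (illustrative)

seconds = years * 365 * 24 * 3600 * utilization
revenue = tokens_per_s * seconds * price_per_m / 1e6
print(f"4-year token revenue per server: ~${revenue / 1e6:.2f}M")
# ~$1.82M of revenue; a 7:1 return then implies an all-in server cost of
# roughly $260k, a plausible figure for an 8-GPU HGX-class system.
```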

Intel Submits Gaudi 2 Results on MLCommons' Newest Benchmark

Today, MLCommons published results of its industry AI performance benchmark, MLPerf Training v4.0. Intel's results demonstrate the choice that Intel Gaudi 2 AI accelerators give enterprises and customers. Community-based software simplifies generative AI (GenAI) development, and industry-standard Ethernet networking enables flexible scaling of AI systems. For the first time on the MLPerf benchmark, Intel submitted results on a large Gaudi 2 system (1,024 Gaudi 2 accelerators) trained in the Intel Tiber Developer Cloud to demonstrate Gaudi 2 performance and scalability, as well as Intel's cloud capacity for training MLPerf's GPT-3 175B parameter benchmark model.

"The industry has a clear need: address the gaps in today's generative AI enterprise offerings with high-performance, high-efficiency compute options. The latest MLPerf results published by MLCommons illustrate the unique value Intel Gaudi brings to market as enterprises and customers seek more cost-efficient, scalable systems with standard networking and open software, making GenAI more accessible to more customers," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.

SK hynix Showcases Its Next-Gen Solutions at Computex 2024

SK hynix presented its leading AI memory solutions at COMPUTEX Taipei 2024 from June 4-7. As one of Asia's premier IT shows, COMPUTEX Taipei 2024 welcomed around 1,500 global participants including tech companies, venture capitalists, and accelerators under the theme "Connecting AI". Making its debut at the event, SK hynix underlined its position as a first mover and leading AI memory provider through its lineup of next-generation products.

"Connecting AI" With the Industry's Finest AI Memory Solutions
Themed "Memory, The Power of AI," SK hynix's booth featured its advanced AI server solutions, groundbreaking technologies for on-device AI PCs, and outstanding consumer SSD products. HBM3E, the fifth generation of HBM1, was among the AI server solutions on display. Offering industry-leading data processing speeds of 1.18 terabytes (TB) per second, vast capacity, and advanced heat dissipation capability, HBM3E is optimized to meet the requirements of AI servers and other applications. Another technology which has become crucial for AI servers is CXL as it can increase system bandwidth and processing capacity. SK hynix highlighted the strength of its CXL portfolio by presenting its CXL Memory Module-DDR5 (CMM-DDR5), which significantly expands system bandwidth and capacity compared to systems only equipped with DDR5. Other AI server solutions on display included the server DRAM products DDR5 RDIMM and MCR DIMM. In particular, SK hynix showcased its tall 128-gigabyte (GB) MCR DIMM for the first time at an exhibition.

New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations. Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and infer complex AI models on Windows natively. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs. These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.
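For developers wondering what the ORT-plus-DirectML path looks like in code, here is a minimal sketch; it assumes the onnxruntime-directml package is installed, "model.onnx" is a hypothetical local file, and the model takes a float32 input - a generic usage pattern, not NVIDIA's or Microsoft's sample code:

```python
# Minimal sketch: run an ONNX model on the GPU via DirectML.
# Assumes: pip install onnxruntime-directml; "model.onnx" is hypothetical.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider"],  # DirectML execution provider
)

inp = session.get_inputs()[0]
# Replace dynamic dimensions (strings/None) with 1 to build a dummy input.
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
dummy = np.zeros(shape, dtype=np.float32)  # assumes a float32 model input

outputs = session.run(None, {inp.name: dummy})
print("active providers:", session.get_providers())
```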

AMD Instinct MI300X Accelerators Power Microsoft Azure OpenAI Service Workloads and New Azure ND MI300X V5 VMs

Today at Microsoft Build, AMD (NASDAQ: AMD) showcased its latest end-to-end compute and software capabilities for Microsoft customers and developers. By using AMD solutions such as AMD Instinct MI300X accelerators, ROCm open software, Ryzen AI processors and software, and Alveo MA35D media accelerators, Microsoft is able to provide a powerful suite of tools for AI-based deployments across numerous markets. The new Microsoft Azure ND MI300X virtual machines (VMs) are now generally available, giving customers like Hugging Face access to impressive performance and efficiency for their most demanding AI workloads.

"The AMD Instinct MI300X and ROCm software stack is powering the Azure OpenAI Chat GPT 3.5 and 4 services, which are some of the world's most demanding AI workloads," said Victor Peng, president, AMD. "With the general availability of the new VMs from Azure, AI customers have broader access to MI300X to deliver high-performance and efficient solutions for AI applications."