News Posts matching #H100

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

Gigabyte Unveils Comprehensive and Powerful AI Platforms at NVIDIA GTC

GIGABYTE Technology and Giga Computing, a subsidiary of GIGABYTE and an industry leader in enterprise solutions, will showcase their solutions at the GIGABYTE booth #1224 at NVIDIA GTC, a global AI developer conference running through March 21. This event will offer GIGABYTE the chance to connect with its valued partners and customers, and together explore what the future in computing holds.

The GIGABYTE booth will focus on GIGABYTE's enterprise products that demonstrate AI training and inference delivered by versatile computing platforms based on NVIDIA solutions, as well as direct liquid cooling (DLC) for improved compute density and energy efficiency. Also not to be missed at the NVIDIA booth is the MGX Pavilion, which features a rack of GIGABYTE servers for the NVIDIA GH200 Grace Hopper Superchip architecture.

TSMC Reportedly Investing $16 Billion into New CoWoS Facilities

TSMC is experiencing unprecedented demand from AI chip customers; unnamed parties have (fancifully) requested the construction of entirely new fabrication facilities. Taiwan's leading semiconductor contract manufacturer seems to be concentrating on "sensible" expansions, mainly in the area of CoWoS packaging output. According to an Economic Daily report, company leadership and local government were negotiating over the construction of four new advanced packaging plants. Insiders propose that plans have been revised: an investment in excess of NT$500 billion ($16 billion) will enable the founding of six new CoWoS-focused facilities. TSMC is expected to make an official announcement next month, and industry moles reckon that construction work will start in April. Two (of the six total) advanced packaging plants could become fully operational before the conclusion of 2024.

Lately, TSMC has initiated an ambitious recruitment drive—targeting around 6000 new workers. A touring entity is tasked with the attraction of "talents with high enthusiasm for semiconductors." The majority of new recruits are likely heading to new or expanded Taiwan-based facilities. The Economic Daily report proposes that Chiayi City's technological hub will play host to TSMC's new CoWoS packaging plants. A DigiTimes Asia news piece (from January) posited that TSMC leadership anticipates CoWoS output reaching 44,000 units by the end of 2024. This predicted tally could grow, thanks to the (rumored) activation of additional factories. CoWoS packaging is considered to be a vital aspect of AI accelerators—insiders believe that TSMC's latest investment will boost production of NVIDIA H100 GPUs. The combined output of six new CoWoS plants will assist greatly in the creation of next-gen B100 chips.

HBM3 Initially Exclusively Supplied by SK Hynix, Samsung Rallies Fast After AMD Validation

TrendForce highlights the current landscape of the HBM market, which, as of early 2024, is primarily focused on HBM3. NVIDIA's upcoming B100 or H200 models will incorporate advanced HBM3e, signaling the next step in memory technology. The challenge, however, is the supply bottleneck caused by both CoWoS packaging constraints and the inherently long production cycle of HBM, which extends the timeline from wafer initiation to the final product beyond two quarters.

The current HBM3 supply for NVIDIA's H100 solution is primarily met by SK hynix, leading to a supply shortfall in meeting burgeoning AI market demands. Samsung's entry into NVIDIA's supply chain with its 1Znm HBM3 products in late 2023, though initially minor, signifies its breakthrough in this segment.

Intel Gaudi2 Accelerator Beats NVIDIA H100 at Stable Diffusion 3 by 55%

Stability AI, the developers behind the popular Stable Diffusion generative AI model, have run some first-party performance benchmarks for Stable Diffusion 3 using popular data-center AI GPUs, including the NVIDIA H100 "Hopper" 80 GB, A100 "Ampere" 80 GB, and Intel's Gaudi2 96 GB accelerator. Unlike the H100, which is a super-scalar CUDA+Tensor core GPU, the Gaudi2 is purpose-built to accelerate generative AI and LLMs. Stability AI published its performance findings in a blog post, which reveals that the Intel Gaudi2 96 GB posts roughly 56% higher performance than the H100 80 GB.

With 2 nodes, 16 accelerators, and a constant batch size of 16 per accelerator (256 in all), the Intel Gaudi2 array is able to generate 927 images per second, compared to 595 images per second for the H100 array and 381 images per second for the A100 array, keeping accelerator and node counts constant. Scaling things up a notch to 32 nodes and 256 accelerators, again with a batch size of 16 per accelerator (total batch size of 4,096), the Gaudi2 array posts 12,654 images per second, or 49.4 images per second per device, compared to 3,992 images per second, or 15.6 images per second per device, for the older-gen A100 "Ampere" array.
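The percentage and per-device figures can be re-derived from the raw throughput numbers quoted above. The following is a minimal sketch that only restates the article's arithmetic; the helper names `per_device` and `speedup_percent` are illustrative and do not come from Stability AI's benchmark code.

```python
# Throughput figures (images per second) as quoted in the article.
TWO_NODE = {"Gaudi2": 927, "H100": 595, "A100": 381}   # 16 accelerators each
THIRTY_TWO_NODE = {"Gaudi2": 12_654, "A100": 3_992}    # 256 accelerators each


def per_device(images_per_second: float, accelerators: int) -> float:
    """Average throughput contributed by a single accelerator."""
    return images_per_second / accelerators


def speedup_percent(candidate: float, baseline: float) -> float:
    """How much faster the candidate is than the baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


gaudi2_vs_h100 = speedup_percent(TWO_NODE["Gaudi2"], TWO_NODE["H100"])
gaudi2_per_device = per_device(THIRTY_TWO_NODE["Gaudi2"], 256)
a100_per_device = per_device(THIRTY_TWO_NODE["A100"], 256)

print(f"Gaudi2 vs H100 (2 nodes): {gaudi2_vs_h100:.1f}% faster")
print(f"Gaudi2 per device (32 nodes): {gaudi2_per_device:.1f} images/s")
print(f"A100 per device (32 nodes): {a100_per_device:.1f} images/s")
```

The 2-node ratio works out to just under 56%, which is why the headline and the body round it slightly differently.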

AMD CTO Teases Memory Upgrades for Revised Instinct MI300-series Accelerators

Brett Simpson, Partner and Co-Founder of Arete Research, sat down with AMD CTO Mark Papermaster during the former's "Investor Webinar Conference." A transcript of the Arete + AMD question and answer session appeared online last week—the documented fireside chat concentrated mostly on "AI compute market" topics. Papermaster was asked about his company's competitive approach when taking on NVIDIA's very popular range of A100 and H100 AI GPUs, as well as the recently launched GH200 chip. The CTO did not reveal any specific pricing strategies—a "big picture" was painted instead: "I think what's important when you just step back is to look at total cost of ownership, not just one GPU, one accelerator, but total cost of ownership. But now when you also look at the macro, if there's not competition in the market, you're going to see not only a growth of the price of these devices due to the added content that they have, but you're -- without a check and balance, you're going to see very, very high margins, more than that could be sustained without a competitive environment."

Papermaster continued: "And what I think is very key with -- as AMD has brought competition market for these most powerful AI training and inference devices is you will see that check and balance. And we have a very innovative approach. We've been a leader in chiplet design. And so we have the right technology for the right purpose of the AI build-out that we do. We have, of course, a GPU accelerator. But there's many other circuitry associated with being able to scale and build out these large clusters, and we're very, very efficient in our design." Team Red started to ship its flagship accelerator, Instinct MI300X, to important customers at the start of 2024—Arete Research's Simpson asked about the possibility of follow-up models. In response, AMD's CTO referenced some recent history: "Well, I think the first thing that I'll highlight is what we did to arrive at this point, where we are a competitive force. We've been investing for years in building up our GPU road map to compete in both HPC and AI. We had a very, very strong harbor train that we've been on, but we had to build our muscle in the software enablement."

Supermicro Accelerates Performance of 5G and Telco Cloud Workloads with New and Expanded Portfolio of Infrastructure Solutions

Supermicro, Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, delivers an expanded portfolio of purpose-built infrastructure solutions to accelerate performance and increase efficiency in 5G and telecom workloads. With one of the industry's most diverse offerings, Supermicro enables customers to expand public and private 5G infrastructures with improved performance per watt and support for new and innovative AI applications. As a long-term advocate of open networking platforms and a member of the O-RAN Alliance, Supermicro's portfolio incorporates systems featuring 5th Gen Intel Xeon processors, AMD EPYC 8004 Series processors, and the NVIDIA Grace Hopper Superchip.

"Supermicro is expanding our broad portfolio of sustainable and state-of-the-art servers to address the demanding requirements of 5G and telco markets and Edge AI," said Charles Liang, president and CEO of Supermicro. "Our products are not just about technology, they are about delivering tangible customer benefits. We quickly bring data center AI capabilities to the network's edge using our Building Block architecture. Our products enable operators to offer new capabilities to their customers with improved performance and lower energy consumption. Our edge servers contain up to 2 TB of high-speed DDR5 memory, 6 PCIe slots, and a range of networking options. These systems are designed for increased power efficiency and performance-per-watt, enabling operators to create high-performance, customized solutions for their unique requirements. This reassures our customers that they are investing in reliable and efficient solutions."

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for significant rivals like AMD and Intel. If both companies can offer a product that beats NVIDIA's current H100 and provide a suitable software stack, customers would be willing to jump to their offerings rather than wait out the anticipated lead times of many months. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen if AI companies will begin adopting non-NVIDIA hardware or if they will remain loyal customers and accept the longer lead times of the new Blackwell generation. However, capacity constraints should only be a problem at launch, with availability improving from quarter to quarter. As TSMC improves CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of the 3 nm wafers will likely improve over time as the company moves its priority from H100 to B100.

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools and capabilities to program hybrid CPU, GPU and QPU systems - as well as the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.

GIGABYTE Elevates Computing Horizons at SupercomputingAsia 2024

GIGABYTE, a global leader in high-performance computing solutions, collaborates with industry partner Xenon at SupercomputingAsia 2024, held at the Sydney International Convention and Exhibition Centre from February 19 to 22. This collaboration showcases cutting-edge technologies, offering diverse solutions that redefine the high-performance computing landscape.

GIGABYTE's Highlights at SCA 2024
At booth 19, GIGABYTE presents the G593-SD0, our flagship AI server, and the industry's first NVIDIA-certified HGX H100 8-GPU Server. Equipped with 4th/5th Gen Intel Xeon Scalable Processors, it incorporates GIGABYTE's thermal design, ensuring optimal performance within its density-optimized 5U server chassis, pushing the boundaries of AI computing. Additionally, GIGABYTE introduces the 2U 4-node H263-S62 server, designed for 4th Gen Intel Xeon Scalable Processors and now upgraded to the latest 5th Gen, tailored for hybrid and private cloud applications. It features a DLC (Direct Liquid Cooling) solution to efficiently manage heat generated by high-performance computing. Also on display is the newly released W773-W80 workstation, supporting the latest NVIDIA RTX 6000 Ada and catering to CAD, DME, research, data and image analysis, and SMB private cloud applications. At SCA 2024, explore our offerings, including rackmount servers and motherboards, reflecting GIGABYTE's commitment to innovative and reliable solutions. This offers a valuable opportunity to discuss your IT infrastructure requirements with our sales and consulting teams, supported by GIGABYTE and Xenon in Australia.

NVIDIA Unveils "Eos" to Public - a Top Ten Supercomputer

Providing a peek at the architecture powering advanced AI factories, NVIDIA released a video that offers the first public look at Eos, its latest data-center-scale supercomputer. An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software. Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. Revealed in November at the Supercomputing 2023 trade show, Eos—named for the Greek goddess said to open the gates of dawn each day—reflects NVIDIA's commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation
Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs. As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more. It's a showcase of what NVIDIA's technologies can do when working at scale. Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond. To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory: a purpose-built AI engine that's always available and can help ramp their capacity to build AI models at scale. Eos delivers. Ranked No. 9 in the TOP 500 list of the world's fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.
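The GPU total follows directly from the system count quoted above. As a quick sanity check, this sketch multiplies it out and also divides the quoted 18.4 exaflops of FP8 performance across the GPUs. The per-GPU value is a naive average derived here for illustration, not an official NVIDIA specification.

```python
# Figures quoted in the article: 576 DGX H100 systems, 8 GPUs per system,
# and 18.4 exaflops of aggregate FP8 AI performance.
DGX_SYSTEMS = 576
GPUS_PER_SYSTEM = 8
TOTAL_FP8_EXAFLOPS = 18.4

total_gpus = DGX_SYSTEMS * GPUS_PER_SYSTEM

# 1 exaflop = 1,000 petaflops. This is a simple average across all GPUs,
# an assumption made for illustration rather than a published per-GPU figure.
fp8_petaflops_per_gpu = TOTAL_FP8_EXAFLOPS * 1000 / total_gpus

print(f"Total H100 GPUs: {total_gpus:,}")
print(f"Implied FP8 per GPU: {fp8_petaflops_per_gpu:.2f} petaflops")
```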

NVIDIA to Create AI Semi-custom Chip Business Unit

NVIDIA is reportedly working to set up a new business unit focused on designing semi-custom chips for some of its largest data-center customers, Reuters reports. NVIDIA dominates the AI HPC processor market, although even its biggest customers are having to shop from its general lineup of A100 series and H100 series HPC processors. There are reports of some of these customers venturing out of the NVIDIA fold, wanting to develop their own AI processor designs. It is to cater to exactly this segment that NVIDIA is setting up the new unit.

A semi-custom chip isn't just a bespoke chip designed to a customer's specifications. It is co-developed by NVIDIA and its customer, using mainly NVIDIA IP blocks but also integrating some third-party IP blocks the customer may want. More importantly, the customer can approach semiconductor fabrication companies such as TSMC, Samsung, or Intel Foundry Services as an entity separate from NVIDIA for its wafer allocation. For example, a company like Google may have a certain amount of wafer pre-allocation with TSMC (e.g., for its Tensor SoCs powering the Pixel smartphones), which it may want to tap into for a semi-custom AI HPC processor for its cloud business. NVIDIA assesses a $30 billion TAM for this specific business unit; that covers all of its current customers wanting to pursue their own AI processor projects, who will now be motivated to stick with NVIDIA.

Lenovo HPC Infrastructure Powers Pre-Exascale Supercomputer Marenostrum 5 to Enable New Scientific Advances and Solve Global Challenges

Lenovo (HKSE: 992) (ADR: LNVGY) has today announced that the General Purpose Partition of the MareNostrum 5, a new pre-exascale supercomputer running on Lenovo's HPC infrastructure, has been classified as the top x86 general-purpose cluster on the recently published TOP500 list of the most powerful supercomputers globally.

Officially inaugurated at Barcelona Supercomputing Center on December 21st, MareNostrum 5 has been built for the European High Performance Computing Joint Undertaking (EuroHPC JU). The pre-exascale supercomputer will bolster the EU's mission to provide Europe with the most advanced supercomputing technology and accelerate the capacity for artificial intelligence (AI) research, enabling new scientific advances that will help solve global challenges. It aims to empower a wide range of complex HPC-specific applications, from climate research and engineering to material science and earth sciences, adeptly handling tasks that extend beyond the capabilities of cloud computing.

OpenAI Reportedly Talking to TSMC About Custom Chip Venture

OpenAI is reported to be initiating R&D on a proprietary AI processing solution; the research organization's CEO, Sam Altman, has commented on the inefficient operation of datacenters running NVIDIA H100 and A100 GPUs. He foresees a future scenario where his company becomes less reliant on Team Green's off-the-shelf AI-crunchers, with a deployment of bespoke AI processors. A short Reuters interview also underlined Altman's desire to find alternative sources of power: "It motivates us to go invest more in (nuclear) fusion." The growth of artificial intelligence industries has put an unprecedented strain on energy providers, so tech firms could be semi-forced into seeking out frugal enterprise hardware.

The Financial Times has followed up on last week's Bloomberg report of OpenAI courting investment partners in the Middle East. FT's news piece alleges that Altman is in talks with billionaire businessman Sheikh Tahnoon bin Zayed al-Nahyan, a very well connected member of the United Arab Emirates Royal Family. OpenAI's leadership is reportedly negotiating with TSMC—The Financial Times alleges that Taiwan's top chip foundry is an ideal manufacturing partner. This revelation contradicts Bloomberg's recent reports of a potential custom OpenAI AI chip venture involving purpose-built manufacturing facilities. The whole project is said to be at an early stage of development, so Altman and his colleagues are most likely exploring a variety of options.

AMD Instinct MI300X GPUs Featured in LaminiAI LLM Pods

LaminiAI appears to be one of AMD's first customers to receive a bulk order of Instinct MI300X GPUs—late last week, Sharon Zhou (CEO and co-founder) posted about the "next batch of LaminiAI LLM Pods" up and running with Team Red's cutting-edge CDNA 3 series accelerators inside. Her short post on social media stated: "rocm-smi...like freshly baked bread, 8x MI300X is online—if you're building on open LLMs and you're blocked on compute, lmk. Everyone should have access to this wizard technology called LLMs."

An attached screenshot of a ROCm System Management Interface (ROCm SMI) session showcases an individual Pod configuration sporting eight Instinct MI300X GPUs. According to official blog entries, LaminiAI has utilized bog-standard MI300 accelerators since 2023, so it is not surprising to see their partnership continue to grow with AMD. Industry predictions have the Instinct MI300X and MI300A models placed as great alternatives to NVIDIA's dominant H100 "Hopper" series—AMD stock is climbing due to encouraging financial analyst estimations.

Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

Mark Zuckerberg has shared some interesting insights about Meta's AI infrastructure buildout, which is on track to include an astonishing number of NVIDIA H100 Tensor GPUs. In a post on Instagram, Meta's CEO noted the following: "We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year -- and overall almost 600k H100s equivalents of compute if you include other GPUs." That means that the company will enhance its AI infrastructure with 350,000 H100 GPUs on top of its other GPUs, which are equivalent to roughly 250,000 H100s in terms of computing power, for a total of almost 600,000 H100-equivalent GPUs.

The raw number of GPUs installed comes at a steep price. With the average selling price of an H100 GPU nearing 30,000 US dollars, Meta's investment will set the company back around $10.5 billion. Other GPUs should be in the infrastructure, but most will come from the NVIDIA Hopper family. Additionally, Meta is currently training the Llama 3 AI model, which will be much more capable than the existing Llama 2 family and will include better reasoning, coding, and math-solving capabilities. These models will be open-source. Later down the pipeline, as artificial general intelligence (AGI) comes into play, Zuckerberg has noted that "Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit." So, expect to see these models in GitHub repositories in the future.
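The spending estimate above is a straightforward multiplication, sketched below with the article's round numbers. The $30,000 unit price is the approximate average selling price cited in the text, not a confirmed contract price, and the 600,000 total is the "almost 600k" figure from Zuckerberg's post treated as a round number.

```python
# Round figures taken from the article.
H100_COUNT = 350_000
H100_PRICE_USD = 30_000            # approximate average selling price
TOTAL_H100_EQUIVALENTS = 600_000   # "almost 600k", rounded for illustration

# Estimated spend on the H100 purchase alone.
h100_spend_usd = H100_COUNT * H100_PRICE_USD

# Compute contributed by Meta's other GPUs, in H100 equivalents.
other_gpu_equivalents = TOTAL_H100_EQUIVALENTS - H100_COUNT

print(f"Estimated H100 spend: ${h100_spend_usd / 1e9:.1f} billion")
print(f"Other GPUs (H100 equivalents): {other_gpu_equivalents:,}")
```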

Indian Client Purchases Additional $500 Million Batch of NVIDIA AI GPUs

Indian data center operator Yotta is reportedly set to spend big with another order placed with NVIDIA; a recent Reuters article outlines a $500 million purchase of Team Green AI GPUs. Yotta is in the process of upgrading its AI Cloud infrastructure, and its total tally for this endeavor (involving Hopper and newer Grace Hopper models) is likely to hit $1 billion. An official company statement from December confirmed the existence of an extra procurement of GPUs, but it did not provide any details regarding budget or hardware choices at that point in time. Reuters contacted Sunil Gupta, Yotta's CEO, last week for a comment on the situation. The co-founder elaborated that the order "would comprise nearly 16,000 of NVIDIA's artificial intelligence chips H100 and GH200 and will be placed by March 2025."

Team Green is ramping up its embrace of the Indian data center market, as US sanctions have made it difficult to conduct business with enterprise customers in nearby Chinese territories. Reuters states that Gupta's firm (Yotta) is: "part of Indian billionaire Niranjan Hiranandani's real estate group, (in turn) a partner firm for NVIDIA in India and runs three data centre campuses, in Mumbai, Gujarat and near New Delhi." Microsoft, Google and Amazon are investing heavily in cloud and data centers situated in India. Shankar Trivedi, an NVIDIA executive, recently attended the Vibrant Gujarat Global Summit, where the article's reporter conducted a brief interview with him. Trivedi stated that Yotta is targeting a March 2024 start for a new NVIDIA-powered AI data center located in the region's tech hub: Gujarat International Finance Tec-City.

TSMC Plans to Put a Trillion Transistors on a Single Package by 2030

During the recent IEDM conference, TSMC previewed its process roadmap for delivering next-generation chip packages packing over one trillion transistors by 2030. This aligns with similar long-term visions from Intel. Such enormous transistor counts will come through advanced 3D packaging of multiple chiplets. But TSMC also aims to push monolithic chip complexity higher, ultimately enabling 200 billion transistor designs on a single die. This requires steady enhancement of TSMC's planned N2, N2P, N1.4, and N1 nodes, which are slated to arrive between now and the end of the decade. While multi-chiplet architectures are currently gaining favor, TSMC asserts both packaging density and raw transistor density must scale up in tandem. For perspective on the magnitude of TSMC's goals, consider NVIDIA's 80 billion transistor GH100 GPU, which is among today's largest chips, excluding wafer-scale designs from Cerebras.

Yet TSMC's roadmap calls for more than doubling that, first with over 100 billion transistor monolithic designs, then eventually 200 billion. Of course, yields become more challenging as die sizes grow, which is where advanced packaging of smaller chiplets becomes crucial. Multi-chip module offerings like AMD's MI300X and Intel's Ponte Vecchio already integrate dozens of tiles, with PVC having 47 tiles. TSMC envisions this expansion to chip packages housing more than a trillion transistors via its CoWoS, InFO, 3D stacking, and many other technologies. While the scaling cadence has recently slowed, TSMC remains confident in achieving both packaging and process breakthroughs to meet future density demands. The foundry's continuous investment ensures progress in unlocking next-generation semiconductor capabilities. But physics ultimately dictates timelines, no matter how aggressive the roadmap.

China Continues to Enhance AI Chip Self-Sufficiency, but High-End AI Chip Development Remains Constrained

Huawei's subsidiary HiSilicon has made significant strides in the independent R&D of AI chips, launching the next-gen Ascend 910B. These chips are utilized not only in Huawei's public cloud infrastructure but also sold to other Chinese companies. This year, Baidu ordered over a thousand Ascend 910B chips from Huawei to build approximately 200 AI servers. Additionally, in August, Chinese company iFlytek, in partnership with Huawei, released the "Gemini Star Program," a hardware and software integrated device for exclusive enterprise LLMs, equipped with the Ascend 910B AI acceleration chip, according to TrendForce's research.

TrendForce conjectures that the next-generation Ascend 910B chip is likely manufactured using SMIC's N+2 process. However, the production faces two potential risks. Firstly, as Huawei recently focused on expanding its smartphone business, the N+2 process capacity at SMIC is almost entirely allocated to Huawei's smartphone products, potentially limiting future capacity for AI chips. Secondly, SMIC remains on the Entity List, possibly restricting access to advanced process equipment.

Dell Partners with Imbue on New AI Compute Cluster Using Nearly 10,000 NVIDIA H100 GPUs

Dell Technologies and Imbue, an independent AI research company, have entered into a $150 million agreement to build a new high-performance computing cluster for training foundation models optimized for reasoning. Imbue is one of the few independent AI labs that develops its own foundation models, and trains them to have more advanced reasoning capabilities—like knowing when to ask for more information, analyzing and critiquing their own outputs, or breaking down a difficult goal into a plan and then executing on it. Imbue trains AI agents on top of those models that can do work for people across diverse fields in ways that are robust, safe, and useful. Imbue's goal is to create practical tools for building agents that could enable workers across a broad set of domains, including helping engineers write new code, analysts understand and draft complex policy proposals, and much more.

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

Manufacturers Anticipate Completion of NVIDIA's HBM3e Verification by 1Q24; HBM4 Expected to Launch in 2026

TrendForce's latest research into the HBM market indicates that NVIDIA plans to diversify its HBM suppliers for more robust and efficient supply chain management. Samsung's HBM3 (24 GB) is anticipated to complete verification with NVIDIA by December this year. The progress of HBM3e, as outlined in the timeline below, shows that Micron provided its 8hi (24 GB) samples to NVIDIA by the end of July, SK hynix in mid-August, and Samsung in early October.

Given the intricacy of the HBM verification process—estimated to take two quarters—TrendForce expects that some manufacturers might learn preliminary HBM3e results by the end of 2023. However, it's generally anticipated that major manufacturers will have definite results by 1Q24. Notably, the outcomes will influence NVIDIA's procurement decisions for 2024, as final evaluations are still underway.

SK hynix Showcases Next-Gen AI and HPC Solutions at SC23

SK hynix presented its leading AI and high-performance computing (HPC) solutions at Supercomputing 2023 (SC23), held in Denver, Colorado from November 12 to 17. Organized by the Association for Computing Machinery and IEEE Computer Society since 1988, the annual SC conference showcases the latest advancements in HPC, networking, storage, and data analysis. SK hynix marked its first appearance at the conference by introducing its groundbreaking memory solutions to the HPC community. During the six-day event, several SK hynix employees also made presentations revealing the impact of the company's memory solutions on AI and HPC.

Displaying Advanced HPC & AI Products
At SC23, SK hynix showcased its products tailored for AI and HPC to underline its leadership in the AI memory field. Among these next-generation products, HBM3E attracted attention as the HBM solution meets the industry's highest standards of speed, capacity, heat dissipation, and power efficiency. These capabilities make it particularly suitable for data-intensive AI server systems. HBM3E was presented alongside NVIDIA's H100, a high-performance GPU for AI that uses HBM3 for its memory.

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a duo of custom-designed silicon made to accelerate AI and excel in cloud workloads. First of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that features a 64-bit Armv9 instruction set, implemented in a cloud-native design that is set to become a part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims that the performance target is up to 40% higher than the current generation of Arm servers running on Azure cloud. The SoC uses Arm's Neoverse CSS platform customized for Microsoft, presumably with Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's project Athena now has the name of Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8-bit, for maximum performance. The chip is currently being tested on GPT 3.5 Turbo, and we have yet to see performance figures and comparisons with competing hardware such as NVIDIA's H100/H200 and AMD's MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator, which uses a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.