News Posts matching #Hopper

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.

Chinese Research Institute Utilizing "Banned" NVIDIA H100 AI GPUs

NVIDIA's freshly unveiled "Blackwell" B200 and GB200 AI GPUs will be getting plenty of coverage this year, but many organizations will be sticking with current or prior generation hardware. Team Green is in the process of shipping out compromised "Hopper" designs to customers in China, but the region's appetite for powerful AI-crunching hardware is growing. Last year's China-specific H800 design, and the older "Ampere" A800 chip were deemed too potent—new regulations prevented further sales. Recently, AMD's Instinct MI309 AI accelerator was considered "too powerful to gain unconditional approval from the US Department of Commerce." Natively-developed solutions are catching up with Western designs, but some institutions are not prepared to queue up for emerging technologies.

NVIDIA's new H20 AI GPU as well as Ada Lovelace-based L20 PCIe and L2 PCIe models are weakened enough to get a thumbs up from trade regulators, but likely not compelling enough for discerning clients. The Telegraph believes that NVIDIA's uncompromised H100 AI GPU is currently in use at several Chinese establishments—the report cites information presented within four academic papers published on ArXiv, an open access science website. The Telegraph's news piece highlights one of the studies—it was: "co-authored by a researcher at 4paradigm, an AI company that was last year placed on an export control list by the US Commerce Department for attempting to acquire US technology to support China's military." Additionally, the Chinese Academy of Sciences appears to have conducted several AI-accelerated experiments, involving the solving of complex mathematical and logical problems. The article suggests that this research organization has acquired a very small batch of NVIDIA H100 GPUs (up to eight units). A "thriving black market" for high-end NVIDIA processors has emerged in the region—last Autumn, the Center for a New American Security (CNAS) published an in-depth article about ongoing smuggling activities.

Unwrapping the NVIDIA B200 and GB200 AI GPU Announcements

NVIDIA on Monday, at the 2024 GTC conference, unveiled the "Blackwell" B200 and GB200 AI GPUs. These are designed to offer an incredible 5x the AI inferencing performance of the current-gen "Hopper" H100, and come with four times the on-package memory. The B200 "Blackwell" is the largest chip physically possible using existing foundry tech, according to its makers. The chip packs an astonishing 208 billion transistors, and is made up of two chiplets, which by themselves are the largest possible chips.

Each chiplet is built on the TSMC N4P foundry node, which is the most advanced 4 nm-class node by the Taiwanese foundry. Each chiplet has 104 billion transistors. The two chiplets have a high degree of connectivity with each other, thanks to a 10 TB/s custom interconnect. This provides enough bandwidth, at low enough latency, for the two to maintain cache coherency (i.e. address each other's memory as if it were their own). Each of the two "Blackwell" chiplets has a 4096-bit memory bus, and is wired to 96 GB of HBM3E spread across four 24 GB stacks, for a total of 192 GB on the B200 package. The GPU has a staggering 8 TB/s of memory bandwidth on tap. The B200 package features a 1.8 TB/s NVLink interface for host connectivity, and connectivity to another B200 chip.
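The package-level totals follow directly from the per-chiplet figures quoted above. The short Python sketch below simply redoes that arithmetic; the constant and function names are illustrative, and the implied per-pin data rate is a derived estimate that assumes the 8 TB/s figure uses decimal terabytes and covers both chiplets.

```python
# Figures as quoted in the article; names here are illustrative, not NVIDIA's.
CHIPLETS = 2
TRANSISTORS_PER_CHIPLET_BN = 104      # billions of transistors per chiplet
HBM_STACKS_PER_CHIPLET = 4
GB_PER_STACK = 24
BUS_BITS_PER_CHIPLET = 4096
PACKAGE_BANDWIDTH_TBPS = 8            # TB/s, assumed decimal (1 TB = 1e12 bytes)


def b200_totals():
    """Derive package-level totals from the per-chiplet figures."""
    bus_bits = CHIPLETS * BUS_BITS_PER_CHIPLET
    # Implied data rate per bus pin in Gbit/s: bytes/s -> bits/s, divided by pins.
    gbit_per_pin = PACKAGE_BANDWIDTH_TBPS * 1e12 * 8 / bus_bits / 1e9
    return {
        "transistors_bn": CHIPLETS * TRANSISTORS_PER_CHIPLET_BN,
        "hbm3e_gb": CHIPLETS * HBM_STACKS_PER_CHIPLET * GB_PER_STACK,
        "bus_bits": bus_bits,
        "gbit_per_pin": gbit_per_pin,
    }


if __name__ == "__main__":
    for name, value in b200_totals().items():
        print(f"{name}: {value}")
```

The transistor and memory totals match the 208 billion and 192 GB quoted in the article; the combined bus width and per-pin rate are not stated in the source and should be read as back-of-the-envelope values.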

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

NVIDIA's Selection of Micron HBM3E Supposedly Surprises Competing Memory Makers

SK Hynix believes that it leads the industry with the development and production of High Bandwidth Memory (HBM) solutions, but rival memory manufacturers are working hard on equivalent fifth generation packages. NVIDIA was expected to select SK Hynix as the main supplier of HBM3E parts for utilization on H200 "Hopper" AI GPUs, but a surprise announcement was issued by Micron's press team last month. The American firm revealed that HBM3E volume production had commenced: "(our) 24 GB 8H HBM3E will be part of NVIDIA H200 Tensor Core GPUs, which will begin shipping in the second calendar quarter of 2024. This milestone positions Micron at the forefront of the industry, empowering artificial intelligence (AI) solutions with HBM3E's industry-leading performance and energy efficiency."

According to a Korea JoongAng Daily report, this boast has "shocked" the likes of SK Hynix and Samsung Electronics. They believe that Micron's: "announcement was a revolt from an underdog, as the US company barely held 10 percent of the global market last year." The article also points out some behind-the-scenes legal wrangling: "the cutthroat competition became more evident when the Seoul court sided with SK Hynix on Thursday (March 7) by granting a non-compete injunction to prevent its former researcher, who specialized in HBM, from working at Micron. He would be fined 10 million won for each day in violation." SK Hynix is likely pinning its next-gen AI GPU hopes on a 12-layer DRAM stacked HBM3E product; industry insiders posit that evaluation samples were submitted to NVIDIA last month. The outlook for these units is said to be very positive, and mass production could start as early as this month.

Intel Gaudi2 Accelerator Beats NVIDIA H100 at Stable Diffusion 3 by 55%

Stability AI, the developers behind the popular Stable Diffusion generative AI model, have run some first-party performance benchmarks for Stable Diffusion 3 using popular data-center AI GPUs, including the NVIDIA H100 "Hopper" 80 GB, A100 "Ampere" 80 GB, and Intel's Gaudi2 96 GB accelerator. Unlike the H100, which is a super-scalar CUDA+Tensor core GPU, the Gaudi2 is purpose-built to accelerate generative AI and LLMs. Stability AI published its performance findings in a blog post, which reveals that the Intel Gaudi2 96 GB posts roughly 56% higher performance than the H100 80 GB.

With 2 nodes, 16 accelerators, and a constant batch size of 16 per accelerator (256 in all), the Intel Gaudi2 array is able to generate 927 images per second, compared to 595 images per second for the H100 array, and 381 images per second for the A100 array, keeping accelerator and node counts constant. Scaling things up a notch to 32 nodes and 256 accelerators, with a batch size of 16 per accelerator (total batch size of 4,096), the Gaudi2 array is posting 12,654 images per second, or 49.4 images per second per device, compared to 3,992 images per second, or 15.6 images per second per device, for the older-gen A100 "Ampere" array.
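The percentage lead and the per-device rates quoted above can be reproduced from the raw images-per-second figures. The following Python sketch does so; the function names and the "scaling efficiency" ratio are illustrative additions, not part of Stability AI's published methodology.

```python
def percent_faster(candidate, baseline):
    """Percentage by which `candidate` throughput exceeds `baseline`."""
    return (candidate / baseline - 1.0) * 100.0


def per_device(images_per_second, accelerators):
    """Average throughput of a single accelerator in the array."""
    return images_per_second / accelerators


# Images per second, as quoted in the article.
SMALL_RUN = {"Gaudi2": 927, "H100": 595, "A100": 381}   # 16 accelerators
LARGE_RUN = {"Gaudi2": 12654, "A100": 3992}             # 256 accelerators

if __name__ == "__main__":
    lead = percent_faster(SMALL_RUN["Gaudi2"], SMALL_RUN["H100"])
    print(f"Gaudi2 vs H100 (16 accelerators): {lead:.1f}% faster")
    for name, total in LARGE_RUN.items():
        rate = per_device(total, 256)
        print(f"{name} (256 accelerators): {rate:.1f} images/s per device")
    # Per-device throughput at 256 accelerators relative to 16 accelerators.
    eff = per_device(LARGE_RUN["Gaudi2"], 256) / per_device(SMALL_RUN["Gaudi2"], 16)
    print(f"Gaudi2 scaling efficiency, 16 -> 256: {eff:.1%}")
```

With the quoted numbers, the Gaudi2 lead over the H100 works out to about 55.8%, which is where the "roughly 56%" figure comes from.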

NVIDIA Grace Hopper Systems Gather at GTC

The spirit of software pioneer Grace Hopper will live on at NVIDIA GTC. Accelerated systems using powerful processors - named in honor of the pioneer of software programming - will be on display at the global AI conference running March 18-21, ready to take computing to the next level. System makers will show more than 500 servers in multiple configurations across 18 racks, all packing NVIDIA GH200 Grace Hopper Superchips. They'll form the largest display at NVIDIA's booth in the San Jose Convention Center, filling the MGX Pavilion.

MGX Speeds Time to Market
NVIDIA MGX is a blueprint for building accelerated servers with any combination of GPUs, CPUs and data processing units (DPUs) for a wide range of AI, high performance computing and NVIDIA Omniverse applications. It's a modular reference architecture for use across multiple product generations and workloads. GTC attendees can get an up-close look at MGX models tailored for enterprise, cloud and telco-edge uses, such as generative AI inference, recommenders and data analytics. The pavilion will showcase accelerated systems packing single and dual GH200 Superchips in 1U and 2U chassis, linked via NVIDIA BlueField-3 DPUs and NVIDIA Quantum-2 400 Gb/s InfiniBand networks over LinkX cables and transceivers. The systems support industry standards for 19- and 21-inch rack enclosures, and many provide E1.S bays for nonvolatile storage.

Quantum Machines Launches OPX1000, a High-density Processor-based Control Platform

In Sept. 2023, Quantum Machines (QM) unveiled OPX1000, our most advanced quantum control system to date - and the industry's leading controller in terms of performance and channel density. OPX1000 is the third generation of QM's processor-based quantum controllers. It enhances its predecessor, OPX+, by expanding analog performance and multiplying channel density to support the control of over 1,000 qubits. However, QM's vision for quantum controllers extends far beyond.

OPX1000 is designed as a platform for orchestrating the control of large-scale QPUs (quantum processing units). It's equipped with 8 front-end module (FEM) slots, representing the cutting-edge modular architecture for quantum control. The first low-frequency (LF) module was introduced in September 2023, and today, we're happy to introduce the Microwave (MW) FEM, which delivers additional value to our rapidly expanding customer base.

Supermicro Accelerates Performance of 5G and Telco Cloud Workloads with New and Expanded Portfolio of Infrastructure Solutions

Supermicro, Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, delivers an expanded portfolio of purpose-built infrastructure solutions to accelerate performance and increase efficiency in 5G and telecom workloads. With one of the industry's most diverse offerings, Supermicro enables customers to expand public and private 5G infrastructures with improved performance per watt and support for new and innovative AI applications. As a long-term advocate of open networking platforms and a member of the O-RAN Alliance, Supermicro's portfolio incorporates systems featuring 5th Gen Intel Xeon processors, AMD EPYC 8004 Series processors, and the NVIDIA Grace Hopper Superchip.

"Supermicro is expanding our broad portfolio of sustainable and state-of-the-art servers to address the demanding requirements of 5G and telco markets and Edge AI," said Charles Liang, president and CEO of Supermicro. "Our products are not just about technology, they are about delivering tangible customer benefits. We quickly bring data center AI capabilities to the network's edge using our Building Block architecture. Our products enable operators to offer new capabilities to their customers with improved performance and lower energy consumption. Our edge servers contain up to 2 TB of high-speed DDR5 memory, 6 PCIe slots, and a range of networking options. These systems are designed for increased power efficiency and performance-per-watt, enabling operators to create high-performance, customized solutions for their unique requirements. This reassures our customers that they are investing in reliable and efficient solutions."

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for significant rivals like AMD and Intel. If both companies can offer a product that could beat NVIDIA's current H100 and provide a suitable software stack, customers would be willing to jump to their offerings and not wait many months for the anticipated high lead times. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen if AI companies will begin the adoption of non-NVIDIA hardware or if they will remain loyal customers and agree to the higher lead times of the new Blackwell generation. However, capacity constraints should only be a problem at launch, and availability should improve from quarter to quarter. As TSMC improves CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of the 3 nm wafers will likely improve over time as the company moves its priority from H100 to B100.

GIGABYTE Advanced Data Center Solutions Unveils Telecom and AI Servers at MWC 2024

GIGABYTE Technology, an IT pioneer whose focus is to advance global industries through cloud and AI computing systems, is coming to MWC 2024 with its next-generation servers empowering telcos, cloud service providers, enterprises, and SMBs to swiftly harness the value of 5G and AI. Featured is a cutting-edge AI server boasting AMD Instinct MI300X 8-GPU, and a comprehensive AI/HPC server series supporting the latest chip technology from AMD, Intel, and NVIDIA. The showcase will also feature integrated green computing solutions excelling in heat dissipation and energy reduction.

Continuing the booth theme "Future of COMPUTING", GIGABYTE's presentation will cover servers for AI/HPC, RAN and Core networks, modular edge platforms, all-in-one green computing solutions, and AI-powered self-driving technology. The exhibits will demonstrate how industries extend AI applications from cloud to edge and terminal devices through 5G connectivity, expanding future opportunities with faster time to market and sustainable operations. The showcase spans from February 26th to 29th at Booth #5F60, Hall 5, Fira Gran Via, Barcelona.

NVIDIA CG100 "Grace" Server Processor Benchmarked by Academics

The Barcelona Supercomputing Center (BSC) and the State University of New York (Stony Brook and Buffalo campuses) have pitted NVIDIA's relatively new CG100 "Grace" Superchip against several rival products in a "wide variety of HPC and AI benchmarks." Team Green marketing material has focused mainly on the overall GH200 "Grace Hopper" package—so it is interesting to see technical institutes concentrate on the company's "first true" server processor (ARM-based), rather than the ever popular GPU aspect. The Next Platform's article summarized the chip's internal makeup: "(NVIDIA's) Grace CPU has a relatively high core count and a relatively low thermal footprint, and it has banks of low-power DDR5 (LPDDR5) memory—the kind used in laptops but gussied up with error correction to be server class—of sufficient capacity to be useful for HPC systems, which typically have 256 GB or 512 GB per node these days and sometimes less."

Benchmark results were revealed at last week's HPC Asia 2024 conference (in Nagoya, Japan)—Barcelona Supercomputing Center (BSC) and the State University of New York also uploaded their findings to the ACM Digital Library (link #1 & #2). BSC's MareNostrum 5 system contains an experimental cluster portion—consisting of NVIDIA Grace-Grace and Grace-Hopper superchips. We have heard plenty about the latter (in press releases), but the former is a novel concept—as outlined by The Next Platform: "Put two Grace CPUs together into a Grace-Grace superchip, a tightly coupled package using NVLink chip-to-chip interconnects that provide memory coherence across the LPDDR5 memory banks and that consumes only around 500 watts, and it gets plenty interesting for the HPC crowd. That yields a total of 144 Arm Neoverse "Demeter" V2 cores with the Armv9 architecture, and 1 TB of physical memory with 1.1 TB/sec of peak theoretical bandwidth. For some reason, probably relating to yield on the LPDDR5 memory, only 960 GB of that memory capacity and only 1 TB/sec of that memory bandwidth is actually available."

NVIDIA Readying H20 AI GPU for Chinese Market

NVIDIA's H800 AI GPU was rolled out last year to appease the Sanction Gods, but later on, the US Government deemed the cutdown "Hopper" part to be far too potent for Team Green's Chinese enterprise customers. Last October, newly amended export conditions banned sales of the H800, as well as the slightly older (plus similarly gimped) A800 "Ampere" GPU in the region. NVIDIA's engineering team returned to the drawing board, and developed a new range of compliantly weakened products. An exclusive Reuters report suggests that Team Green is taking pre-orders for a refreshed "Hopper" GPU; the latest China-specific flagship is called "HGX H20." NVIDIA web presences have not been updated with this new model, nor with the Ada Lovelace-based L20 PCIe and L2 PCIe GPUs. Huawei's competing Ascend 910B is said to be slightly more performant than the H20 in "some areas," according to insiders within the distribution network.

The leakers reckon that NVIDIA's mainland distributors will be selling H20 models within a price range of $12,000 - $15,000; Huawei's locally developed Ascend 910B is priced at 120,000 RMB (~$16,900). One Reuters source stated that: "some distributors have started advertising the (NVIDIA H20) chips with a significant markup to the lower end of that range at about 110,000 yuan ($15,320)." The report suggests that NVIDIA refused to comment on this situation. Another insider claimed that: "distributors are offering H20 servers, which are pre-configured with eight of the AI chips, for 1.4 million yuan. By comparison, servers that used eight of the H800 chips were sold at around 2 million yuan when they were launched a year ago." Small batches of H20 products are expected to reach important clients within the first quarter of 2024, followed by a wider release in Q2. It is believed that mass production will begin around springtime.

Jensen Huang Heads to Taiwan, B100 "Blackwell" GPUs Reportedly in Focus

NVIDIA's intrepid CEO, Jensen Huang, has spent a fair chunk of January travelling around China; news outlets believe that Team Green's leader has conducted business meetings with very important clients in the region. Insiders proposed that his low-profile business trip included visits to NVIDIA operations in Shenzhen, Shanghai and Beijing. The latest updates allege that a stopover in Taiwan was also planned, following the conclusion of Mainland activities. Photos from an NVIDIA Chinese New Year celebratory event have been spreading across the internet lately; many were surprised to see Huang appear on-stage in Shanghai and quickly dispense with his trademark black leather jacket. He swapped into a colorful "Year of the Wood Dragon" sleeveless shirt for a traditional dance routine.

It was not all fun and games during Huang's first trip to China in four years. Inside sources have informed the Wall Street Journal about growing unrest within the nation's top ranked Cloud AI tech firms. Anonymous informants allege that leaders at Alibaba Group and Tencent are not happy with NVIDIA's selection of compromised enterprise GPUs; it is posited that NVIDIA's President has spent time convincing key clients not to adopt natively-developed solutions (unaffected by US Sanctions). The short hop over to Taiwan is reported not to be for R&R purposes; insiders have Huang visiting key supply partners TSMC and Wistron. Industry experts think that these meetings are linked to NVIDIA's upcoming "Blackwell" B100 AI GPU, and "supercharged" H200 "Hopper" accelerator. It is too early for the rumor mill to start speculation about nerfed versions of NVIDIA's 2024 enterprise products reaching Chinese shores, but Jensen Huang is seemingly ready to hold diplomatic talks with all sides.

AMD Instinct MI300X GPUs Featured in LaminiAI LLM Pods

LaminiAI appears to be one of AMD's first customers to receive a bulk order of Instinct MI300X GPUs—late last week, Sharon Zhou (CEO and co-founder) posted about the "next batch of LaminiAI LLM Pods" up and running with Team Red's cutting-edge CDNA 3 series accelerators inside. Her short post on social media stated: "rocm-smi...like freshly baked bread, 8x MI300X is online—if you're building on open LLMs and you're blocked on compute, lmk. Everyone should have access to this wizard technology called LLMs."

An attached screenshot of a ROCm System Management Interface (ROCm SMI) session showcases an individual Pod configuration sporting eight Instinct MI300X GPUs. According to official blog entries, LaminiAI has utilized bog-standard MI300 accelerators since 2023, so it is not surprising to see their partnership continue to grow with AMD. Industry predictions have the Instinct MI300X and MI300A models placed as great alternatives to NVIDIA's dominant H100 "Hopper" series—AMD stock is climbing due to encouraging financial analyst estimations.

Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

Mark Zuckerberg has shared some interesting insights about Meta's AI infrastructure buildout, which is on track to include an astonishing number of NVIDIA H100 Tensor Core GPUs. In a post on Instagram, Meta's CEO noted the following: "We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year -- and overall almost 600k H100s equivalents of compute if you include other GPUs." That means that the company will enhance its AI infrastructure with 350,000 H100 GPUs on top of its other GPUs, which are equivalent to roughly 250,000 H100s in terms of computing power, for a total of almost 600,000 H100-equivalent GPUs.

The raw number of GPUs installed comes at a steep price. With the average selling price of an H100 GPU nearing 30,000 US dollars, Meta's investment will set the company back around $10.5 billion. Other GPUs should be in the infrastructure, but most will come from the NVIDIA Hopper family. Additionally, Meta is currently training the Llama 3 AI model, which will be much more capable than the existing Llama 2 family and will include better reasoning, coding, and math-solving capabilities. These models will be open-source. Later down the pipeline, as artificial general intelligence (AGI) comes into play, Zuckerberg has noted that "Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit." So, expect to see these models in GitHub repositories in the future.
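The $10.5 billion estimate is a straightforward multiplication of the figures above. A minimal Python sketch of that calculation follows; the unit price is the article's approximate average selling price rather than a confirmed contract price, the 600,000 total rounds up Zuckerberg's "almost 600k", and the names are illustrative.

```python
H100_COUNT = 350_000               # H100s planned by the end of the year
ASSUMED_UNIT_PRICE_USD = 30_000    # approximate average selling price per the article
TOTAL_H100_EQUIVALENTS = 600_000   # "almost 600k", rounded up here


def h100_spend_usd(count, unit_price_usd):
    """Estimated spend on H100 GPUs alone, in US dollars."""
    return count * unit_price_usd


def other_gpu_equivalents(total_equivalents, h100_count):
    """H100-equivalent compute contributed by GPUs other than the H100."""
    return total_equivalents - h100_count


if __name__ == "__main__":
    spend = h100_spend_usd(H100_COUNT, ASSUMED_UNIT_PRICE_USD)
    print(f"Estimated H100 spend: ${spend / 1e9:.1f} billion")
    others = other_gpu_equivalents(TOTAL_H100_EQUIVALENTS, H100_COUNT)
    print(f"Other GPUs: about {others:,} H100 equivalents")
```

The estimate scales linearly with the assumed price, so a volume discount or a different GPU mix would shift the total accordingly.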

Indian Client Purchases Additional $500 Million Batch of NVIDIA AI GPUs

Indian data center operator Yotta is reportedly set to spend big with another order placed with NVIDIA; a recent Reuters article outlines a $500 million purchase of Team Green AI GPUs. Yotta is in the process of upgrading its AI Cloud infrastructure, and its total tally for this endeavor (involving Hopper and newer Grace Hopper models) is likely to hit $1 billion. An official company statement from December confirmed the existence of an extra procurement of GPUs, but it did not provide any details regarding budget or hardware choices at that point in time. Reuters contacted Sunil Gupta, Yotta's CEO, last week for a comment on the situation. The co-founder elaborated: "that the order would comprise nearly 16,000 of NVIDIA's artificial intelligence chips H100 and GH200 and will be placed by March 2025."

Team Green is ramping up its embrace of the Indian data center market, as US sanctions have made it difficult to conduct business with enterprise customers in nearby Chinese territories. Reuters states that Gupta's firm (Yotta) is: "part of Indian billionaire Niranjan Hiranandani's real estate group, (in turn) a partner firm for NVIDIA in India and runs three data centre campuses, in Mumbai, Gujarat and near New Delhi." Microsoft, Google and Amazon are investing heavily in cloud and data centers situated in India. Shankar Trivedi, an NVIDIA executive, recently attended the Vibrant Gujarat Global Summit, where the article's reporter conducted a brief interview with him. Trivedi stated that Yotta is targeting a March 2024 start for a new NVIDIA-powered AI data center located in the region's tech hub: Gujarat International Finance Tec-City.

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

ASRock Rack Announces Support of NVIDIA H200 GPUs and GH200 Superchips and Highlights HPC and AI Server Platforms at SC 23

ASRock Rack Inc., the leading innovative server company, today is set to showcase a comprehensive range of servers for diverse AI workloads catering to scenarios from the edge, on-premises, and to the cloud at booth #1737 at SC 23 held at the Colorado Convention Center in Denver, USA. The event is from November 13th to 16th, and ASRock Rack will feature the following significant highlights:

At SC 23, ASRock Rack will demonstrate the NVIDIA-Qualified 2U4G-GENOA/M3 and 4U8G series GPU server solutions along with the NVIDIA H100 PCIe. The ASRock Rack 4U8G and 4U10G series GPU servers are able to accommodate eight to ten 400 W dual-slot GPU cards and 24 hot-swappable 2.5" drives, designed to deliver exceptional performance for demanding AI workloads deployed in the cloud environment. The 2U4G-GENOA/M3, tailored for lighter workloads, is powered by a single AMD EPYC 9004 series processor and is able to support four 400 W dual-slot GPUs while having additional PCIe and OCP NIC 3.0 slots for expansions.

GIGABYTE Demonstrates the Future of Computing at Supercomputing 2023 with Advanced Cooling and Scaled Data Centers

GIGABYTE Technology, Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, continues to be a leader in cooling IT hardware efficiently and in developing diverse server platforms for Arm and x86 processors, as well as AI accelerators. At SC23, GIGABYTE (booth #355) will showcase some standout platforms, including for the NVIDIA GH200 Grace Hopper Superchip and next-gen AMD Instinct APU. To better introduce its extensive lineup of servers, GIGABYTE will address the most important needs in supercomputing data centers, such as how to cool high-performance IT hardware efficiently and power AI that is capable of real-time analysis and fast time to results.

Advanced Cooling
For many data centers, it is becoming apparent that their cooling infrastructure must radically shift to keep pace with new IT hardware that continues to generate more heat and requires rapid heat transfer. Because of this, GIGABYTE has launched advanced cooling solutions that allow IT hardware to maintain ideal performance while being more energy-efficient and maintaining the same data center footprint. At SC23, its booth will have a single-phase immersion tank, the A1P0-EA0, which offers a one-stop immersion cooling solution. GIGABYTE is experienced in implementing immersion cooling with immersion-ready servers, immersion tanks, oil, tools, and services spanning the globe. Another cooling solution showcased at SC23 will be direct liquid cooling (DLC), and in particular, the new GIGABYTE cold plates and cooling modules for the NVIDIA Grace CPU Superchip, NVIDIA Grace Hopper Superchip, AMD EPYC 9004 processor, and 4th Gen Intel Xeon processor.

NVIDIA Supercharges Hopper, the World's Leading AI Computing Platform

NVIDIA today announced it has supercharged the world's leading AI computing platform with the introduction of the NVIDIA HGX H200. Based on NVIDIA Hopper architecture, the platform features the NVIDIA H200 Tensor Core GPU with advanced memory to handle massive amounts of data for generative AI and high performance computing workloads.

The NVIDIA H200 is the first GPU to offer HBM3e - faster, larger memory to fuel the acceleration of generative AI and large language models, while advancing scientific computing for HPC workloads. With HBM3e, the NVIDIA H200 delivers 141 GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x more bandwidth compared with its predecessor, the NVIDIA A100. H200-powered systems from the world's leading server manufacturers and cloud service providers are expected to begin shipping in the second quarter of 2024.

NVIDIA Grace Hopper Superchip Powers 40+ AI Supercomputers

Dozens of new supercomputers for scientific computing will soon hop online, powered by NVIDIA's breakthrough GH200 Grace Hopper Superchip for giant-scale AI and high performance computing. The NVIDIA GH200 enables scientists and researchers to tackle the world's most challenging problems by accelerating complex AI and HPC applications running terabytes of data.

At the SC23 supercomputing show, NVIDIA today announced that the superchip is coming to more systems worldwide, including from Dell Technologies, Eviden, Hewlett Packard Enterprise (HPE), Lenovo, QCT and Supermicro. Bringing together the Arm-based NVIDIA Grace CPU and Hopper GPU architectures using NVIDIA NVLink-C2C interconnect technology, GH200 also serves as the engine behind scientific supercomputing centers across the globe. Combined, these GH200-powered centers represent some 200 exaflops of AI performance to drive scientific innovation.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.
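
The "nearly 3x" claim follows directly from the two training times quoted above. A minimal sketch of the arithmetic, where the function name `speedup` is purely illustrative:

```python
# Minutes to complete the GPT-3 175B training benchmark, as quoted in the article.
previous_record_min = 10.9
eos_record_min = 3.9


def speedup(old_time: float, new_time: float) -> float:
    """Return how many times faster new_time is than old_time."""
    if old_time <= 0 or new_time <= 0:
        raise ValueError("times must be positive")
    return old_time / new_time


gain = speedup(previous_record_min, eos_record_min)
print(f"Speedup: {gain:.2f}x")  # prints "Speedup: 2.79x"
```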

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

GIGABYTE Announces New Direct Liquid Cooling (DLC) Multi-Node Servers Ahead of SC23

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced direct liquid cooling (DLC) multi-node servers for the NVIDIA Grace CPU and NVIDIA Grace Hopper Superchip. It also announced a DLC-ready Intel-based server for the NVIDIA HGX H100 8-GPU platform and a high-density server for AMD EPYC 9004 processors. For the ultimate in efficiency, there is also a new 12U single-phase immersion tank. All of these products will be at GIGABYTE booth #355 at SC23.

The newly announced high-density CPU servers include the Intel Xeon-based H263-S63-LAN1 and the AMD EPYC-based H273-Z80-LAN1. These 2U 4-node servers employ DLC for all eight CPUs, so CPU performance reaches its full potential despite the dense configuration. In August, GIGABYTE announced new servers for the NVIDIA HGX H100 GPU, and it now adds a DLC version to the G593 series, the G593-SD0-LAX1, for the NVIDIA HGX H100 8-GPU.

NVIDIA to Start Selling Arm-based CPUs to PC Clients by 2025

According to sources cited by Reuters, NVIDIA is reportedly developing custom CPUs based on the Arm instruction set architecture (ISA), specifically tailored for the client ecosystem, also known as the PC. NVIDIA has already developed an Arm-based CPU codenamed Grace, which is designed to handle server and HPC workloads in combination with the company's Hopper GPU. However, as we learn today, NVIDIA also wants to provide CPUs for PC users and to power Microsoft's Windows operating system. The push for more vendors of Arm-based CPUs is also supported by Microsoft, which is losing PC market share to Apple and its M-series of processors.

Custom PC processors built on the Arm ISA would leave decades of x86-based applications either obsolete or in need of recompilation. Apple allows users to run x86 applications through its x86-to-Arm translation layer, and Microsoft offers the same for Windows-on-Arm devices. It remains to be seen how NVIDIA's processors, which are expected to arrive in 2025, would compete in the broader PC processor market. Still, the company could make some compelling solutions given its incredible silicon engineering history and performant Arm designs like Grace. With the upcoming Arm-based processors hitting the market, we expect the Windows-on-Arm ecosystem to thrive and get massive investment from independent software vendors.