News Posts matching #NVIDIA H100

Return to Keyword Browsing

Demand for NVIDIA's Blackwell Platform Expected to Boost TSMC's CoWoS Total Capacity by Over 150% in 2024

Press Release by

Apr 16th, 2024 05:04 Discuss (6 Comments)

NVIDIA's next-gen Blackwell platform, which includes B-series GPUs and integrates NVIDIA's own Grace Arm CPU in models such as the GB200, represents a significant development. TrendForce points out that the GB200 and its predecessor, the GH200, both feature a combined CPU+GPU solution, primarily equipped with the NVIDIA Grace CPU and H200 GPU. However, the GH200 accounted for only approximately 5% of NVIDIA's high-end GPU shipments. The supply chain has high expectations for the GB200, with projections suggesting that its shipments could exceed millions of units by 2025, potentially making up nearly 40 to 50% of NVIDIA's high-end GPU market.

Although NVIDIA plans to launch products such as the GB200 and B100 in the second half of this year, upstream wafer packaging will need to adopt more complex and high-precision CoWoS-L technology, making the validation and testing process time-consuming. Additionally, more time will be required to optimize the B-series for AI server systems in aspects such as network communication and cooling performance. It is anticipated that the GB200 and B100 products will not see significant production volumes until 4Q24 or 1Q25.

Read full story

Intel Launches Gaudi 3 AI Accelerator: 70% Faster Training, 50% Faster Inference Compared to NVIDIA H100, Promises Better Efficiency Too

by

Apr 9th, 2024 13:59 Discuss (14 Comments)

During the Vision 2024 event, Intel announced its latest Gaudi 3 AI accelerator, promising significant improvements over its predecessor. Intel claims the Gaudi 3 offers up to 70% improvement in training performance, 50% better inference, and 40% better efficiency than Nvidia's H100 processors. The new AI accelerator is presented as a PCIe Gen 5 dual-slot add-in card with a 600 W TDP or an OAM module with 900 W. The PCIe card has the same peak 1,835 TeraFLOPS of FP8 performance as the OAM module despite a 300 W lower TDP. The PCIe version works as a group of four per system, while the OAM HL-325L modules can be run in an eight-accelerator configuration per server. This likely will result in a lower sustained performance, given the lower TDP, but it confirms that the same silicon is used, just finetuned with a lower frequency. Built on TSMC's N5 5 nm node, the AI accelerator features 64 Tensor Cores, delivering double the FP8 and quadruple FP16 performance over the previous generation Gaudi 2.

The Gaudi 3 AI chip comes with 128 GB of HBM2E with 3.7 TB/s of bandwidth and 24 200 Gbps Ethernet NICs, with dual 400 Gbps NICs used for scale-out. All of that is laid out on 10 tiles that make up the Gaudi 3 accelerator, which you can see pictured below. There is 96 MB of SRAM split between two compute tiles, which acts as a low-level cache that bridges data communication between Tensor Cores and HBM memory. Intel also announced support for the new performance-boosting standardized MXFP4 data format and is developing an AI NIC ASIC for Ultra Ethernet Consortium-compliant networking. The Gaudi 3 supports clusters of up to 8192 cards, coming from 1024 nodes comprised of systems with eight accelerators. It is on track for volume production in Q3, offering a cost-effective alternative to NVIDIA accelerators with the additional promise of a more open ecosystem. More information and a deeper dive can be found in the Gaudi 3 Whitepaper.

Chinese Research Institute Utilizing "Banned" NVIDIA H100 AI GPUs

by

Mar 19th, 2024 14:33 Discuss (9 Comments)

NVIDIA's freshly unveiled "Blackwell" B200 and GB200 AI GPUs will be getting plenty of coverage this year, but many organizations will be sticking with current or prior generation hardware. Team Green is in the process of shipping out compromised "Hopper" designs to customers in China, but the region's appetite for powerful AI-crunching hardware is growing. Last year's China-specific H800 design, and the older "Ampere" A800 chip were deemed too potent—new regulations prevented further sales. Recently, AMD's Instinct MI309 AI accelerator was considered "too powerful to gain unconditional approval from the US Department of Commerce." Natively-developed solutions are catching up with Western designs, but some institutions are not prepared to queue up for emerging technologies.

NVIDIA's new H20 AI GPU as well as Ada Lovelace-based L20 PCIe and L2 PCIe models are weakened enough to get a thumbs up from trade regulators, but likely not compelling enough for discerning clients. The Telegraph believes that NVIDIA's uncompromised H100 AI GPU is currently in use at several Chinese establishments—the report cites information presented within four academic papers published on ArXiv, an open access science website. The Telegraph's news piece highlights one of the studies—it was: "co-authored by a researcher at 4paradigm, an AI company that was last year placed on an export control list by the US Commerce Department for attempting to acquire US technology to support China's military." Additionally, the Chinese Academy of Sciences appears to have conducted several AI-accelerated experiments, involving the solving of complex mathematical and logical problems. The article suggests that this research organization has acquired a very small batch of NVIDIA H100 GPUs (up to eight units). A "thriving black market" for high-end NVIDIA processors has emerged in the region—last Autumn, the Center for a New American Security (CNAS) published an in-depth article about ongoing smuggling activities.

Microsoft and NVIDIA Announce Major Integrations to Accelerate Generative AI for Enterprises Everywhere

Press Release by

Mar 18th, 2024 17:40 Discuss (1 Comment)

At GTC on Monday, Microsoft Corp. and NVIDIA expanded their longstanding collaboration with powerful new integrations that leverage the latest NVIDIA generative AI and Omniverse technologies across Microsoft Azure, Azure AI services, Microsoft Fabric and Microsoft 365.

"Together with NVIDIA, we are making the promise of AI real, helping to drive new benefits and productivity gains for people and organizations everywhere," said Satya Nadella, Chairman and CEO, Microsoft. "From bringing the GB200 Grace Blackwell processor to Azure, to new integrations between DGX Cloud and Microsoft Fabric, the announcements we are making today will ensure customers have the most comprehensive platforms and tools across every layer of the Copilot stack, from silicon to software, to build their own breakthrough AI capability."

"AI is transforming our daily lives - opening up a world of new opportunities," said Jensen Huang, founder and CEO of NVIDIA. "Through our collaboration with Microsoft, we're building a future that unlocks the promise of AI for customers, helping them deliver innovative solutions to the world."

Read full story

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

Press Release by

Mar 18th, 2024 16:42 Discuss (2 Comments)

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.

Read full story

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Press Release by

Mar 18th, 2024 16:39 Discuss (20 Comments)

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

Read full story

TSMC and Synopsys Bring Breakthrough NVIDIA Computational Lithography Platform to Production

Press Release by

Mar 18th, 2024 16:32 Discuss (4 Comments)

NVIDIA today announced that TSMC and Synopsys are going into production with NVIDIA's computational lithography platform to accelerate manufacturing and push the limits of physics for the next generation of advanced semiconductor chips. TSMC, the world's leading foundry, and Synopsys, the leader in silicon to systems design solutions, have integrated NVIDIA cuLitho with their software, manufacturing processes and systems to speed chip fabrication, and in the future support the latest-generation NVIDIA Blackwell architecture GPUs.

"Computational lithography is a cornerstone of chip manufacturing," said Jensen Huang, founder and CEO of NVIDIA. "Our work on cuLitho, in partnership with TSMC and Synopsys, applies accelerated computing and generative AI to open new frontiers for semiconductor scaling." NVIDIA also introduced new generative AI algorithms that enhance cuLitho, a library for GPU-accelerated computational lithography, dramatically improving the semiconductor manufacturing process over current CPU-based methods.

Read full story

Gigabyte Unveils Comprehensive and Powerful AI Platforms at NVIDIA GTC

Press Release by

Mar 18th, 2024 15:50 Discuss (0 Comments)

GIGABYTE Technology and Giga Computing, a subsidiary of GIGABYTE and an industry leader in enterprise solutions, will showcase their solutions at the GIGABYTE booth #1224 at NVIDIA GTC, a global AI developer conference running through March 21. This event will offer GIGABYTE the chance to connect with its valued partners and customers, and together explore what the future in computing holds.

The GIGABYTE booth will focus on GIGABYTE's enterprise products that demonstrate AI training and inference delivered by versatile computing platforms based on NVIDIA solutions, as well as direct liquid cooling (DLC) for improved compute density and energy efficiency. Also not to be missed at the NVIDIA booth is the MGX Pavilion, which features a rack of GIGABYTE servers for the NVIDIA GH200 Grace Hopper Superchip architecture.

Read full story

TSMC Reportedly Investing $16 Billion into New CoWoS Facilities

by

Mar 18th, 2024 14:51 Discuss (1 Comment)

TSMC is experiencing unprecedented demand from AI chip customers—unnamed parties have (fancifully) requested the construction of entirely new fabrication facilities. Taiwan's leading semiconductor contract manufacturer seems to concentrating on "sensible" expansions, mainly in the area of CoWoS packaging output—according to an Economic Daily report, company leadership and local government were negotiating over the construction of four new advanced packaging plants. Insiders propose that plans have been revised—an investment in excess of 500 billion yuan ($16 billion) will enable the founding of six new CoWoS-focused facilities. TSMC is expected to make an official announcement next month—industry moles reckon that construction work will start in April. Two (of the six total) advanced packaging plants could become fully operational before the conclusion of 2024.

Lately, TSMC has initiated an ambitious recruitment drive—targeting around 6000 new workers. A touring entity is tasked with the attraction of "talents with high enthusiasm for semiconductors." The majority of new recruits are likely heading to new or expanded Taiwan-based facilities. The Economic Daily report proposes that Chiayi City's technological hub will play host to TSMC's new CoWoS packaging plants. A DigiTimes Asia news piece (from January) posited that TSMC leadership anticipates CoWoS output reaching 44,000 units by the end of 2024. This predicted tally could grow, thanks to the (rumored) activation of additional factories. CoWoS packaging is considered to be a vital aspect of AI accelerators—insiders believe that TSMC's latest investment will boost production of NVIDIA H100 GPUs. The combined output of six new CoWoS plants will assist greatly in the creation of next-gen B100 chips.

Intel Gaudi2 Accelerator Beats NVIDIA H100 at Stable Diffusion 3 by 55%

by

Mar 12th, 2024 01:44 Discuss (8 Comments)

Stability AI, the developers behind the popular Stable Diffusion generative AI model, have run some first-party performance benchmarks for Stable Diffusion 3 using popular data-center AI GPUs, including the NVIDIA H100 "Hopper" 80 GB, A100 "Ampere" 80 GB, and Intel's Gaudi2 96 GB accelerator. Unlike the H100, which is a super-scalar CUDA+Tensor core GPU; the Gaudi2 is purpose-built to accelerate generative AI and LLMs. Stability AI published its performance findings in a blog post, which reveals that the Intel Gaudi2 96 GB is posting a roughly 56% higher performance than the H100 80 GB.

With 2 nodes, 16 accelerators, and a constant batch size of 16 per accelerator (256 in all), the Intel Gaudi2 array is able to generate 927 images per second, compared to 595 images for the H100 array, and 381 images per second for the A100 array, keeping accelerator and node counts constant. Scaling things up a notch to 32 nodes, and 256 accelerators or a batch size of 16 per accelerator (total batch size of 4,096), the Gaudi2 array is posting 12,654 images per second; or 49.4 images per-second per-device; compared to 3,992 images per second or 15.6 images per-second per-device for the older-gen A100 "Ampere" array.

Read full story

NVIDIA Calls for Global Investment into Sovereign AI

Press Release by

Mar 10th, 2024 14:15 Discuss (30 Comments)

Nations have long invested in domestic infrastructure to advance their economies, control their own data and take advantage of technology opportunities in areas such as transportation, communications, commerce, entertainment and healthcare. AI, the most important technology of our time, is turbocharging innovation across every facet of society. It's expected to generate trillions of dollars in economic dividends and productivity gains. Countries are investing in sovereign AI to develop and harness such benefits on their own. Sovereign AI refers to a nation's capabilities to produce artificial intelligence using its own infrastructure, data, workforce and business networks.

Why Sovereign AI Is Important
The global imperative for nations to invest in sovereign AI capabilities has grown since the rise of generative AI, which is reshaping markets, challenging governance models, inspiring new industries and transforming others—from gaming to biopharma. It's also rewriting the nature of work, as people in many fields start using AI-powered "copilots." Sovereign AI encompasses both physical and data infrastructures. The latter includes sovereign foundation models, such as large language models, developed by local teams and trained on local datasets to promote inclusiveness with specific dialects, cultures and practices. For example, speech AI models can help preserve, promote and revitalize indigenous languages. And LLMs aren't just for teaching AIs human languages, but for writing software code, protecting consumers from financial fraud, teaching robots physical skills and much more.

Read full story

NVIDIA Grace Hopper Systems Gather at GTC

Press Release by

Feb 28th, 2024 08:57 Discuss (1 Comment)

The spirit of software pioneer Grace Hopper will live on at NVIDIA GTC. Accelerated systems using powerful processors - named in honor of the pioneer of software programming - will be on display at the global AI conference running March 18-21, ready to take computing to the next level. System makers will show more than 500 servers in multiple configurations across 18 racks, all packing NVIDIA GH200 Grace Hopper Superchips. They'll form the largest display at NVIDIA's booth in the San Jose Convention Center, filling the MGX Pavilion.

MGX Speeds Time to Market
NVIDIA MGX is a blueprint for building accelerated servers with any combination of GPUs, CPUs and data processing units (DPUs) for a wide range of AI, high performance computing and NVIDIA Omniverse applications. It's a modular reference architecture for use across multiple product generations and workloads. GTC attendees can get an up-close look at MGX models tailored for enterprise, cloud and telco-edge uses, such as generative AI inference, recommenders and data analytics. The pavilion will showcase accelerated systems packing single and dual GH200 Superchips in 1U and 2U chassis, linked via NVIDIA BlueField-3 DPUs and NVIDIA Quantum-2 400 Gb/s InfiniBand networks over LinkX cables and transceivers. The systems support industry standards for 19- and 21-inch rack enclosures, and many provide E1.S bays for nonvolatile storage.

Read full story

NVIDIA AI GPU Customers Reportedly Selling Off Excess Hardware

by

Feb 27th, 2024 10:59 Discuss (9 Comments)

The NVIDIA H100 Tensor Core GPU was last year's hot item for HPC and AI industry segments—the largest purchasers were reported to have acquired up to 150,000 units each. Demand grew so much that lead times of 36 to 52 weeks became the norm for H100-based server equipment. The latest rumblings indicate that things have stabilized—so much so that some organizations are "offloading chips" as the supply crunch cools off. Apparently it is more cost-effective to rent AI processing sessions through cloud service providers (CSPs)—the big three being Amazon Web Services, Google Cloud, and Microsoft Azure.

According to a mid-February Seeking Alpha report, wait times for the NVIDIA H100 80 GB GPU model have been reduced down to around three to four months. The Information believes that some companies have already reduced their order counts, while others have hardware sitting around, completely unused. Maintenance complexity and costs are reportedly cited as a main factors in "offloading" unneeded equipment, and turning to renting server time from CSPs. Despite improved supply conditions, AI GPU demand is still growing—driven mainly by organizations dealing with LLM models. A prime example being Open AI—as pointed out by The Information—insider murmurings have Sam Altman & Co. seeking out alternative solutions and production avenues.

Supermicro Accelerates Performance of 5G and Telco Cloud Workloads with New and Expanded Portfolio of Infrastructure Solutions

Press Release by

Feb 26th, 2024 03:11 Discuss (0 Comments)

Supermicro, Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, delivers an expanded portfolio of purpose-built infrastructure solutions to accelerate performance and increase efficiency in 5G and telecom workloads. With one of the industry's most diverse offerings, Supermicro enables customers to expand public and private 5G infrastructures with improved performance per watt and support for new and innovative AI applications. As a long-term advocate of open networking platforms and a member of the O-RAN Alliance, Supermicro's portfolio incorporates systems featuring 5th Gen Intel Xeon processors, AMD EPYC 8004 Series processors, and the NVIDIA Grace Hopper Superchip.

"Supermicro is expanding our broad portfolio of sustainable and state-of-the-art servers to address the demanding requirements of 5G and telco markets and Edge AI," said Charles Liang, president and CEO of Supermicro. "Our products are not just about technology, they are about delivering tangible customer benefits. We quickly bring data center AI capabilities to the network's edge using our Building Block architecture. Our products enable operators to offer new capabilities to their customers with improved performance and lower energy consumption. Our edge servers contain up to 2 TB of high-speed DDR5 memory, 6 PCIe slots, and a range of networking options. These systems are designed for increased power efficiency and performance-per-watt, enabling operators to create high-performance, customized solutions for their unique requirements. This reassures our customers that they are investing in reliable and efficient solutions."

Read full story

Supermicro Unveils New Edge AI Systems

Press Release by

Feb 20th, 2024 20:35 Discuss (0 Comments)

Supermicro, Inc., a Total IT Solution Manufacturer for AI, Cloud, Storage, and 5G/Edge, is expanding its portfolio of AI solutions, allowing customers to leverage the power and capability of AI in edge locations, such as public spaces, retail stores, or industrial infrastructure. Using Supermicro application-optimized servers with NVIDIA GPUs makes it easier to fine-tune pre-trained models and for AI inference solutions to be deployed at the edge where the data is generated, improving response times and decision-making.

"Supermicro has the broadest portfolio of Edge AI solutions, capable of supporting pre-trained models for our customers' edge environments," said Charles Liang, president and CEO of Supermicro. "The Supermicro Hyper-E server, based on the dual 5th Gen Intel Xeon processors, can support up to three NVIDIA H100 Tensor Core GPUs, delivering unparalleled performance for Edge AI. With up to 8 TB of memory in these servers, we are bringing data center AI processing power to edge locations. Supermicro continues to provide the industry with optimized solutions as enterprises build a competitive advantage by processing AI data at their edge locations."

Read full story

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

Press Release by

Feb 19th, 2024 04:37 Discuss (1 Comment)

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools, and capabilities to program hybrid CPU, GPU and QPU systems - as well as, the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.

Read full story

NVIDIA Unveils "Eos" to Public - a Top Ten Supercomputer

Press Release by

Feb 16th, 2024 15:34 Discuss (20 Comments)

Providing a peek at the architecture powering advanced AI factories, NVIDIA released a video that offers the first public look at Eos, its latest data-center-scale supercomputer. An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software. Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. Revealed in November at the Supercomputing 2023 trade show, Eos—named for the Greek goddess said to open the gates of dawn each day—reflects NVIDIA's commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation
Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs. As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more. It's a showcase of what NVIDIA's technologies can do, when working at scale. Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond. To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory—a purpose-built AI engine that's always available and can help ramp their capacity to build AI models at scale Eos delivers. Ranked No. 9 in the TOP 500 list of the world's fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.

Read full story

Lenovo HPC Infrastructure Powers Pre-Exascale Supercomputer Marenostrum 5 to Enable New Scientific Advances and Solve Global Challenges

Press Release by

Jan 30th, 2024 12:27 Discuss (0 Comments)

Lenovo (HKSE: 992) (ADR: LNVGY) has today announced that the General Purpose Partition of the MareNostrum 5, a new pre-exascale supercomputer running on Lenovo's HPC infrastructure, has been classified as the top x86 general-purpose cluster on the recently published TOP500 list of the most powerful supercomputers globally.

Officially inaugurated at Barcelona Supercomputing Center on December 21st, MareNostrum 5 has been built for the European High Performance Computing Joint Undertaking (EuroHPC JU). The pre-exascale supercomputer will bolster the EU's mission to provide Europe with the most advanced supercomputing technology and accelerate the capacity for artificial intelligence (AI) research, enabling new scientific advances that will help solve global challenges. It aims to empower a wide range of complex HPC-specific applications, from climate research and engineering to material science and earth sciences, adeptly handling tasks that extend beyond the capabilities of cloud computing.

Read full story

OpenAI Reportedly Talking to TSMC About Custom Chip Venture

by

Jan 25th, 2024 14:34 Discuss (10 Comments)

OpenAI is reported to be initiating R&D on a proprietary AI processing solution—the research organization's CEO, Sam Altman, has commented on the in-efficient operation of datacenters running NVIDIA H100 and A100 GPUs. He foresees a future scenario where his company becomes less reliant on Team Green's off-the-shelf AI-crunchers, with a deployment of bespoke AI processors. A short Reuters interview also underlined Altman's desire to find alternatives sources of power: "It motivates us to go invest more in (nuclear) fusion." The growth of artificial intelligence industries has put an unprecedented strain on energy providers, so tech firms could be semi-forced into seeking out frugal enterprise hardware.

The Financial Times has followed up on last week's Bloomberg report of OpenAI courting investment partners in the Middle East. FT's news piece alleges that Altman is in talks with billionaire businessman Sheikh Tahnoon bin Zayed al-Nahyan, a very well connected member of the United Arab Emirates Royal Family. OpenAI's leadership is reportedly negotiating with TSMC—The Financial Times alleges that Taiwan's top chip foundry is an ideal manufacturing partner. This revelation contradicts Bloomberg's recent reports of a potential custom OpenAI AI chip venture involving purpose-built manufacturing facilities. The whole project is said to be at an early stage of development, so Altman and his colleagues are most likely exploring a variety of options.

Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

by

Jan 19th, 2024 09:47 Discuss (53 Comments)

Mark Zuckerberg has shared some interesting insights about Meta's AI infrastructure buildout, which is on track to include an astonishing number of NVIDIA H100 Tensor GPUs. In the post on Instagram, Meta's CEO has noted the following: "We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year -- and overall almost 600k H100s equivalents of compute if you include other GPUs." That means that the company will enhance its AI infrastructure with 350,000 H100 GPUs on top of the existing GPUs, which is equivalent to 250,000 H100 in terms of computing power, for a total of 600,000 H100-equivalent GPUs.

The raw number of GPUs installed comes at a steep price. With the average selling price of H100 GPU nearing 30,000 US dollars, Meta's investment will settle the company back around $10.5 billion. Other GPUs should be in the infrastructure, but most will comprise the NVIDIA Hopper family. Additionally, Meta is currently training the LLama 3 AI model, which will be much more capable than the existing LLama 2 family and will include better reasoning, coding, and math-solving capabilities. These models will be open-source. Later down the pipeline, as the artificial general intelligence (AGI) comes into play, Zuckerberg has noted that "Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit." So, expect to see these models in the GitHub repositories in the future.

Samsung and Naver Developing an AI Chip Claiming to be 8x More Power Efficient than NVIDIA H100

by

Dec 22nd, 2023 05:56 Discuss (6 Comments)

Naver, the firm behind the HyperCLOVA X large language model (LLM), has been working with Samsung Electronics toward the development of power-efficient AI accelerators. The collaboration brings Naver's expertise with Samsung's vast systems IP over silicon design, the ability to build complex SoCs, semiconductor fabrication, and its plethora of DRAM technologies. The two recently designed a proof of concept for an upcoming AI chip, which they iterated on an FPGA. Naver claims the AI chip it is co-developing with Samsung will be 8 times more energy efficient than an NVIDIA H100 AI accelerator, but did not elaborate on its actual throughput. Its solution, among other things, leverages energy-efficient LPDDR memory from Samsung. The two companies have been working on this project since December 2022.

Dell Partners with Imbue on New AI Compute Cluster Using Nearly 10,000 NVIDIA H100 GPUs

Press Release by

Nov 30th, 2023 00:03 Discuss (7 Comments)

Dell Technologies and Imbue, an independent AI research company, have entered into a $150 million agreement to build a new high-performance computing cluster for training foundation models optimized for reasoning. Imbue is one of the few independent AI labs that develops its own foundation models, and trains them to have more advanced reasoning capabilities—like knowing when to ask for more information, analyzing and critiquing their own outputs, or breaking down a difficult goal into a plan and then executing on it. Imbue trains AI agents on top of those models that can do work for people across diverse fields in ways that are robust, safe, and useful. Imbue's goal is to create practical tools for building agents that could enable workers across a broad set of domains, including helping engineers write new code, analysts understand and draft complex policy proposals, and much more.

Read full story

TOP500 Update: Frontier Remains No.1 With Aurora Coming in at No. 2

Press Release by

Nov 14th, 2023 11:31 Discuss (13 Comments)

The 62nd edition of the TOP500 reveals that the Frontier system retains its top spot and is still the only exascale machine on the list. However, five new or upgraded systems have shaken up the Top 10.

Housed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, Frontier leads the pack with an HPL score of 1.194 EFlop/s - unchanged from the June 2023 list. Frontier utilizes AMD EPYC 64C 2GHz processors and is based on the latest HPE Cray EX235a architecture. The system has a total of 8,699,904 combined CPU and GPU cores. Additionally, Frontier has an impressive power efficiency rating of 52.59 GFlops/watt and relies on HPE's Slingshot 11 network for data transfer.

Read full story

Supermicro Expands AI Solutions with the Upcoming NVIDIA HGX H200 and MGX Grace Hopper Platforms Featuring HBM3e Memory

Press Release by

Nov 13th, 2023 12:06 Discuss (3 Comments)

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is expanding its AI reach with the upcoming support for the new NVIDIA HGX H200 built with H200 Tensor Core GPUs. Supermicro's industry leading AI platforms, including 8U and 4U Universal GPU Systems, are drop-in ready for the HGX H200 8-GPU, 4-GPU, and with nearly 2x capacity and 1.4x higher bandwidth HBM3e memory compared to the NVIDIA H100 Tensor Core GPU. In addition, the broadest portfolio of Supermicro NVIDIA MGX systems supports the upcoming NVIDIA Grace Hopper Superchip with HBM3e memory. With unprecedented performance, scalability, and reliability, Supermicro's rack scale AI solutions accelerate the performance of computationally intensive generative AI, large language Model (LLM) training, and HPC applications while meeting the evolving demands of growing model sizes. Using the building block architecture, Supermicro can quickly bring new technology to market, enabling customers to become more productive sooner.

Supermicro is also introducing the industry's highest density server with NVIDIA HGX H100 8-GPUs systems in a liquid cooled 4U system, utilizing the latest Supermicro liquid cooling solution. The industry's most compact high performance GPU server enables data center operators to reduce footprints and energy costs while offering the highest performance AI training capacity available in a single rack. With the highest density GPU systems, organizations can reduce their TCO by leveraging cutting-edge liquid cooling solutions.

Read full story

GIGABYTE Demonstrates the Future of Computing at Supercomputing 2023 with Advanced Cooling and Scaled Data Centers

Press Release by

Nov 13th, 2023 11:23 Discuss (0 Comments)

GIGABYTE Technology, Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, continues to be a leader in cooling IT hardware efficiently and in developing diverse server platforms for Arm and x86 processors, as well as AI accelerators. At SC23, GIGABYTE (booth #355) will showcase some standout platforms, including for the NVIDIA GH200 Grace Hopper Superchip and next-gen AMD Instinct APU. To better introduce its extensive lineup of servers, GIGABYTE will address the most important needs in supercomputing data centers, such as how to cool high-performance IT hardware efficiently and power AI that is capable of real-time analysis and fast time to results.

Advanced Cooling
For many data centers, it is becoming apparent that their cooling infrastructure must radically shift to keep pace with new IT hardware that continues to generate more heat and requires rapid heat transfer. Because of this, GIGABYTE has launched advanced cooling solutions that allow IT hardware to maintain ideal performance while being more energy-efficient and maintaining the same data center footprint. At SC23, its booth will have a single-phase immersion tank, the A1P0-EA0, which offers a one-stop immersion cooling solution. GIGABYTE is experienced in implementing immersion cooling with immersion-ready servers, immersion tanks, oil, tools, and services spanning the globe. Another cooling solution showcased at SC23 will be direct liquid cooling (DLC), and in particular, the new GIGABYTE cold plates and cooling modules for the NVIDIA Grace CPU Superchip, NVIDIA Grace Hopper Superchip, AMD EPYC 9004 processor, and 4th Gen Intel Xeon processor.

Read full story

Return to Keyword Browsing

Apr 30th, 2024 20:49 EDT change timezone

Latest GPU Drivers

New Forum Posts

20:48 by TPCEA
Is there a formula to help normalize temperature testing when ambient is variable? (21)
20:35 by fehveiga
RX580 2048SP 8GB Mllse (0)
20:30 by Wirko
Would you guys be ok with 70C idle temp on NVME storage. (19)
20:21 by cvaldes
Arctic MX-6 shelf life is just a couple months? (42)
20:20 by neatfeatguy
TPU Merch (9)
19:50 by Steevo
Need help with a persistent infection possible rootkit or other device. (1)
19:48 by TPCEA
What's an inexpensive AIO product line with a strong pump and low price? (88)
19:45 by phill
Have you got pie today? (16322)
19:33 by phill
Milestones (13876)
19:31 by phill
WCG Daily Numbers (12502)

Popular Reviews

Apr 26th, 2024 Ugreen NASync DXP4800 Plus Review
Apr 29th, 2024 Team Group T-Force Vulcan ECO DDR5-6000 32 GB CL38 Review
Apr 25th, 2024 HYTE THICC Q60 240 mm AIO Review
Feb 12th, 2024 Upcoming Hardware Launches 2023 (Updated Feb 2024)
Apr 22nd, 2024 MOONDROP x Crinacle DUSK In-Ear Monitors Review - The Last 5%
Apr 17th, 2024 Thermalright Phantom Spirit 120 EVO Review
Apr 5th, 2023 AMD Ryzen 7 7800X3D Review - The Best Gaming CPU
Apr 18th, 2024 FiiO K19 Desktop DAC/Headphone Amplifier Review
Apr 12th, 2024 ASUS Radeon RX 7900 GRE TUF OC Review
Apr 30th, 2024 Montech Sky Two GX Review

Controversial News Posts