News Posts matching #HBM3


NVIDIA to Sell Over One Million H20 GPUs to China, Taking Home $12 Billion

When NVIDIA started preparing the H20 GPU for China, the company anticipated strong demand for sanction-compliant GPUs. We now know precisely what that venture is worth: an astonishing $12 billion in revenue. With demand for NVIDIA GPUs running high, Chinese AI research labs are acquiring as many as they can get their hands on. According to a Financial Times report citing SemiAnalysis, NVIDIA will sell over one million H20 GPUs in China this year. That far outweighs the 550,000 home-grown Huawei Ascend 910B accelerators that Chinese companies plan to source. While we don't know whether Chinese semiconductor makers like SMIC are capable of producing more chips, or whether demand simply isn't as high, we do know why NVIDIA's H20 chips are the primary target.

On paper, the Huawei Ascend 910B is the stronger chip. Measured by Total Processing Performance (TPP)—a metric developed by the US government to track GPU performance, computed as TeraFLOPS multiplied by the bit length of the operation—the Ascend 910B scores over 5,000, while the NVIDIA H20 comes in at 2,368 TPP, roughly half that of the Huawei accelerator. However, SemiAnalysis notes that real-world performance actually favors the H20 thanks to its better memory configuration, including higher HBM3 memory bandwidth. That makes it the better alternative to the Ascend 910B, accounting for the estimated one million-plus GPUs shipped to China this year. At an average price of $12,000 per H20 GPU, that $12 billion in Chinese revenue will undoubtedly help lift NVIDIA's 2024 results even further.
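
TPP is straightforward to compute: peak TeraFLOPS (or TOPS) multiplied by the bit length of the operation. A minimal sketch, assuming the commonly cited dense FP16 throughput figures of roughly 148 TFLOPS for the H20 and roughly 320 TFLOPS for the Ascend 910B (these throughput numbers are illustrative approximations, not taken from the report):

```python
# Total Processing Performance (TPP) = peak TeraFLOPS x bit length of the
# operation. The throughput figures below are approximate public spec
# numbers used only for illustration.

def tpp(teraflops: float, bit_length: int) -> float:
    """US export-control metric: peak TFLOPS times operand bit length."""
    return teraflops * bit_length

h20_tpp = tpp(148, 16)          # ~148 dense FP16 TFLOPS -> 2,368 TPP
ascend_910b_tpp = tpp(320, 16)  # ~320 FP16 TFLOPS -> 5,120 TPP

print(h20_tpp, ascend_910b_tpp)
```

The H20 result matches the 2,368 TPP quoted above, and the Ascend 910B lands above the 5,000 mark.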

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched makes an application-specific integrated circuit (ASIC) that processes only transformers. The transformer is a deep learning architecture developed by Google and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic's Claude, Google Gemini, and Meta's Llama family. Etched set out to build an ASIC dedicated solely to transformer models, a chip called Sohu. The claim is that Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude: where a server with eight NVIDIA H100 GPUs pushes Llama-3 70B at 25,000 tokens per second, and the latest eight-GPU B200 "Blackwell" cluster manages 43,000 tokens/s, eight Sohu chips output 500,000 tokens per second.

Why is this important? Not only does the ASIC outperform Hopper by 20x and Blackwell by 10x, it serves so many tokens per second that it enables an entirely new class of AI applications requiring real-time output. The Sohu architecture is efficient enough to utilize 90% of available FLOPS, whereas traditional GPUs typically achieve only 30-40% utilization—inefficiency and wasted power that Etched hopes to eliminate with an accelerator dedicated to powering transformers (the "T" in GPT) at massive scale. Given that frontier model development costs more than a billion US dollars, and hardware costs are measured in the tens of billions, an accelerator dedicated to a single application class could help advance AI faster. AI researchers often say that "scale is all you need" (echoing the legendary "Attention Is All You Need" paper), and Etched wants to build on that.
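
The claimed speedups follow directly from the throughput figures quoted above:

```python
# Speedups implied by Etched's throughput claims (Llama-3 70B,
# eight-accelerator servers in each configuration).

tokens_per_s = {"8x H100": 25_000, "8x B200": 43_000, "8x Sohu": 500_000}
sohu = tokens_per_s["8x Sohu"]

for system in ("8x H100", "8x B200"):
    speedup = sohu / tokens_per_s[system]
    print(f"Sohu vs {system}: {speedup:.1f}x")  # 20.0x and 11.6x
```

The Hopper comparison is exactly 20x; the Blackwell comparison works out to about 11.6x, which the announcement rounds to an order of 10x.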

SK hynix Showcases Its New AI Memory Solutions at HPE Discover 2024

SK hynix has returned to Las Vegas to showcase its leading AI memory solutions at HPE Discover 2024, Hewlett Packard Enterprise's (HPE) annual technology conference. Held from June 17-20, HPE Discover 2024 features a packed schedule with more than 150 live demonstrations, as well as technical sessions, exhibitions, and more. This year, attendees can also benefit from three new curated programs on edge computing and networking, hybrid cloud technology, and AI. Under the slogan "Memory, The Power of AI," SK hynix is displaying its latest memory solutions at the event including those supplied to HPE. The company is also taking advantage of the numerous networking opportunities to strengthen its relationship with the host company and its other partners.

The World's Leading Memory Solutions Driving AI
SK hynix's booth at HPE Discover 2024 consists of three product sections and a demonstration zone which showcase the unprecedented capabilities of its AI memory solutions. The first section features the company's groundbreaking memory solutions for AI, including HBM solutions. In particular, the industry-leading HBM3E has emerged as a core product to meet the growing demands of AI systems due to its exceptional processing speed, capacity, and heat dissipation. A key solution from the company's CXL lineup, CXL Memory Module-DDR5 (CMM-DDR5), is also on display in this section. In the AI era where high performance and capacity are vital, CMM-DDR5 has gained attention for its ability to expand system bandwidth by up to 50% and capacity by up to 100% compared to systems only equipped with DDR5 DRAM.

SK hynix Showcases Its Next-Gen Solutions at Computex 2024

SK hynix presented its leading AI memory solutions at COMPUTEX Taipei 2024 from June 4-7. As one of Asia's premier IT shows, COMPUTEX Taipei 2024 welcomed around 1,500 global participants including tech companies, venture capitalists, and accelerators under the theme "Connecting AI". Making its debut at the event, SK hynix underlined its position as a first mover and leading AI memory provider through its lineup of next-generation products.

"Connecting AI" With the Industry's Finest AI Memory Solutions
Themed "Memory, The Power of AI," SK hynix's booth featured its advanced AI server solutions, groundbreaking technologies for on-device AI PCs, and outstanding consumer SSD products. HBM3E, the fifth generation of HBM, was among the AI server solutions on display. Offering industry-leading data processing speeds of 1.18 terabytes (TB) per second, vast capacity, and advanced heat dissipation, HBM3E is optimized to meet the requirements of AI servers and other applications. Another technology that has become crucial for AI servers is CXL, as it can increase system bandwidth and processing capacity. SK hynix highlighted the strength of its CXL portfolio by presenting its CXL Memory Module-DDR5 (CMM-DDR5), which significantly expands system bandwidth and capacity compared to systems equipped only with DDR5. Other AI server solutions on display included the server DRAM products DDR5 RDIMM and MCR DIMM. Notably, SK hynix showcased its tall 128-gigabyte (GB) MCR DIMM for the first time at an exhibition.

Blackwell Shipments Imminent, Total CoWoS Capacity Expected to Surge by Over 70% in 2025

TrendForce reports that NVIDIA's Hopper H100 began to see a reduction in shortages in 1Q24. The new H200 from the same platform is expected to gradually ramp in Q2, with the Blackwell platform entering the market in Q3 and expanding to data center customers in Q4. However, this year will still primarily focus on the Hopper platform, which includes the H100 and H200 product lines. The Blackwell platform—based on how far supply chain integration has progressed—is expected to start ramping up in Q4, accounting for less than 10% of the total high-end GPU market.

The die size of Blackwell platform chips like the B100 is twice that of the H100. As Blackwell becomes mainstream in 2025, the total capacity of TSMC's CoWoS is projected to grow by 150% in 2024 and by over 70% in 2025, with NVIDIA's demand occupying nearly half of this capacity. For HBM, the NVIDIA GPU platform's evolution sees the H100 primarily using 80 GB of HBM3, while the 2025 B200 will feature 288 GB of HBM3e—a 3-4 fold increase in capacity per chip. The three major manufacturers' expansion plans indicate that HBM production volume will likely double by 2025.
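
The per-chip capacity jump and the compounded CoWoS expansion can be checked with quick arithmetic on the figures above:

```python
# Capacity-per-GPU and CoWoS growth, computed from the TrendForce figures.

h100_hbm_gb, b200_hbm_gb = 80, 288
fold = b200_hbm_gb / h100_hbm_gb
print(f"HBM per GPU: {fold:.1f}x")  # 3.6x, within the stated 3-4 fold range

cowos_2023 = 1.0                      # normalized 2023 baseline
cowos_2024 = cowos_2023 * (1 + 1.50)  # +150% growth in 2024
cowos_2025 = cowos_2024 * (1 + 0.70)  # +70% growth in 2025
print(f"CoWoS vs 2023: {cowos_2025:.2f}x")  # ~4.25x the 2023 level
```

Compounding the two growth rates shows total CoWoS capacity ending 2025 at more than four times its 2023 level.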

Samsung Could Start 1nm Mass Production Sooner Than Expected

Samsung's Foundry business is set to announce its technology roadmap and plans to strengthen the foundry ecosystem at the Foundry and SAFE Forum in Silicon Valley from June 12 to 13. Notably, Samsung is expected to pull its 1 nm mass production plan forward from the originally scheduled 2027 to 2026. The move comes as something of a surprise, given recent rumors (denied by Samsung) that its HBM3 and HBM3E chips run too hot and have failed NVIDIA's validation.

Samsung became the first foundry in the world to mass-produce chips on a 3 nm process, in June 2022. The company plans to start mass production of its second-generation 3 nm process in 2024 and its 2 nm process in 2025. Speculation suggests Samsung may integrate these nodes and potentially begin mass-producing 2 nm chips as early as the second half of 2024. In comparison, rival TSMC aims to reach the A16 node (1.6 nm) in 2027 and to start mass production of its 1.4 nm process around 2027-2028.

NVIDIA Reportedly Having Issues with Samsung's HBM3 Chips Running Too Hot

According to Reuters, NVIDIA is having some major issues with Samsung's HBM3 chips, as it hasn't managed to finalise its validation of the parts. Reuters cites multiple sources familiar with the matter, and if they are correct, Samsung is having serious issues with its HBM3 chips. Not only do the chips run hot—a big issue in itself, as NVIDIA already struggles to cool some of its higher-end products—but power consumption is apparently not where it should be either. Samsung is said to have been trying to get its HBM3 and HBM3E parts validated by NVIDIA since sometime in 2023, according to Reuters' sources, which suggests there have been issues for at least six months, if not longer.

The sources claim there are issues with both the 8- and 12-layer stacks of Samsung's HBM3E parts, suggesting that NVIDIA might only be able to source parts from Micron and SK Hynix for now—the latter of which has been supplying HBM3 chips to NVIDIA since mid-2022 and HBM3E chips since March of this year. It's unclear whether this is a production issue at Samsung's DRAM fabs, a packaging-related issue, or something else entirely. The Reuters piece goes on to speculate that Samsung hasn't had enough time to develop its HBM parts compared to its competitors and that the product was rushed, but Samsung told the publication it is a matter of customising the product for its customers' needs. Samsung added that it is in "the process of optimising its products through close collaboration with customers," without naming the customer(s), and issued a further statement saying that "claims of failing due to heat and power consumption are not true" and that testing is proceeding as expected.

HBM3e Production Surge Expected to Make Up 35% of Advanced Process Wafer Input by End of 2024

TrendForce reports that the three largest DRAM suppliers are increasing wafer input for advanced processes. Following a rise in memory contract prices, companies have boosted their capital investments, with capacity expansion focusing on the second half of this year. It is expected that wafer input for 1alpha nm and above processes will account for approximately 40% of total DRAM wafer input by the end of the year.

HBM production will be prioritized due to its profitability and increasing demand. However, limited yields of around 50-60% and a wafer area 60% larger than DRAM products mean a higher proportion of wafer input is required. Based on the TSV capacity of each company, HBM is expected to account for 35% of advanced process wafer input by the end of this year, with the remaining wafer capacity used for LPDDR5(X) and DDR5 products.
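
As a rough sketch of the wafer-input penalty: good bit output per wafer scales with yield and inversely with die area. The HBM yield and die-area figures are the TrendForce estimates quoted above; the ~85% DDR5 reference yield is an assumed value for comparison only:

```python
# Good bit output per wafer scales with yield and inversely with die area,
# so HBM's wafer-input share outruns its bit share. HBM figures are the
# TrendForce estimates above; the DDR5 yield is an assumed reference value.

def output_per_wafer(yield_rate: float, relative_die_area: float) -> float:
    return yield_rate / relative_die_area

ddr5 = output_per_wafer(0.85, 1.0)  # assumed mature DDR5 yield
hbm = output_per_wafer(0.55, 1.6)   # ~50-60% yield, ~60% larger die

wafer_multiplier = ddr5 / hbm
print(f"HBM needs ~{wafer_multiplier:.1f}x the wafers per good bit vs DDR5")
```

Under these assumptions each good HBM bit consumes roughly two and a half times the wafer area of a DDR5 bit, which is why HBM's 35% wafer-input share far exceeds its bit share.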

DRAM Contract Prices for Q2 Adjusted to a 13-18% Increase; NAND Flash around 15-20%

TrendForce's latest forecasts reveal that contract prices for DRAM in the second quarter are expected to increase by 13-18%, while NAND Flash contract prices have been adjusted to a 15-20% increase. Only eMMC/UFS will see a smaller price increase of about 10%.

Before the 4/03 earthquake, TrendForce had initially predicted that DRAM contract prices would see a seasonal rise of 3-8% and NAND Flash 13-18%, significantly tapering from Q1 as seen from spot price indicators which showed weakening price momentum and reduced transaction volumes. This was primarily due to subdued demand outside of AI applications, particularly with no signs of recovery in demand for notebooks and smartphones. Inventory levels were gradually increasing, especially among PC OEMs. Additionally, with DRAM and NAND Flash prices having risen for 2-3 consecutive quarters, the willingness of buyers to accept further substantial price increases had diminished.

HBM Prices to Increase by 5-10% in 2025, Accounting for Over 30% of Total DRAM Value

Avril Wu, TrendForce Senior Research Vice President, reports that the HBM market is poised for robust growth, driven by significant pricing premiums and increased capacity needs for AI chips. HBM's unit sales price is several times higher than that of conventional DRAM and about five times that of DDR5. This pricing, combined with product iterations in AI chip technology that increase single-device HBM capacity, is expected to dramatically raise HBM's share in both the capacity and market value of the DRAM market from 2023 to 2025. Specifically, HBM's share of total DRAM bit capacity is estimated to rise from 2% in 2023 to 5% in 2024 and surpass 10% by 2025. In terms of market value, HBM is projected to account for more than 20% of the total DRAM market value starting in 2024, potentially exceeding 30% by 2025.
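
The value-share projection is consistent with the pricing premium. A quick sanity check, assuming a uniform ~5x price-per-bit multiple over conventional DRAM (the multiple quoted above relative to DDR5):

```python
# With HBM at roughly five times the price-per-bit of conventional DRAM,
# a ~10% bit share in 2025 implies a value share above 30%.

def value_share(bit_share: float, price_multiple: float) -> float:
    hbm_value = bit_share * price_multiple
    other_value = (1 - bit_share) * 1.0  # conventional DRAM at baseline price
    return hbm_value / (hbm_value + other_value)

share_2025 = value_share(0.10, 5.0)
print(f"{share_2025:.0%}")  # ~36%, consistent with ">30% by 2025"
```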

2024 sees HBM demand growth rate near 200%, set to double in 2025
Wu also pointed out that negotiations for 2025 HBM pricing have already commenced in 2Q24. However, due to the limited overall capacity of DRAM, suppliers have preliminarily increased prices by 5-10% to manage capacity constraints, affecting HBM2e, HBM3, and HBM3e. This early negotiation phase is attributed to three main factors: Firstly, HBM buyers maintain high confidence in AI demand prospects and are willing to accept continued price increases.

SK hynix Strengthens AI Memory Leadership & Partnership With Host at the TSMC 2024 Tech Symposium

SK hynix showcased its next-generation technologies and strengthened key partnerships at the TSMC 2024 Technology Symposium held in Santa Clara, California on April 24. At the event, the company displayed its industry-leading HBM AI memory solutions and highlighted its collaboration with TSMC involving the host's CoWoS advanced packaging technology.

TSMC, a global semiconductor foundry, invites its major partners to this annual conference in the first half of each year to share their new products and technologies. Attending under the slogan "Memory, the Power of AI," SK hynix drew significant attention for presenting the industry's most powerful AI memory solution, HBM3E. The product recently demonstrated industry-leading performance, achieving input/output (I/O) transfer speeds of up to 10 gigabits per second (Gbps) per pin in an AI system during a performance validation evaluation.

Demand for NVIDIA's Blackwell Platform Expected to Boost TSMC's CoWoS Total Capacity by Over 150% in 2024

NVIDIA's next-gen Blackwell platform, which includes B-series GPUs and integrates NVIDIA's own Grace Arm CPU in models such as the GB200, represents a significant development. TrendForce points out that the GB200 and its predecessor, the GH200, both feature a combined CPU+GPU solution, primarily equipped with the NVIDIA Grace CPU and H200 GPU. However, the GH200 accounted for only approximately 5% of NVIDIA's high-end GPU shipments. The supply chain has high expectations for the GB200, with projections suggesting that its shipments could exceed a million units in 2025, potentially making up nearly 40-50% of NVIDIA's high-end GPU market.

Although NVIDIA plans to launch products such as the GB200 and B100 in the second half of this year, upstream wafer packaging will need to adopt more complex and high-precision CoWoS-L technology, making the validation and testing process time-consuming. Additionally, more time will be required to optimize the B-series for AI server systems in aspects such as network communication and cooling performance. It is anticipated that the GB200 and B100 products will not see significant production volumes until 4Q24 or 1Q25.

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.

2024 HBM Supply Bit Growth Estimated to Reach 260%, Making Up 14% of DRAM Industry

TrendForce reports that significant capital investments have occurred in the memory sector due to the high ASP and profitability of HBM. Senior Vice President Avril Wu notes that by the end of 2024, the DRAM industry is expected to allocate approximately 250,000 wafers per month—about 14% of total capacity—to HBM TSV production, with an estimated annual supply bit growth of around 260%. Additionally, HBM's revenue share within the DRAM industry—around 8.4% in 2023—is projected to increase to 20.1% by the end of 2024.
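
The capacity allocation implies an industry-wide total:

```python
# If ~250K wafers/month of HBM TSV capacity is 14% of total DRAM capacity,
# the implied industry-wide total follows directly.

hbm_tsv_wpm = 250_000
hbm_share = 0.14
total_dram_wpm = hbm_tsv_wpm / hbm_share
print(f"Implied total DRAM capacity: ~{total_dram_wpm / 1e6:.2f}M wafers/month")
```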

HBM supply tightens with order volumes rising continuously into 2024
Wu explains that in terms of production differences between HBM and DDR5, an HBM die is generally 35-45% larger than a DDR5 die of the same process and capacity (comparing a 24Gb HBM die with a 24Gb DDR5 die, for example). The yield rate (including TSV packaging) for HBM is approximately 20-30% lower than that of DDR5, and the production cycle (including TSV) is 1.5 to 2 months longer.

HBM3 Initially Exclusively Supplied by SK Hynix, Samsung Rallies Fast After AMD Validation

TrendForce highlights the current landscape of the HBM market, which as of early 2024, is primarily focused on HBM3. NVIDIA's upcoming B100 or H200 models will incorporate advanced HBM3e, signaling the next step in memory technology. The challenge, however, is the supply bottleneck caused by both CoWoS packaging constraints and the inherently long production cycle of HBM—extending the timeline from wafer initiation to the final product beyond two quarters.

The current HBM3 supply for NVIDIA's H100 solution is primarily met by SK hynix, leading to a supply shortfall in meeting burgeoning AI market demands. Samsung's entry into NVIDIA's supply chain with its 1Znm HBM3 products in late 2023, though initially minor, signifies its breakthrough in this segment.

DRAM Industry Sees Nearly 30% Revenue Growth in 4Q23 Due to Rising Prices and Volume

TrendForce reports a 29.6% QoQ increase in DRAM industry revenue for 4Q23, reaching US$17.46 billion, propelled by revitalized stockpiling efforts and strategic production control by leading manufacturers. Looking ahead to 1Q24, the intent to further enhance profitability is evident, with a projected near-20% increase in DRAM contract prices—albeit with a slight decrease in shipment volumes due to the traditional off-season.

Samsung led the pack with the highest revenue growth among the top manufacturers in Q4 as it jumped 50% QoQ to hit $7.95 billion, largely due to a surge in 1alpha nm DDR5 shipments, boosting server DRAM shipments by over 60%. SK hynix saw a modest 1-3% rise in shipment volumes but benefited from the pricing advantage of HBM and DDR5, especially from high-density server DRAM modules, leading to a 17-19% increase in ASP and a 20.2% rise in revenue to $5.56 billion. Micron witnessed growth in both volume and price, with a 4-6% increase in each, resulting in a more moderate revenue growth of 8.9%, totaling $3.35 billion for the quarter due to its comparatively lower share of DDR5 and HBM.

Samsung Develops Industry-First 36GB HBM3E 12H DRAM

Samsung Electronics, a world leader in advanced memory technology, today announced that it has developed HBM3E 12H, the industry's first 12-stack HBM3E DRAM and the highest-capacity HBM product to date. Samsung's HBM3E 12H provides an all-time high bandwidth of up to 1,280 gigabytes per second (GB/s) and an industry-leading capacity of 36 gigabytes (GB). In comparison to the 8-stack HBM3 8H, both aspects have improved by more than 50%.
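
The headline figures decompose cleanly. A sketch assuming the standard 1024-bit HBM interface and 24 Gb DRAM dies (the per-pin speed here is inferred from the stated bandwidth, not quoted by Samsung):

```python
# HBM3E 12H capacity: 12 stacked dies of 24 Gb each.
dies, die_gbit = 12, 24
capacity_gb = dies * die_gbit / 8
print(f"Capacity: {capacity_gb:.0f} GB")  # 36 GB

# Bandwidth: standard 1024-bit HBM interface; ~10 Gb/s per pin is implied
# by the stated 1,280 GB/s figure.
bus_bits, pin_gbps = 1024, 10.0
bandwidth_gbps = bus_bits * pin_gbps / 8
print(f"Per-stack bandwidth: {bandwidth_gbps:.0f} GB/s")  # 1,280 GB/s
```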

"The industry's AI service providers are increasingly requiring HBM with higher capacity, and our new HBM3E 12H product has been designed to answer that need," said Yongcheol Bae, Executive Vice President of Memory Product Planning at Samsung Electronics. "This new memory solution forms part of our drive toward developing core technologies for high-stack HBM and providing technological leadership for the high-capacity HBM market in the AI era."

Google's Gemma Optimized to Run on NVIDIA GPUs, Gemma Coming to Chat with RTX

NVIDIA, in collaboration with Google, today launched optimizations across all NVIDIA AI platforms for Gemma—Google's state-of-the-art new lightweight 2 billion- and 7 billion-parameter open language models that can be run anywhere, reducing costs and speeding innovative work for domain-specific use cases.

Teams from the companies worked closely together to accelerate the performance of Gemma—built from the same research and technology used to create the Gemini models—with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference, when running on NVIDIA GPUs in the data center, in the cloud and on PCs with NVIDIA RTX GPUs. This allows developers to target the installed base of over 100 million NVIDIA RTX GPUs available in high-performance AI PCs globally.

SK Hynix Targets HBM3E Launch This Year, HBM4 by 2026

SK Hynix unveiled ambitious High Bandwidth Memory (HBM) roadmaps at SEMICON Korea 2024. Vice President Kim Chun-hwan announced plans to mass-produce the cutting-edge HBM3E within the first half of 2024, noting that 8-layer stack samples have already been supplied to clients. This iteration makes major strides toward fulfilling surging data bandwidth demands, offering 1.2 TB/s per stack and 7.2 TB/s in a six-stack configuration. Kim cites the rapid emergence of generative AI, forecast to grow at a 35% CAGR, as a key driver, and warns that "fierce survival competition" lies ahead across the semiconductor industry amid rising customer expectations. With limits approaching on conventional process-node shrinks, attention is shifting to next-generation memory architectures and materials to unlock performance.

SK Hynix has already initiated HBM4 development, targeting sampling in 2025 and mass production the following year. According to Micron, HBM4 will leverage a wider 2048-bit interface—double that of previous HBM generations—to increase theoretical peak bandwidth per stack to over 1.5 TB/s. To achieve this bandwidth while maintaining reasonable power consumption, HBM4 targets a data transfer rate of around 6 GT/s: the wider interface does the heavy lifting, so impractically high per-pin speeds can be avoided, meeting the demands of high-performance computing and AI workloads without sacrificing power efficiency. Samsung is aligned on a similar 2025/2026 timeline. Beyond raw bandwidth, custom HBM solutions will become increasingly crucial: Samsung executive Jaejune Kim reveals that over half of its HBM volume already comprises specialized products, and further tailoring HBM4 to individual client needs through logic-die integration presents an opportunity to cement leadership. As AI workloads evolve at breakneck speed, memory innovation must keep pace; with HBM3E prepping for launch and HBM4 on the roadmap, SK Hynix and Samsung are gearing up for the challenges ahead.
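
The HBM4 bandwidth math checks out against the interface width and transfer rate quoted above. The 9.2 GT/s HBM3E comparison point below is an assumed value, chosen because it reproduces SK hynix's stated 1.18 TB/s per stack:

```python
# Per-stack bandwidth = interface width (bits) x transfer rate (GT/s) / 8,
# converted to TB/s.

def stack_bandwidth_tbps(bus_bits: int, gt_per_s: float) -> float:
    return bus_bits * gt_per_s / 8 / 1000

hbm4 = stack_bandwidth_tbps(2048, 6.0)   # ~1.54 TB/s, i.e. "over 1.5 TB/s"
hbm3e = stack_bandwidth_tbps(1024, 9.2)  # ~1.18 TB/s, matching HBM3E figures
print(f"HBM4: {hbm4:.2f} TB/s, HBM3E: {hbm3e:.2f} TB/s")
```

Doubling the interface lets HBM4 exceed HBM3E's bandwidth while running each pin at roughly two-thirds the speed.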

NVIDIA Faces AI Chip Shortages, Turns to Intel for Advanced Packaging Services

NVIDIA's supply of AI chips remains tight due to insufficient advanced packaging production capacity from key partner TSMC. As per the UDN report, NVIDIA will add Intel as a provider of advanced packaging services to help ease the constraints. Intel is expected to start supplying NVIDIA with a monthly advanced packaging capacity of about 5,000 units in Q2 at the earliest. While TSMC will remain NVIDIA's primary packaging partner, Intel's participation significantly boosts NVIDIA's total production capacity by nearly 10%. Even after Intel comes online, TSMC will still account for the lion's share—about 90% of NVIDIA's advanced packaging needs. TSMC is also aggressively expanding capacity, with monthly production expected to reach nearly 50,000 units in Q1, a 25% increase over December 2023. Intel has advanced packaging facilities in the U.S. and is expanding its capacity in Penang. The company has an open model, allowing customers to leverage its packaging solutions separately.
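
A back-of-envelope check on the "nearly 10%" figure from the monthly unit numbers above:

```python
# Intel's 5,000 units/month against TSMC's ~50,000 units/month.

intel_upm, tsmc_upm = 5_000, 50_000
boost = intel_upm / tsmc_upm                       # capacity added vs TSMC alone
intel_share = intel_upm / (intel_upm + tsmc_upm)   # Intel's share of the total
print(f"Capacity boost: {boost:.0%}, Intel share: {intel_share:.1%}")
```

Intel adds 10% on top of TSMC's output, which works out to about a 9% share of the combined total—consistent with TSMC retaining "about 90%" of NVIDIA's advanced packaging needs.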

The AI chip shortages stemmed from insufficient advanced packaging capacity, tight HBM3 memory supply, and overordering by some cloud providers. These constraints are now easing faster than anticipated. The additional supply will benefit AI server providers like Quanta, Inventec and GIGABYTE. Quanta stated that the demand for AI servers remains robust, with the main limitation being chip supply. Both Inventec and GIGABYTE expect strong AI server shipment growth this year as supply issues resolve. The ramping capacity from TSMC and Intel in advanced packaging and improvements upstream suggest the AI supply crunch may be loosening. This would allow cloud service providers to continue the rapid deployment of AI workloads.

SK hynix Reports Financial Results for 2023, 4Q23

SK hynix Inc. announced today that it recorded an operating profit of 346 billion won in the fourth quarter of last year amid a recovery of the memory chip market, marking the first quarter of profit following four straight quarters of losses. The company posted revenues of 11.31 trillion won, operating profit of 346 billion won (operating profit margin at 3%), and net loss of 1.38 trillion won (net profit margin at negative 12%) for the three months ended December 31, 2023. (Based on K-IFRS)

SK hynix said that the overall memory market conditions improved in the last quarter of 2023 with demand for AI server and mobile applications increasing and average selling price (ASP) rising. "We recorded the first quarterly profit in a year following efforts to focus on profitability," it said. The financial results of the last quarter helped narrow the operating loss for the entire year to 7.73 trillion won (operating profit margin at negative 24%) and net loss to 9.14 trillion won (with net profit margin at negative 28%). The revenues were 32.77 trillion won.

HBM Industry Revenue Could Double by 2025 - Growth Driven by Next-gen AI GPUs Cited

Samsung, SK hynix, and Micron are considered the top manufacturers of High Bandwidth Memory (HBM)—the HBM3 and HBM3E standards are in increasing demand due to the widespread deployment of GPUs and accelerators by generative AI companies. Taiwan's Commercial Times proposes that there is an ongoing shortage of HBM components—but this presents a growth opportunity for smaller manufacturers in the region. Naturally, the big-name producers are expected to dive in headfirst with the development of next-generation models. The article cites research conducted by Gartner, which predicts the HBM market will hit an all-time high of $4.976 billion (USD) by 2025.

This estimate is almost double the roughly $2 billion in revenue the HBM market is projected to have generated in 2023—the explosive growth of generative AI applications has boosted demand for the most performant memory standards. The Commercial Times report states that SK Hynix is the current HBM3E leader, with Micron and Samsung trailing behind—industry experts believe the stragglers will need to "expand HBM production capacity" in order to stay competitive. SK Hynix has partnered with NVIDIA—the GH200 Grace Hopper platform, unveiled last summer, is outfitted with the South Korean firm's HBM3e parts. In a similar timeframe, Samsung was named AMD's preferred supplier of HBM3 packages, as featured in the recently launched Instinct MI300X accelerator. NVIDIA's HBM3E deal with SK Hynix is believed to extend to the internal makeup of Blackwell GB100 data-center GPUs. The HBM4 memory standard is expected to be the next major battleground for the industry's hardest hitters.

AI Datacenters Warming Up to Instinct CDNA Causes AMD Stock to Hit Near Record High

With NVIDIA's Ampere and Hopper GPUs dominating the AI acceleration industry, compute companies are turning to AMD's Instinct CDNA series accelerators in search of alternatives—and it seems they've found one. This has financial market analysts excited, driving the AMD stock to near-record highs. AMD recently launched the Instinct MI300X and MI300A processors based on the CDNA 3 architecture, which the company claims beat NVIDIA's H100 "Hopper" processors at competitive prices. That claim has encouraged analysts from major financial institutions, including Barclays, KeyBanc Capital, and Susquehanna Financial Group, to raise their price targets for the stock. As of market close on January 17, the AMD stock stood at $160.17, near its November 2021 record high of $164.46.

AMD's data center business looks to ramp up Instinct CDNA accelerators through 2024. These large chiplet-based GPUs are built on the same 5 nm TSMC foundry node as NVIDIA's H100 "Hopper," and it's been reported that AMD might even forego large gaming GPUs based on its Radeon RX RDNA4 architecture to free up foundry allocation for high-margin CDNA 3 chips. The Instinct MI300X features a colossal 304 compute units totaling 19,456 stream processors capable of AI-relevant math formats, plus 192 GB of 8192-bit HBM3 memory with 5.2 TB/s of memory bandwidth on tap.
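
The shader count follows from CDNA 3's 64 stream processors per compute unit, and the quoted bandwidth implies the per-pin HBM3 speed (the 64 SPs/CU figure is a known CDNA 3 parameter; the pin speed is derived here, not quoted):

```python
# MI300X shader count: CDNA 3 packs 64 stream processors per compute unit.
compute_units = 304
sp_per_cu = 64
total_sps = compute_units * sp_per_cu
print(total_sps)  # 19456

# Per-pin HBM3 speed implied by 5.2 TB/s over an 8192-bit interface.
bus_bits = 8192
pin_gbps = 5.2e3 * 8 / bus_bits
print(f"Implied pin speed: {pin_gbps:.2f} Gb/s")
```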

GIGABYTE Unveils Next-gen HPC & AI Servers with AMD Instinct MI300 Series Accelerators

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers and IT infrastructure, today announced the GIGABYTE G383-R80 for the AMD Instinct MI300A APU and two GIGABYTE G593 series servers for the AMD Instinct MI300X GPU and AMD EPYC 9004 Series processors. As a testament to the performance of the AMD Instinct MI300 Series, the El Capitan supercomputer at Lawrence Livermore National Laboratory uses the MI300A APU to power exascale computing—and these new GIGABYTE servers are an ideal platform to propel discoveries in HPC & AI at exascale.

Marrying a CPU & GPU: G383-R80
For major advancements in HPC there is the GIGABYTE G383-R80, which houses four LGA6096 sockets for MI300A APUs. Each chip integrates a CPU with twenty-four AMD Zen 4 cores alongside a powerful GPU built from AMD CDNA 3 compute units, and the chiplet design shares 128 GB of unified HBM3 memory for impressive performance on large AI models. The G383 server offers ample expansion for networking, storage, or other accelerators, with a total of twelve PCIe Gen 5 slots, while the front of the chassis holds eight 2.5" Gen 5 NVMe bays to handle heavy workloads such as real-time big data analytics and latency-sensitive applications in finance and telecom.

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.