News Posts matching #HBM


TSMC Announces Breakthrough Set to Redefine the Future of 3D IC

TSMC today announced the new 3Dblox 2.0 open standard and major achievements of its Open Innovation Platform (OIP) 3DFabric Alliance at the TSMC 2023 OIP Ecosystem Forum. The 3Dblox 2.0 features early 3D IC design capability that aims to significantly boost design efficiency, while the 3DFabric Alliance continues to drive memory, substrate, testing, manufacturing, and packaging integration. TSMC continues to push the envelope of 3D IC innovation, making its comprehensive 3D silicon stacking and advanced packaging technologies more accessible to every customer.

"As the industry shifted toward embracing 3D IC and system-level innovation, the need for industry-wide collaboration has become even more essential than it was when we launched OIP 15 years ago," said Dr. L.C. Lu, TSMC fellow and vice president of Design and Technology Platform. "As our sustained collaboration with OIP ecosystem partners continues to flourish, we're enabling customers to harness TSMC's leading process and 3DFabric technologies to reach an entirely new level of performance and power efficiency for the next-generation artificial intelligence (AI), high-performance computing (HPC), and mobile applications."

Synopsys and TSMC Streamline Multi-Die System Complexity with Unified Exploration-to-Signoff Platform and Proven UCIe IP on TSMC N3E Process

Synopsys, Inc. today announced it is extending its collaboration with TSMC to advance multi-die system designs with a comprehensive solution supporting the latest 3Dblox 2.0 standard and TSMC's 3DFabric technologies. The Synopsys Multi-Die System solution includes 3DIC Compiler, a unified exploration-to-signoff platform that delivers the highest levels of design efficiency for capacity and performance. In addition, Synopsys has achieved first-pass silicon success of its Universal Chiplet Interconnect Express (UCIe) IP on TSMC's leading N3E process for seamless die-to-die connectivity.

"TSMC has been working closely with Synopsys to deliver differentiated solutions that address designers' most complex challenges from early architecture to manufacturing," said Dan Kochpatcharin, head of the Design Infrastructure Management Division at TSMC. "Our long history of collaboration with Synopsys benefits our mutual customers with optimized solutions for performance and power efficiency to help them address multi-die system design requirements for high-performance computing, data center, and automotive applications."

Q2 DRAM Industry Revenue Rebounds with a 20.4% Quarterly Increase, Q3 Operating Profit Margin Expected to Turn from Loss to Gains

TrendForce reports that rising demand for AI servers has driven growth in HBM shipments. Combined with the wave of inventory buildup for DDR5 on the client side, the second quarter saw all three major DRAM suppliers experience shipment growth. Q2 revenue for the DRAM industry reached approximately US$11.43 billion, marking a 20.4% QoQ increase and halting a decline that persisted for three consecutive quarters. Among suppliers, SK hynix saw a significant quarterly growth of over 35% in shipments. The company's shipments of DDR5 and HBM, both of which have higher ASP, increased significantly. As a result, SK hynix's ASP grew counter-cyclically by 7-9%, driving its Q2 revenue to increase by nearly 50%. With revenue reaching US$3.44 billion, SK hynix claimed the second spot in the industry, leading growth in the sector.
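
As a rough cross-check of how the reported shipment and ASP figures compound into revenue growth, here is a minimal sketch; the specific sample points are our own assumptions for illustration, not TrendForce's model:

```python
# Rough sanity check: revenue growth ~= shipment growth x ASP growth.
# The ranges below come from the reported figures (>35% shipment growth,
# 7-9% ASP growth for SK hynix); treating them as exact is an approximation.
shipment_growth = 0.35      # quarterly shipment growth (lower bound)
asp_growth_low, asp_growth_high = 0.07, 0.09

for asp in (asp_growth_low, asp_growth_high):
    revenue_growth = (1 + shipment_growth) * (1 + asp) - 1
    print(f"ASP +{asp:.0%} -> implied revenue growth ~{revenue_growth:.1%}")

# With shipments up somewhat more than 35%, the implied quarterly revenue
# growth lands in the mid-to-high 40% range, consistent with the
# "nearly 50%" increase reported for SK hynix.
```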

Samsung, with its DDR5 process still at 1Ynm and limited shipments in the second quarter, experienced a drop in its ASP of around 7-9%. However, benefitting from inventory buildup by module houses and increased demand for AI server setups, Samsung saw a slight increase in shipments. This led to an 8.6% QoQ increase in Q2 revenue, reaching US$4.53 billion, securing the top position. Micron, ranking third, was a bit late in HBM development, but DDR5 shipments accounted for a significant proportion of its mix, keeping its ASP relatively stable. Boosted by shipments, its revenue came in at around US$2.95 billion, a quarterly increase of 15.7%. Both Samsung and Micron nonetheless saw a reduction in their market share.

Strong Cloud AI Server Demand Propels NVIDIA's FY2Q24 Data Center Business to Surpass 76% for the First Time

NVIDIA's latest financial report for FY2Q24 reveals that its data center business reached US$10.32 billion—a QoQ growth of 141% and YoY increase of 171%. The company remains optimistic about its future growth. TrendForce believes that the primary driver behind NVIDIA's robust revenue growth stems from its data center's AI server-related solutions. Key products include AI-accelerated GPUs and AI server HGX reference architecture, which serve as the foundational AI infrastructure for large data centers.

TrendForce further anticipates that NVIDIA will integrate its software and hardware resources. Utilizing a refined approach, NVIDIA will align its high-end, mid-tier, and entry-level GPU AI accelerator chips with various ODMs and OEMs, establishing a collaborative system certification model. Beyond accelerating the deployment of CSP cloud AI server infrastructures, NVIDIA is also partnering with entities like VMware on solutions including the Private AI Foundation. This strategy extends NVIDIA's reach into the edge enterprise AI server market, underpinning steady growth in its data center business for the next two years.

Suppliers Amp Up Production, HBM Bit Supply Projected to Soar by 105% in 2024

TrendForce highlights in its latest report that memory suppliers are boosting their production capacity in response to escalating orders from NVIDIA and CSPs for their in-house designed chips. These efforts include the expansion of TSV production lines to increase HBM output. Forecasts based on current production plans from suppliers indicate a remarkable 105% annual increase in HBM bit supply by 2024. However, due to the time required for TSV expansion, which encompasses equipment delivery and testing (9 to 12 months), the majority of HBM capacity is expected to materialize by 2Q24.

TrendForce analysis indicates that 2023 to 2024 will be pivotal years for AI development, triggering substantial demand for AI Training chips and thereby boosting HBM utilization. However, as the focus pivots to Inference, the annual growth rate for AI Training chips and HBM is expected to taper off slightly. The imminent boom in HBM production has presented suppliers with a difficult situation: they will need to strike a balance between meeting customer demand to expand market share and avoiding a surplus due to overproduction. Another concern is the potential risk of overbooking, as buyers, anticipating an HBM shortage, might inflate their demand.

AMD & Xilinx Introduce the Versal HBM Series VHK158 Evaluation Kit

Introducing the Versal HBM Series VHK158 Evaluation Kit. This features the Versal HBM series VH1582 device, which integrates multi-Tbps High Bandwidth Memory (HBM), hardened connectivity IP, and adaptive compute in a single device, eliminating the bottlenecks between memory, I/O, and compute while delivering up to 6 times more memory bandwidth.

The VHK158 evaluation kit is an evaluation platform for the Versal HBM series VH1582 device, designed to keep up with the higher memory needs of compute-intensive, memory-bound applications and providing adaptable acceleration for data center, wired networking, test & measurement, and aerospace & defense applications. The VHK158 board's primary focus is to enable demonstration and evaluation of the VH1582 silicon and to support customer application development.

Samsung Electronics Announces Second Quarter 2023 Results

Samsung Electronics today reported financial results for the second quarter ended June 30, 2023. The Company posted KRW 60.01 trillion in consolidated revenue, a 6% decline from the previous quarter, mainly due to a decline in smartphone shipments despite a slight recovery in revenue of the DS (Device Solutions) Division. Operating profit rose sequentially to KRW 0.67 trillion as the DS Division posted a narrower loss, while Samsung Display Corporation (SDC) and the Digital Appliances Business saw improved profitability.

The Memory Business saw results improve from the previous quarter as its focus on High Bandwidth Memory (HBM) and DDR5 products in anticipation of robust demand for AI applications led to higher-than-guided DRAM shipments. System semiconductors posted a decline in profit due to lower utilization rates on weak demand from major applications.

Micron Delivers Industry's Fastest, Highest-Capacity HBM to Advance Generative AI Innovation

Micron Technology, Inc. today announced it has begun sampling the industry's first 8-high 24 GB HBM3 Gen2 memory with bandwidth greater than 1.2 TB/s and pin speed over 9.2 Gb/s, which is up to a 50% improvement over currently shipping HBM3 solutions. With a 2.5 times performance per watt improvement over previous generations, Micron's HBM3 Gen2 offering sets new records for the critical artificial intelligence (AI) data center metrics of performance, capacity and power efficiency. These Micron improvements reduce training times of large language models like GPT-4 and beyond, deliver efficient infrastructure use for AI inference and provide superior total cost of ownership (TCO).

The foundation of Micron's high-bandwidth memory (HBM) solution is Micron's industry-leading 1β (1-beta) DRAM process node, which allows a 24 Gb DRAM die to be assembled into an 8-high cube within an industry-standard package dimension. Moreover, Micron's 12-high stack with 36 GB capacity will begin sampling in the first quarter of calendar 2024. Micron provides 50% more capacity for a given stack height compared to existing competitive solutions. Micron's HBM3 Gen2 performance-to-power ratio and pin speed improvements are critical for managing the extreme power demands of today's AI data centers. The improved power efficiency is possible because of Micron advancements such as a doubling of the through-silicon vias (TSVs) over competitive HBM3 offerings, thermal impedance reduction through a fivefold increase in metal density, and an energy-efficient data path design.
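
As a back-of-the-envelope illustration of where the quoted bandwidth figure comes from, the sketch below multiplies pin speed by the standard 1024-bit HBM interface width; the pin-speed sample points are assumptions for illustration, not Micron-published bins:

```python
# Back-of-the-envelope: per-stack HBM bandwidth = pin speed x interface width.
# A standard HBM3 stack has a 1024-bit interface; pin speeds below bracket
# Micron's ">9.2 Gb/s" figure.
interface_width_bits = 1024
for pin_speed_gbps in (9.2, 9.4, 9.6):   # Gb/s per pin (assumed sample points)
    bandwidth_gbs = pin_speed_gbps * interface_width_bits / 8  # GB/s per stack
    print(f"{pin_speed_gbps} Gb/s/pin -> {bandwidth_gbs / 1000:.2f} TB/s per stack")

# At a little over 9.2 Gb/s per pin, a 1024-bit stack crosses the
# ~1.2 TB/s mark quoted for HBM3 Gen2.
```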

BBCube 3D Could be the Future of Stacked DRAM

Scientists at the Tokyo Institute of Technology have developed a new type of stacked or 3D DRAM that the researchers call Bumpless Build Cube 3D, or BBCube 3D, which relies on through-silicon vias (TSVs) to connect the DRAM dies. This is a different approach from HBM, which relies on micro bumps to connect the layers together. The Japanese scientists say that their bumpless wafer-on-wafer solution should allow not only for an easier manufacturing process but, more importantly, for improved cooling, as the TSVs can channel heat from the DRAM dies down into whatever substrate the BBCube 3D stack is finally mounted onto.

If that wasn't enough, the researchers believe that BBCube 3D will be able to deliver higher speeds than HBM, courtesy of a combination of the relatively short TSVs and "high-density signal parallelism". BBCube 3D is expected to deliver up to a 32-fold increase in bandwidth compared to DDR5 memory and a fourfold increase compared to HBM2E memory, while at the same time drawing less power. The research paper goes into a lot more detail for those interested in taking a closer look at this potentially revolutionary shift in DRAM assembly. The question that remains unanswered is whether this will end up as a real-world product some time in the near future, which largely depends on how manufacturable BBCube 3D memory turns out to be.
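
To put the claimed multipliers in perspective, here is a rough scale check using assumed baseline bandwidths for a single DDR5-4800 channel and a single HBM2E stack; the paper's own baselines may differ, so treat this as order-of-magnitude only:

```python
# Rough scale check on the claimed multipliers (32x DDR5, 4x HBM2E).
# Baselines below are OUR assumptions for common configurations:
# one 64-bit DDR5-4800 channel and one 1024-bit HBM2E stack at 3.2 Gb/s/pin.
ddr5_channel_gbs = 4800 * 64 / 8 / 1000   # ~38.4 GB/s per channel
hbm2e_stack_gbs = 3.2 * 1024 / 8          # ~409.6 GB/s per stack

print(f"32 x DDR5 channel ~ {32 * ddr5_channel_gbs:.0f} GB/s")
print(f" 4 x HBM2E stack  ~ {4 * hbm2e_stack_gbs:.0f} GB/s")
# Both multipliers land in TB/s-class territory, which is the scale the
# BBCube 3D researchers are targeting.
```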

Two-ExaFLOP El Capitan Supercomputer Starts Installation Process with AMD Instinct MI300A

When Lawrence Livermore National Laboratory (LLNL) announced the creation of a two-ExaFLOP supercomputer named El Capitan, we heard that AMD would power it with its Instinct MI300 accelerator. Today, LLNL published a Tweet that states, "We've begun receiving & installing components for El Capitan, @NNSANews' first #exascale #supercomputer. While we're still a ways from deploying it for national security purposes in 2024, it's exciting to see years of work becoming reality." As the published images show, HPE racks filled with AMD Instinct MI300 accelerators are now showing up at LLNL's facility, and the supercomputer is expected to go operational in 2024. This could mean that the November 2023 TOP500 list update won't feature El Capitan, as system enablement would be very hard to achieve in the four months until then.

The El Capitan supercomputer is expected to run on the AMD Instinct MI300A accelerator, which features 24 Zen 4 cores, the CDNA3 architecture, and 128 GB of HBM3 memory. Four of these accelerators are paired together inside each HPE node, which also gets water-cooling treatment. While we don't have many further details on the memory and storage of El Capitan, we know that the system will exceed two ExaFLOPS at peak and will consume close to 40 MW of power.

DRAM ASP Decline Narrows to 0~5% for 3Q23 Owing to Production Cuts and Seasonal Demand

TrendForce reports that continued production cuts by DRAM suppliers have led to a gradual quarterly decrease in overall DRAM supply. Seasonal demand, on the other hand, is helping to mitigate inventory pressure on suppliers. TrendForce projects that the third quarter will see the ASP for DRAM converging towards a 0~5% decline. Despite suppliers' concerted efforts, inventory levels persistently remain high, keeping prices low. While production cutbacks may help to curtail quarterly price declines, a tangible recovery in prices may not be seen until 2024.

PC DRAM: The benefits of consolidated production cuts on DDR4 by the top three suppliers are expected to become evident in the third quarter. Furthermore, inventory pressure on suppliers has been partially alleviated due to aggressive purchasing by several OEMs at low prices during 2Q23. Evaluating average price trends for PC DRAM products in 3Q23 reveals that DDR4 will continue to remain in a state of persistent oversupply, leading to an expected quarterly price drop of 3~8%. DDR5 prices, influenced by suppliers' efforts to maintain prices and by unmet buyer demand, are projected to see a 0~5% quarterly decline. The overall ASP of PC DRAM is projected to experience a QoQ decline of 0~5% in the third quarter.

AI and HPC Demand Set to Boost HBM Volume by Almost 60% in 2023

High Bandwidth Memory (HBM) is emerging as the preferred solution for overcoming memory transfer speed restrictions due to the bandwidth limitations of DDR SDRAM in high-speed computation. HBM is recognized for its revolutionary transmission efficiency and plays a pivotal role in allowing core computational components to operate at their maximum capacity. Top-tier AI server GPUs have set a new industry standard by primarily using HBM. TrendForce forecasts that global demand for HBM will experience almost 60% growth annually in 2023, reaching 290 million GB, with a further 30% growth in 2024.
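
Working backward and forward from TrendForce's 2023 figure gives a feel for the scale involved; the sketch below treats the quoted percentages as exact, which is an approximation:

```python
# Working back and forward from TrendForce's 2023 figure
# (~290 million GB, ~60% YoY growth in 2023, +30% expected in 2024).
demand_2023_mgb = 290           # million GB (reported)
growth_2023 = 0.60              # "almost 60%" (approximation)
growth_2024 = 0.30

implied_2022 = demand_2023_mgb / (1 + growth_2023)
projected_2024 = demand_2023_mgb * (1 + growth_2024)

print(f"Implied 2022 demand  : ~{implied_2022:.0f} million GB")
print(f"Projected 2024 demand: ~{projected_2024:.0f} million GB")
```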

TrendForce forecasts that by 2025, taking into account five large-scale AIGC products equivalent to ChatGPT, 25 mid-size AIGC products from Midjourney, and 80 small AIGC products, the minimum computing resources required globally could range from 145,600 to 233,700 NVIDIA A100 GPUs. Emerging technologies such as supercomputers, 8K video streaming, and AR/VR, among others, are expected to simultaneously increase the workload on cloud computing systems due to escalating demands for high-speed computing.

Samsung Electronics Unveils Foundry Vision in the AI Era

Samsung Electronics, a world leader in advanced semiconductor technology, today announced its latest foundry technology innovations and business strategy at the 7th annual Samsung Foundry Forum (SFF) 2023. Under the theme "Innovation Beyond Boundaries," this year's forum delved into Samsung Foundry's mission to address customer needs in the artificial intelligence (AI) era through advanced semiconductor technology.

Over 700 guests from Samsung Foundry's customers and partners attended this year's event, and 38 companies hosted their own booths to share the latest technology trends in the foundry industry.

NVIDIA Allegedly Preparing H100 GPU with 94 and 64 GB Memory

NVIDIA's compute and AI-oriented H100 GPU is supposedly getting an upgrade. The H100 GPU is NVIDIA's most powerful offering and comes in a few different flavors: H100 PCIe, H100 SXM, and H100 NVL (a duo of two GPUs). Currently, the H100 GPU comes with 80 GB of HBM2E in both the PCIe and SXM5 versions of the card. A notable exception is the H100 NVL, which comes with 188 GB of HBM3, but that is spread across two cards, making it 94 GB per card. However, we could see NVIDIA enable 94 and 64 GB options for the H100 accelerator soon, as the latest PCI ID Repository entries show.

According to the PCI ID Repository listing, two messages are posted: "Kindly help to add H100 SXM5 64 GB into 2337." and "Kindly help to add H100 SXM5 94 GB into 2339." These two messages indicate that NVIDIA could be preparing its H100 in more variations. In September 2022, we saw NVIDIA prepare an H100 variation with 120 GB of memory, but that still isn't official. These PCI IDs could also simply come from engineering samples that NVIDIA is testing in its labs, and such cards might never appear on any market, so we have to wait and see how it plays out.
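
For readers who want to check whether these device IDs have since landed in the public PCI ID database, a minimal sketch follows; the pci.ids path is a common Linux location and is an assumption, and the parsing only handles the standard vendor/device layout of that file:

```python
# Minimal sketch: check whether the device IDs mentioned in the PCI ID
# Repository requests (0x2337, 0x2339) appear in a locally installed
# pci.ids database. Older or newer snapshots may not list these IDs at all.
PCI_IDS_PATH = "/usr/share/hwdata/pci.ids"   # or /usr/share/misc/pci.ids
NVIDIA_VENDOR = "10de"
CANDIDATE_DEVICES = {"2337", "2339"}

def nvidia_devices(path):
    """Yield (device_id, name) pairs listed under the NVIDIA vendor entry."""
    in_nvidia = False
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            if not line.startswith("\t"):                     # vendor line
                in_nvidia = line.lower().startswith(NVIDIA_VENDOR)
            elif in_nvidia and not line.startswith("\t\t"):   # device line
                dev_id, _, name = line.strip().partition("  ")
                yield dev_id.lower(), name

if __name__ == "__main__":
    found = {d: n for d, n in nvidia_devices(PCI_IDS_PATH) if d in CANDIDATE_DEVICES}
    for dev in sorted(CANDIDATE_DEVICES):
        print(dev, "->", found.get(dev, "not listed in this pci.ids snapshot"))
```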

Major CSPs Aggressively Constructing AI Servers and Boosting Demand for AI Chips and HBM, Advanced Packaging Capacity Forecasted to Surge 30~40%

TrendForce reports that explosive growth in generative AI applications like chatbots has spurred significant expansion in AI server development in 2023. Major CSPs including Microsoft, Google, AWS, as well as Chinese enterprises like Baidu and ByteDance, have invested heavily in high-end AI servers to continuously train and optimize their AI models. This reliance on high-end AI servers necessitates the use of high-end AI chips, which in turn will not only drive up demand for HBM during 2023~2024, but is also expected to boost growth in advanced packaging capacity by 30~40% in 2024.

TrendForce highlights that to augment the computational efficiency of AI servers and enhance memory transmission bandwidth, leading AI chip makers such as Nvidia, AMD, and Intel have opted to incorporate HBM. Presently, Nvidia's A100 and H100 chips boast up to 80 GB of HBM2e and HBM3, respectively. In its latest integrated CPU and GPU, the Grace Hopper Superchip, Nvidia expanded a single chip's HBM capacity by 20%, hitting a mark of 96 GB. AMD's MI300 also uses HBM3, with the MI300A capacity remaining at 128 GB like its predecessor, while the more advanced MI300X has ramped up to 192 GB, marking a 50% increase. Google is expected to broaden its partnership with Broadcom in late 2023 to produce the ASIC AI accelerator chip TPU, which will also incorporate HBM memory, in order to extend its AI infrastructure.
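
The quoted percentage increases follow directly from the capacity figures; a quick arithmetic check, using only the numbers cited above, is shown below:

```python
# Quick arithmetic check on the capacity deltas quoted above.
capacities_gb = {
    "H100 (HBM3)": 80,
    "Grace Hopper (HBM3)": 96,
    "MI300A (HBM3)": 128,
    "MI300X (HBM3)": 192,
}

def delta(new, old):
    """Relative capacity increase of `new` over `old`."""
    return capacities_gb[new] / capacities_gb[old] - 1

print(f"Grace Hopper vs H100: +{delta('Grace Hopper (HBM3)', 'H100 (HBM3)'):.0%}")
print(f"MI300X vs MI300A    : +{delta('MI300X (HBM3)', 'MI300A (HBM3)'):.0%}")
# Matches the ~20% and ~50% increases cited in the text.
```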

Intel Falcon Shores is Initially a GPU, Gaudi Accelerators to Disappear

During the ISC High Performance 2023 international conference, Intel announced interesting updates to its high-performance computing (HPC) and artificial intelligence (AI) roadmap. With the scrapping of Rialto Bridge and Lancaster Sound, Intel merged these accelerator lines into the Falcon Shores processor for HPC and AI, initially planned as a CPU+GPU solution on a single package. However, during the ISC 2023 talk, the company announced a change of plans, and now Falcon Shores is a GPU-only solution destined for a 2025 launch. Originally, Intel wanted to combine x86-64 cores with Xe GPU cores to form an "XPU" module that powers HPC and AI workloads. However, Intel did not see a point in forcing customers to choose between specific CPU-to-GPU core ratios that would be baked into an XPU accelerator. Instead, a regular GPU solution paired with a separate CPU is Intel's choice for now. In the future, as workloads get more defined, XPU solutions are still a possibility, just delayed from what was originally intended.

Regarding Intel's Gaudi accelerators, the story is about to end. The company originally paid two billion US dollars for Habana Labs and its Gaudi hardware. However, Intel now plans to stop developing Gaudi as a standalone accelerator and instead use the IP to integrate it into its Falcon Shores GPU. Using a modular, tile-based architecture, the Falcon Shores GPU features standard Ethernet switching, up to 288 GB of HBM3 running at 9.8 TB/s throughput, I/O optimized for scaling, and support for the FP8 and FP16 floating-point precision needed for AI and other workloads. As noted, the creation of an XPU was premature, and now the initial Falcon Shores GPU will become an accelerator for HPC, AI, and a mix of both, depending on the specific application. You can see the roadmap below for more information.

PMIC Issue with Server DDR5 RDIMMs Reported, Convergence of DDR5 Server DRAM Price Decline

TrendForce reports that mass production of new server platforms—such as Intel Sapphire Rapids and AMD Genoa—is imminent. However, recent market reports have indicated a PMIC compatibility issue for server DDR5 RDIMMs; DRAM suppliers and PMIC vendors are working to address the problem. TrendForce believes this will have two effects: First, DRAM suppliers will temporarily procure more PMICs from Monolithic Power Systems (MPS), which supplies PMICs without any issues. Second, supply will inevitably be affected in the short term as current DDR5 server DRAM production still uses older processes, which will lead to a convergence in the price decline of DDR5 server DRAM in 2Q23—from the previously estimated 15~20% to 13~18%.

As previously mentioned, PMIC issues and a production process relying on older nodes are both having a short-term impact on the supply of DDR5 server DRAM. SK hynix has gradually ramped up production and sales of its 1α-nm process, which, unlike 1y-nm, has yet to be fully verified by customers. Current production is still dominated by Samsung and SK hynix's 1y-nm and Micron's 1z-nm; 1α- and 1β-nm production is projected to increase in 2H23.

HBM Supply Leader SK Hynix's Market Share to Exceed 50% in 2023 Due to Demand for AI Servers

A strong growth in AI server shipments has driven demand for high bandwidth memory (HBM). TrendForce reports that the top three HBM suppliers in 2022 were SK hynix, Samsung, and Micron, with 50%, 40%, and 10% market share, respectively. Furthermore, the specifications of high-end AI GPUs designed for deep learning have led to HBM product iteration. To prepare for the launch of NVIDIA H100 and AMD MI300 in 2H23, all three major suppliers are planning for the mass production of HBM3 products. At present, SK hynix is the only supplier that mass produces HBM3 products, and as a result, is projected to increase its market share to 53% as more customers adopt HBM3. Samsung and Micron are expected to start mass production sometime towards the end of this year or early 2024, with HBM market shares of 38% and 9%, respectively.

AI server shipment volume expected to increase by 15.4% in 2023
NVIDIA's DL/ML AI servers are equipped with an average of four or eight high-end graphics cards and two mainstream x86 server CPUs. These servers are primarily used by top US cloud service providers such as Google, AWS, Meta, and Microsoft. TrendForce analysis indicates that the shipment volume of servers with high-end GPGPUs increased by around 9% in 2022, with approximately 80% of these shipments concentrated among eight major cloud service providers in China and the US. Looking ahead to 2023, Microsoft, Meta, Baidu, and ByteDance will launch generative AI products and services, further boosting AI server shipments. It is estimated that the shipment volume of AI servers will increase by 15.4% this year, and a 12.2% CAGR for AI server shipments is projected from 2023 to 2027.
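
Compounding the projected growth rates into a simple shipment index illustrates the trajectory; the 2022 = 100 normalization is our own and not a TrendForce figure:

```python
# Compounding the projected growth rates into a rough shipment index
# (2022 = 100; the index base is our own normalization).
index_2022 = 100.0
index_2023 = index_2022 * 1.154          # +15.4% projected for 2023
cagr = 0.122                             # projected 2023-2027 CAGR

for year in range(2023, 2028):
    level = index_2023 * (1 + cagr) ** (year - 2023)
    print(f"{year}: {level:.1f}")
# The index roughly 1.8x's between 2022 and 2027 under these assumptions.
```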

AMD and JEDEC Create DDR5 MRDIMMs with 17,600 MT/s Speeds

AMD and JEDEC are collaborating to create a new industry standard for DDR5 memory called MRDIMMs (multi-ranked buffered DIMMs). The constant need for more bandwidth in server systems poses a challenge that cannot easily be solved. Adding more memory is difficult, as motherboards can only get so big, and incorporating on-package memory solutions like HBM is expensive and can only scale to a certain memory capacity. However, JEDEC engineers, with the help of AMD, have developed a new standard that aims to solve this challenge with MRDIMM technology. The concept of the MRDIMM is, on paper, straightforward: it combines two DDR5 DIMMs on a single module to effectively double the bandwidth. Specifically, if you take two DDR5 DIMMs running at 4,400 MT/s and combine them into a single DIMM, you get 8,800 MT/s on a single module. To use this efficiently, a special data mux or buffer effectively takes two Double Data Rate (DDR) DIMMs and converts them into a Quad Data Rate (QDR) DIMM.

The design also allows simultaneous access to both ranks of memory, thanks to the added mux. First-generation MRDIMMs can reach speeds of up to 8,800 MT/s, while second- and third-generation modules can go to 12,800 MT/s and 17,600 MT/s, respectively. Third-generation MRDIMMs aren't expected until after 2030, so the project still has a long road ahead. Additionally, Intel has a similar solution called Multiplexer Combined Ranks DIMM (MCRDIMM), which uses a similar approach. However, Intel's technology is expected to see the light of day as early as 2024/2025, with the Granite Rapids generation of servers a likely contender for this technology. SK Hynix already makes MCRDIMMs, and you can see a demonstration of the approach below.
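
The generation-to-generation speeds quoted above follow the simple doubling at the heart of the MRDIMM concept; a minimal sketch of that arithmetic, with assumed per-rank base rates, is below:

```python
# The MRDIMM idea in numbers: two ranks running at a base DDR5 rate are
# multiplexed so the host-facing interface runs at twice that rate.
def mrdimm_rate(base_rate_mts):
    """Effective host-side transfer rate when two ranks are muxed together."""
    return 2 * base_rate_mts

for base in (4400, 6400, 8800):          # assumed per-rank base rates
    print(f"2 x DDR5-{base} -> {mrdimm_rate(base)} MT/s effective")
# 2 x 4400 gives the 8,800 MT/s first-generation figure; the 12,800 and
# 17,600 MT/s numbers for later generations imply 6,400 and 8,800 MT/s ranks.
```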

Intel Xeon Granite Rapids and Sierra Forest to Feature up to 500 Watt TDP and 12-Channel Memory

Today, thanks to Yuuki_Ans on the Chinese Bilibili forum, we have more information about the upcoming "Avenue City" platform that powers Granite Rapids and Sierra Forest. Intel's forthcoming Granite Rapids and Sierra Forest Xeon processors will split the Xeon family into two offerings: one optimized for per-core performance and equipped with P-cores, the other optimized for power efficiency and equipped with E-cores. The reference platform Intel designs and shares with OEMs internally is a 16.7" x 20" board with 20 PCB layers, made as a dual-socket solution. Featuring two massive LGA-7529 sockets, the reference design shows the basic layout for a server powered by these new Xeons.

Capable of powering Granite Rapids / Sierra Forest-AP processors of up to 500 Watts, the platform also accommodates next-generation I/O. It features 24 DDR5 DIMM slots with support for 12-channel memory at speeds of up to 6400 MT/s. The PCIe selection includes six PCIe Gen 5 x16 links supporting the CXL cache-coherent protocol, plus 6x24 UPI links. Additionally, we have another piece of information that Granite Rapids will come with up to 128 cores and 256 threads in both regular and HBM-powered Xeon Max flavors. You can see storage and reference platform configuration details on the slides below.
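
For a sense of what 12 channels of DDR5-6400 mean in practice, here is a rough peak-bandwidth calculation; it assumes the standard 64-bit data width per channel and ignores ECC overhead:

```python
# Theoretical peak DRAM bandwidth per socket for the reference platform:
# 12 DDR5 channels x 64-bit data width per channel x 6400 MT/s.
channels = 12
bus_width_bits = 64
transfer_rate_mts = 6400

bandwidth_gbs = channels * bus_width_bits / 8 * transfer_rate_mts / 1000
print(f"Peak DRAM bandwidth per socket: ~{bandwidth_gbs:.1f} GB/s")
# ~614 GB/s per socket, before counting any HBM on the Xeon Max variants.
```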

Shipments of AI Servers Will Climb at CAGR of 10.8% from 2022 to 2026

According to TrendForce's latest survey of the server market, many cloud service providers (CSPs) have begun large-scale investments in the kinds of equipment that support artificial intelligence (AI) technologies. This development is in response to the emergence of new applications such as self-driving cars, artificial intelligence of things (AIoT), and edge computing since 2018. TrendForce estimates that in 2022, AI servers that are equipped with general-purpose GPUs (GPGPUs) accounted for almost 1% of annual global server shipments. Moving into 2023, shipments of AI servers are projected to grow by 8% YoY thanks to chatbots and similar applications generating demand across AI-related fields. Furthermore, shipments of AI servers are forecasted to increase at a CAGR of 10.8% from 2022 to 2026.

AMD Envisions Stacked DRAM on top of Compute Chiplets in the Near Future

AMD in its ISSCC 2023 presentation detailed how it has advanced data-center energy efficiency and managed to keep up with Moore's Law, even as semiconductor foundry node advances have tapered off. Perhaps its most striking prediction for server processors and HPC accelerators is multi-layer stacked DRAM. The company has, for some time now, made logic products such as GPUs with stacked HBM. These have been multi-chip modules (MCMs), in which the logic die and HBM stacks sit on top of a silicon interposer. While this conserves PCB real estate compared to discrete memory chips and modules, it is inefficient in its use of substrate area, and the interposer is essentially a silicon die with microscopic wiring between the chips stacked on top of it.

AMD envisions that the high-density server processor of the near future will have many layers of DRAM stacked on top of logic chips. Such a method of stacking conserves both PCB and substrate real estate, allowing chip designers to cram even more cores and memory per socket. The company also sees a greater role for in-memory compute, where simple compute and data-movement functions can be executed directly on the memory, saving round-trips to the processor. Lastly, the company talked about the possibility of an on-package optical PHY, which would simplify network infrastructure.

Giga Computing Announces Its GIGABYTE Server Portfolio for the 4th Gen Intel Xeon Scalable Processor

Giga Computing, an industry leader in high-performance servers and workstations, today announced the next generation of GIGABYTE servers and server motherboards for the new 4th Gen Intel Xeon Scalable processors, designed to achieve efficient performance gains with built-in accelerators. The new processors have the most built-in accelerators of any processor on the market to help maximize performance efficiency for emerging workloads, and do so while boosting virtualization and AI performance. Generational improvements make this platform ideal for AI, cloud computing, advanced analytics, HPC, networking, and storage applications. For these markets, Giga Computing has announced fourteen new series that constitute seventy-eight configurations for customers to choose from. All of these new GIGABYTE products support the full portfolio of 4th Gen Intel Xeon Scalable processors, including those with high bandwidth memory (HBM) in the Intel Xeon Max Series.

AMD Shows Instinct MI300 Exascale APU with 146 Billion Transistors

During its CES 2023 keynote, AMD announced its latest Instinct MI300 APU, a first of its kind in the data center world. Combining the CPU, GPU, and memory elements into a single package eliminates the latency imposed by the long distances data must travel from CPU to memory and from CPU to GPU across the PCIe connector. In addition to solving some latency issues, less power is needed to move the data, providing greater efficiency. The Instinct MI300 features 24 Zen 4 cores with simultaneous multithreading enabled, CDNA3 GPU IP, and 128 GB of HBM3 memory on a single package. The memory bus is 8192 bits wide, providing unified memory access for CPU and GPU cores. CXL 3.0 is also supported, making cache-coherent interconnects a reality.
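
Assuming standard 1024-bit HBM3 stacks, the 8192-bit bus and 128 GB capacity imply the breakdown sketched below; the per-stack figures are our inference rather than an AMD disclosure:

```python
# What the 8192-bit figure implies, assuming standard 1024-bit HBM3 stacks.
total_bus_bits = 8192
total_capacity_gb = 128
bits_per_stack = 1024            # standard HBM3 stack interface width

stacks = total_bus_bits // bits_per_stack
print(f"Implied HBM3 stacks: {stacks}")
print(f"Capacity per stack : {total_capacity_gb / stacks:.0f} GB")
# Eight stacks of 16 GB each would account for the 128 GB total.
```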

The Instinct MI300 APU package is an engineering marvel of its own, built with advanced chiplet techniques. AMD managed to do 3D stacking, with nine 5 nm logic chiplets stacked on top of four 6 nm chiplets, surrounded by HBM. All of this brings the transistor count up to 146 billion, reflecting the sheer complexity of such a design. For performance figures, AMD provided a comparison to the Instinct MI250X GPU. In raw AI performance, the MI300 features an 8x improvement over the MI250X, while the performance-per-watt gain is a smaller 5x increase. While we do not know which benchmark applications were used, it is likely that standard benchmarks such as MLPerf were involved. For availability, AMD is targeting the end of 2023, when the "El Capitan" exascale supercomputer will arrive using these Instinct MI300 APU accelerators. Pricing is unknown and will be unveiled to enterprise customers first, around launch.

AMD Explains the Economics Behind Chiplets for GPUs

AMD, in its technical presentation for the new Radeon RX 7900 series "Navi 31" GPU, gave us an elaborate explanation on why it had to take the chiplets route for high-end GPUs, devices that are far more complex than CPUs. The company also enlightened us on what sets chiplet-based packages apart from classic multi-chip modules (MCMs). An MCM is a package that consists of multiple independent devices sharing a fiberglass substrate.

An example of an MCM would be a mobile Intel Core processor, in which the CPU die and the PCH die share a substrate. Here, the CPU and the PCH are independent pieces of silicon that can otherwise exist on their own packages (as they do on the desktop platform), but have been paired together on a single substrate to minimize PCB footprint, which is precious on a mobile platform. A chiplet-based device is one where a substrate is made up of multiple dies that cannot otherwise independently exist on their own packages without an impact on inter-die bandwidth or latency. They are essentially what should have been components on a monolithic die, but disintegrated into separate dies built on different semiconductor foundry nodes, with a purely cost-driven motive.