News Posts matching #Chiplet

Return to Keyword Browsing

Tenstorrent Selects Samsung Foundry to Manufacture Next-Generation AI Chiplet

Tenstorrent, a company that sells AI processors and licenses AI and RISC-V IP, announced today that it selected Samsung Foundry to bring Tenstorrent's next generation of AI chiplets to market. Tenstorrent builds powerful RISC-V CPU and AI acceleration chiplets, aiming to push the boundaries of compute in multiple industries such as data center, automotive and robotics. These chiplets are designed to deliver scalable power from milliwatts to megawatts, catering to a wide range of applications from edge devices to data centers.

To ensure the highest quality and cutting-edge manufacturing capabilities for its chiplet, Tenstorrent has selected Samsung's Foundry Design Service team, known for their expertise in silicon manufacturing. The chiplets will be manufactured using Samsung's state-of-the-art SF4X process, which boasts an impressive 4 nm architecture.

Synopsys and TSMC Streamline Multi-Die System Complexity with Unified Exploration-to-Signoff Platform and Proven UCIe IP on TSMC N3E Process

Synopsys, Inc. today announced it is extending its collaboration with TSMC to advance multi-die system designs with a comprehensive solution supporting the latest 3Dblox 2.0 standard and TSMC's 3DFabric technologies. The Synopsys Multi-Die System solution includes 3DIC Compiler, a unified exploration-to-signoff platform that delivers the highest levels of design efficiency for capacity and performance. In addition, Synopsys has achieved first-pass silicon success of its Universal Chiplet Interconnect Express (UCIe) IP on TSMC's leading N3E process for seamless die-to-die connectivity.

"TSMC has been working closely with Synopsys to deliver differentiated solutions that address designers' most complex challenges from early architecture to manufacturing," said Dan Kochpatcharin, head of the Design Infrastructure Management Division at TSMC. "Our long history of collaboration with Synopsys benefits our mutual customers with optimized solutions for performance and power efficiency to help them address multi-die system design requirements for high-performance computing, data center, and automotive applications."

ITRI Leads Global Semiconductor Collaboration for Heterogeneous Integration to Pioneer Pilot Production Solutions

The introduction of Generative AI (GAI) has significantly increased the demand for advanced semiconductor chips, drawing increased attention to the development of complex calculations for large-scale AI models and high-speed transmission interfaces. To assist the industry in grasping the key to high-end semiconductor manufacturing and integration capabilities, the Heterogeneous Integrated Chiplet System Package (Hi-CHIP) Alliance brings together leading semiconductor companies from Taiwan and around the world to provide comprehensive services, spanning from packaging design, testing and verification, to pilot production. Since its establishment in 2021, the alliance has accumulated important industry players as its members, including EVG, Kulicke and Soffa (K&S), USI, Raytek Semiconductor, Unimicron, DuPont, and Brewer Science. Looking forward, the alliance is set to actively explore its global market potential.

Dr. Shih-Chieh Chang, General Director of Electronic and Optoelectronic System Research Laboratories at ITRI and Chairman of the Hi-CHIP Alliance, indicated that advanced manufacturing processes have led to a considerable increase in IC design cycles and costs. Multi-dimensional chip design and heterogeneous integrated packaging architecture are key tools to tackle this demand in semiconductors. On top of that, the advent of GAI such as ChatGPT, which demands substantial computing power and transmission speed, requires even higher levels of integration capacity in chip manufacturing. ITRI has been committed to developing manufacturing technologies and upgrading materials and equipment to enhance heterogeneous integration technologies. Achievements include the fan-out wafer level packaging (FOWLP), 2.5 and 3D chips, embedded interposer connections (EIC), and programmable packages. With both local and foreign semiconductor manufacturer members, the Hi-CHIP Alliance is establishing an advanced packaging process production line to provide an integrated one-stop service platform.

NVIDIA Blackwell GB100 Die Could Use MCM Packaging

NVIDIA's upcoming Blackwell GPU architecture, expected to succeed the current Ada Lovelace architecture, is gearing up to make some significant changes. While we don't have any microarchitectural leaks, rumors are circulating that Blackwell will have different packaging and die structures. One of the most intriguing aspects of the upcoming Blackwell is the mention of a Multi-Chip Module (MCM) design for the GB100 data-center GPU. This advanced packaging approach allows different GPU components to exist on separate dies, providing NVIDIA with more flexibility in chip customization. This could mean that NVIDIA can more easily tailor its chips to meet the specific needs of various consumer and enterprise applications, potentially gaining a competitive edge against rivals like AMD.

While Blackwell's release is still a few years away, these early tidbits paint a picture of an architecture that isn't just an incremental improvement but could represent a more significant shift in how NVIDIA designs its GPUs. NVIDIA's potential competitor is AMD's upcoming MI300 GPU, which utilized chiplets in its designs. Chiplets also provide ease of integration as smaller dies provide better wafer yields, meaning that it makes more sense to switch to smaller dies and utilize chiplets economically.

QuickLogic & YorChip Collaborate on Development of Low-Power, Low-Cost UCIe FPGA Chiplets

QuickLogic Corporation, a developer of embedded FPGA (eFPGA) IP, ruggedized FPGAs and Endpoint AI/ML solutions, and YorChip, a pioneering startup specializing in UCIe-compatible IP, have formed a strategic partnership to revolutionize the world of FPGA chiplets. The collaboration will result in a groundbreaking lineup of FPGA chiplets optimized for low power consumption and low cost, opening new possibilities for a wide range of applications, including the fast-growing edge IoT and AI/ML markets.

According to Yole Group, a market research company, by 2023, they expect chiplet adoption will lead to a TAM of chiplet-based integrated circuits in excess of $200B, across the consumer, automotive defense, aerospace, industrial, and medical markets. Since discrete FPGAs are already prevalent in those same markets, wide adoption of eFPGA-based UCIe (Unified Chiplet Interconnect Express) enabled chiplets is expected, and QuickLogic and YorChip are well-positioned to capitalize on this growth opportunity.

AMD "Navi 4C" GPU Detailed: Shader Engines are their own Chiplets

"Navi 4C" is a future high-end GPU from AMD that will likely not see the light of day, as the company is pivoting away from the high-end GPU segment with its next RDNA4 generation. For AMD to continue investing in the development of this GPU, the gaming graphics card segment should have posted better sales, especially in the high-end, which it didn't. Moore's Law is Dead scored details of what could have been a fascinating technological endeavor for AMD, in building a highly disaggregated GPU.

AMD's current "Navi 31" GPU sees a disaggregation of the main logic components of the GPU that benefit from the latest 5 nm foundry node to be located in a central Graphics Compute Die; surrounded by up to six little chiplets built on the older 6 nm foundry node, which contain segments of the GPU's Infinity Cache memory, and its memory interface—hence the name memory cache die. With "Navi 4C," AMD had intended to further disaggregate the GPU, identifying even more components on the GCD that can be spun out into chiplets; as well as breaking up the shader engines themselves into smaller self-contained chiplets (smaller dies == greater yields and lower foundry costs).

AMD Navi 32 RDNA 3 GPU Spotted in Forbes Video

Forbes published its video interview with AMD CEO and President Lisa Su at the end of May, but it has taken two weeks for hardware news sites to realize that unreleased silicon was in plain view within the spotlight piece. Folks likely regarded it as a simple puff piece due to the title reading "This CEO Made AMD Billions - Now She Wants To Dominate The Market With AI." Hoang Anh Phu, a Vietnamese technology enthusiast, managed to pay close attention to a curious segment in the Forbes video and uploaded AI-upscaled screengrabs to Twitter along with the comment/question: "Navi 32 die shot(?!)."

RDNA3 Navi 31 and Navi 33 GPU products have already reached the retail market—AMD's high-end (chiplet design) Radeon RX 7900-series is based on the former and it launched last December. The latter arrived in the (monolithic N33 XL) form of Radeon RX 7600 cards at the end of May 2023. Even board partners are seemingly becoming impatient about a lack of new offerings in the mid-range—Sapphire is very likely to release another previous gen Radeon RX 6750 XT custom card this week in China. Team Red has not publicly acknowledged that Navi 32 is a work in-progress, so it is slightly odd that an example sat next to EPYC Genoa, Raphael, and Raphael X3D dies on a table—as spotted in the Forbes feature. Screenshots show an Infinity Cache setup with four memory stacks on a previously unseen die. Leaks have indicated that Navi 32 will be a chiplet design with a GCD (200 mm²) in the middle, surrounded by the four MCDs (37.5 mm²). The full package area size is eyeball estimated to occupy around 350 mm² of space, which corroborates info uncovered in the past.

Intel Arrow Lake-HX Interposer Appears Online

The Intel Design tools webpage has this week once again provided an early preview of upcoming processors - following on from an LGA1851-MTL-S CPU interposer appearing on the site late last month - indicating that a Meteor Lake-S desktop CPU range was due at some point later in 2023. Intel's latest webpage entry features the "BGA2114-ARL-HX Interposer for the Gen 5 VR Test Tool" with an SKU code that reads: "Q6B2114ARLHX."

The BGA 2114 design points to a mobile processor platform, and industry analysts are fairly certain that Intel is preparing next generation high-end laptop CPUs in the form of its rumored Arrow Lake-HX lineup. This range is set to succeed the 13th generation Core-HX Raptor Lake family of mobile processors. The new BGA package looks to be slightly larger than the closest predecessor, possibly accommodating Intel's new "disaggregated" tile-based (tile is their term for chiplet) internal layout.

AMD's Dr. Lisa Su Thinks That Moore's Law is Still Relevant - Innovation Will Keep Legacy Going

Barron's Magazine has been on a technology industry kick this week and published their interview with AMD CEO Dr. Lisa Su on May 3. The interviewer asks Su about her views on Moore's Law and it becomes apparent that she remains a believer of Gordon Moore's (more than half-century old) prediction - Moore, an Intel co-founder passed away in late March. Su explains that her company's engineers will need to innovate in order to carry on with that legacy: "I would certainly say I don't think Moore's Law is dead. I think Moore's Law has slowed down. We have to do different things to continue to get that performance and that energy efficiency. We've done chiplets - that's been one big step. We've now done 3-D packaging. We think there are a number of other innovations, as well." Expertise in other areas is also key in hitting technological goals: "Software and algorithms are also quite important. I think you need all of these pieces for us to continue this performance trajectory that we've all been on."

When asked about the challenges involved in advancing CPU designs within limitations, Su responds with: "Yes. The transistor costs and the amount of improvement you're getting from density and overall energy reduction is less from each generation. But we're still moving (forward) generation to generation. We're doing plenty of work in 3 nanometer today, and we're looking beyond that to 2 nm as well. But we'll continue to use chiplets and these type of constructions to try to get around some of the Moore's Law challenges." AMD and Intel continue to hold firm with Moore's Law, even though slightly younger upstarts disagree (see NVIDIA). Dr. Lisa Su's latest thoughts stay consistent with her colleague's past statements - AMD CTO Mark Papermaster reckoned that the theory is pertinent for another six to eight years, although it could be a costly endeavor for AMD - the company believes that it cannot double transistor density every 18 to 24 months without incurring extra expenses.

Intel "Emerald Rapids" Doubles Down on On-die Caches, Divests on Chiplets

Finding itself embattled with AMD's EPYC "Genoa" processors, Intel is giving its 4th Gen Xeon Scalable "Sapphire Rapids" processor a rather quick succession in the form of the Xeon Scalable "Emerald Rapids," bound for Q4-2023 (about 8-10 months in). The new processor shares the same LGA4677 platform and infrastructure, and much of the same I/O, but brings about two key design changes that should help Intel shore up per-core performance, making it competitive to EPYC "Zen 4" processors with higher core-counts. SemiAnalysis compiled a nice overview of the changes, the two broadest points of it being—1. Intel is peddling back on the chiplet approach to high core-count CPUs, and 2., that it wants to give the memory sub-system and inter-core performance a massive performance boost using larger on-die caches.

The "Emerald Rapids" processor has just two large dies in its extreme core-count (XCC) avatar, compared to "Sapphire Rapids," which can have up to four of these. There are just three EMIB dies interconnecting these two, compared to "Sapphire Rapids," which needs as many as 10 of these to ensure direct paths among the four dies. The CPU core count itself doesn't see a notable increase. Each of the two dies on "Emerald Rapids" physically features 33 CPU cores, so a total of 66 are physically present, although one core per die is left unused for harvesting, the SemiAnalysis article notes. So the maximum core-count possible commercially is 32 cores per die, or 64 cores per socket. "Emerald Rapids" continues to be based on the Intel 7 process (10 nm Enhanced SuperFin), probably with a few architectural improvements for higher clock-speeds.

Synopsys, TSMC and Ansys Strengthen Ecosystem Collaboration to Advance Multi-Die Systems

Accelerating the integration of heterogeneous dies to enable the next level of system scalability and functionality, Synopsys, Inc. (Nasdaq: SNPS) has strengthened its collaboration with TSMC and Ansys for multi-die system design and manufacturing. Synopsys provides the industry's most comprehensive EDA and IP solutions for multi-die systems on TSMC's advanced 7 nm, 5 nm and 3 nm process technologies with support for TSMC 3DFabric technologies and 3Dblox standard. The integration of Synopsys implementation and signoff solutions and Ansys multi-physics analysis technology on TSMC processes allows designers to tackle the biggest challenges of multi-die systems, from early exploration to architecture design with signoff power, signal and thermal integrity analysis.

"Multi-die systems provide a way forward to achieve reduced power and area and higher performance, opening the door to a new era of innovation at the system-level," said Dan Kochpatcharin, head of Design Infrastructure Management Division at TSMC. "Our long-standing collaboration with Open Innovation Platform (OIP) ecosystem partners like Synopsys and Ansys gives mutual customers a faster path to multi-die system success through a full spectrum of best-in-class EDA and IP solutions optimized for our most advanced technologies."

Winbond Joins UCIe Consortium to Support High-performance Chiplet Interface Standardisation

Winbond has joined the UCIe (Universal Chiplet Interconnect Express) Consortium, the industry Consortium dedicated to advancing UCIe technology. This open industry standard defines interconnect between chiplets within a package, enabling an open chiplet ecosystem and facilitating the development of advanced 2.5D/3D devices.

A leader in high-performance memory ICs, Winbond is an established supplier of known good die (KGD) needed to assure end-of-line yield in 2.5D/3D assembly. 2.5D/3D multichip devices are needed to realize the exponential improvements in performance, power efficiency, and miniaturization, demanded by the explosion of technologies such as 5G, Automotive, and Artificial Intelligence (AI).

Open Compute Project Foundation and JEDEC Announce a New Collaboration

Today, the Open Compute Project Foundation (OCP), the nonprofit organization bringing hyperscale innovations to all, and JEDEC Solid State Technology Association, the global leader in the development of standards for the microelectronics industry, announce a new collaboration to establish a framework for the transfer of technology captured in an OCP-approved specification to JEDEC for inclusion in one of its standards. This alliance brings together members from both the OCP and JEDEC communities to share efforts in developing and maintaining global standards needed to advance the electronics industry.

Under this new alliance, the current effort will be to provide a mechanism to standardize Chiplet part descriptions leveraging OCP Chiplet Data Extensible Markup Language (CDXML) specification to become part of JEDEC JEP30: Part Model Guidelines for use with today's EDA tools. With this updated JEDEC standard, expected to be published in 2023, Chiplet builders will be able to provide electronically a standardized Chiplet part description to their customers paving the way for automating System in Package (SiP) design and build using Chiplets. The description will include information needed by SiP builders such as Chiplet thermal properties, physical and mechanical requirements, behavior specifications, power and signal integrity properties, testing the Chiplet in package, and security parameters.

Chinese Loongson Processor Uses Chiplet Design to Pack 32 Cores

Chinese processor designers need help creating a leading-edge design that satisfies their needs, with the imposed sanctions and restrictions of Western countries. However, designers are using creative ways to make a server processor to fulfill their needs. According to the latest Sina report, Chinese company Loongson has developed a 32-core processor using chiplet technology. Previously, the company announced its 16-core 3C5000 processor based on LA464 cores, which utilize LoongArch ISA. Loongson used chiplet technology to fuse two 3C5000 processors into a single-socket solution called 3D5000, which features 32 LA464 cores to create a higher-performing design. Based on the LGA-4129 package, the chip size is 75.4x58.5×6.5 mm.

The company claims that the typical power consumption is rated for 130 Watts at 2.0 GHz or 170 Watts at 2.2 GHz, with TDP power consumption not exceeding 300 Watts at 2.2 GHz even with peaks. The performance of the new 3D5000 processor, measured using SPEC2006, is 400 points and 800 points for single-socket and dual-socket servers, respectively. The four-socket server is expected to reach 1600 points in the same benchmark, so scaling is advertised as linear. Loongson hopes to provide samples to industry partners in the first half of 2023 with an unknown price tag.

Ventana Introduces Veyron, World's First Data Center Class RISC-V CPU Product Family

Ventana Micro Systems Inc. today announced its Veyron family of high performance RISC-V processors. The Veyron V1 is the first member of the family, and the highest performance RISC-V processor available today. It will be offered in the form of high performance chiplets and IP. Ventana Founder and CEO Balaji Baktha will make the public announcement during his RISC-V Summit keynote today.

The Veyron V1 is the first RISC-V processor to provide single thread performance that is competitive with the latest incumbent processors for Data Center, Automotive, 5G, AI, and Client applications. The Veyron V1 efficient microarchitecture also enables the highest single socket performance among competing architectures.

AMD Still Believes in Moore's Law, Unlike NVIDIA

Back in September, NVIDIA's Jensen Huang said that Moore's Law is dead, but it seems like AMD disagrees with NVIDIA, at least for now. According to an interview with AMD's CTO, Mark Papermaster, AMD still believes that Moore's Law will be alive for another six to eight years. However, AMD no longer believes that transistor density can be doubled every 18 to 24 months, while remaining in the same cost envelope. "I can see exciting new transistor technology for the next - as far as you can really plot these things out - about six to eight years, and it's very, very clear to me the advances that we're going to make to keep improving the transistor technology, but they're more expensive," Papermaster said.

AMD believes we'll see a change in how chips are being designed and put together, with chiplets being the future of semiconductors. Papermaster calls this "a Moore's Law equivalent, meaning that you continue to really double that capability every 18 to 24 months" although it's not exactly Moore's Law in the traditional sense. AMD also appears to be betting heavily on FPGA technology in some of its market segments, for something the company calls adaptive computing. As to how things will play out, time will tell, but with both AMD and Intel going down the chiplet route, albeit in slightly different ways, we should continue to see new innovations from both companies, with or without Moore's Law.

AMD Explains the Economics Behind Chiplets for GPUs

AMD, in its technical presentation for the new Radeon RX 7900 series "Navi 31" GPU, gave us an elaborate explanation on why it had to take the chiplets route for high-end GPUs, devices that are far more complex than CPUs. The company also enlightened us on what sets chiplet-based packages apart from classic multi-chip modules (MCMs). An MCM is a package that consists of multiple independent devices sharing a fiberglass substrate.

An example of an MCM would be a mobile Intel Core processor, in which the CPU die and the PCH die share a substrate. Here, the CPU and the PCH are independent pieces of silicon that can otherwise exist on their own packages (as they do on the desktop platform), but have been paired together on a single substrate to minimize PCB footprint, which is precious on a mobile platform. A chiplet-based device is one where a substrate is made up of multiple dies that cannot otherwise independently exist on their own packages without an impact on inter-die bandwidth or latency. They are essentially what should have been components on a monolithic die, but disintegrated into separate dies built on different semiconductor foundry nodes, with a purely cost-driven motive.

Eliyan Closes $40M Series A Funding Round and Unveils Industry's Highest Performance Chiplet Interconnect Technologies

Eliyan Corporation, credited for the invention of the semiconductor industry's highest-performance and most efficient chiplet interconnect, today announced two major milestones in the commercialization of its technology for multi-die chiplet integration: the close of its Series A $40M funding round, and the successful tapeout of its technology on an industry standard 5-nanometer (nm) process.

Eliyan's NuLink PHY and NuGear technologies address the critical need for a commercially viable approach to enabling high performance and cost-effectiveness in the connection of homogeneous and heterogenous architectures on a standard, organic chip substrate. It has proven to achieve similar bandwidth, power efficiency, and latency as die-to-die implementations using advanced packaging technologies, but without the other drawbacks of specialized approaches.

AMD Ready with Zen 4 3DV Cache Chiplet, Expects to Repeat 5800X3D Magic Versus Raptor Lake

AMD is allegedly ready with a working "Zen 4" chiplet that has stacked 3D Vertical Cache (3DV cache) memory, which supplements the on-die L3 cache, and is found to massively improve gaming performance. "Moore's Law is Dead" reports that the Zen 4 + 3DV Cache chiplet will be used with various Ryzen 7000X3D SKUs, as well as special EPYC "Genoa" SKUs.

The 3DV Cache deployed with the "Zen 4" chiplet is a second-generation to the one on the "Zen 3 + 3DV cache" chiplet, and AMD has worked on a number of bandwidth and latency improvements, so it performs in-sync with the generationally-faster on-die L3 cache of the "Zen 4" chiplet. Unlike the CCD below it that's built on TSMC N5 (5 nm EUV), the L3D (the stacked die with the 3DV cache) is possibly be built on an older node, such as N6 (6 nm), since it only contains a slab of memory and doesn't warrant N5. "Moore's Law is Dead" reports that AMD expects to repeat the magic of the 5800X3D when it comes to gaming performance, and expects Ryzen 7000X3D processors to dominate Intel's 13th Gen "Raptor Lake" processors. This was echoed by another reliable source, greymon55.

AMD RDNA3 Offers Over 50% Perf/Watt Uplift Akin to RDNA2 vs. RDNA; RDNA4 Announced

AMD in its 2022 Financial Analyst Day presentation claimed that it will repeat the over-50% generational performance/Watt uplift feat with the upcoming RDNA3 graphics architecture. This would be a repeat of the unexpected return to the high-end and enthusiast market-segments of AMD Radeon, thanks to the 50% performance/Watt uplift of the RDNA2 graphics architecture over RDNA. The company also broadly detailed the various new specifications of RDNA3 that make this possible.

To begin with, RDNA3 debuts on the TSMC N5 (5 nm) silicon fabrication node, and will debut a chiplet-based approach that's somewhat analogous to what AMD did with its 2nd Gen EPYC "Rome" and 3rd Gen Ryzen "Matisse" processors. Chiplets packed with the GPU's main number-crunching and 3D rendering machinery will make up chiplets, while the I/O components, such as memory controllers, display controllers, media engines, etc., will sit on a separate die. Scaling up the logic dies will result in a higher segment ASIC.

NVIDIA Opens NVLink for Custom Silicon Integration

Enabling a new generation of system-level integration in data centers, NVIDIA today announced NVIDIA NVLink -C2C, an ultra-fast chip-to-chip and die-to-die interconnect that will allow custom dies to coherently interconnect to the company's GPUs, CPUs, DPUs, NICs and SOCs. With advanced packaging, NVIDIA NVLink-C2C interconnect would deliver up to 25x more energy efficiency and be 90x more area-efficient than PCIe Gen 5 on NVIDIA chips and enable coherent interconnect bandwidth of 900 gigabytes per second or higher.

"Chiplets and heterogeneous computing are necessary to counter the slowing of Moore's law," said Ian Buck, vice president of Hyperscale Computing at NVIDIA. "We've used our world-class expertise in high-speed interconnects to build uniform, open technology that will help our GPUs, DPUs, NICs, CPUs and SoCs create a new class of integrated products built via chiplets."

Rumor: AMD Ryzen 7000 (Raphael) to Introduce Integrated GPU in Full Processor Lineup

The rumor mill keeps crushing away; in this case, regarding AMD's plans for their next-generation Zen designs. Various users have shared pieces of the same AMD roadmap, which apparently places AMD in an APU-focused landscape come their Ryzen 7000 series. we are currently on AMD's Ryzen 5000-series; Ryzen 6000 is supposed to materialize via a Zen 3+ design, with improved performance per watt obtained from improvements to its current Zen 3 family. However, Ryzen 7000-series is expected to debut on AMD's next-gen platform (let's call it AM5), which is also expected to introduce DDR5 support for AMD's mainstream computing platform. And now, the leaked, alleged roadmaps paint a Zen 4 + Navi 2 APU series in the works for AMD's Zen 4 debut with Raphael - roadmapped for manufacturing at the 5 nm process.

The inclusion of an iGPU chip with AMD's mainstream processors may signal a move by AMD to produce chiplets for all of its products, and then integrating them in the final product. You just have to think about it in the sense that AMD could "easily" pair one of the eight-core chiplets from the current Ryzen 5800X, for example, with an I/O die (which would likely still be fabricated with Global Foundries) an an additional Navi 2 GPU chiplet. It makes sense for AMD to start fabricating GPUs as chiplets as well - AMD's research on MCM (Multi-Chip Module) GPUs is pretty well-known at this point, and is a given for future development. It means that AMD needed only to develop one CPU chiplet and one GPU chiplet which they can then scale on-package by adding in more of the svelte pieces of silicon - something that Intel still doesn't do, and which results in the company's monolithic dies.

AMD Patents Chiplet-based GPU Design With Active Cache Bridge

AMD on April 1st published a new patent application that seems to show the way its chiplet GPU design is moving towards. Before you say it, it's a patent application; there's no possibility for an April Fool's joke on this sort of move. The new patent develops on AMD's previous one, which only featured a passive bridge connecting the different GPU chiplets and their processing resources. If you want to read a slightly deeper dive of sorts on what chiplets are and why they are important for the future of graphics (and computing in general), look to this article here on TPU.

The new design interprets the active bridge connecting the chiplets as a last-level cache - think of it as L3, a unifying highway of data that is readily exposed to all the chiplets (in this patent, a three-chiplet design). It's essentially AMD's RDNA 2 Infinity Cache, though it's not only used as a cache here (and for good effect, if the Infinity Cache design on RDNA 2 and its performance uplift is anything to go by); it also serves as an active interconnect between the GPU chiplets that allow for the exchange and synchronization of information, whenever and however required. This also allows for the registry and cache to be exposed as a unified block for developers, abstracting them from having to program towards a system with a tri-way cache design. There are also of course yield benefits to be taken here, as there are with AMD's Zen chiplet designs, and the ability to scale up performance without any monolithic designs that are heavy in power requirements. The integrated, active cache bridge would also certainly help in reducing latency and maintaining chiplet processing coherency.
AMD Chiplet Design Patent with Active Cache Hierarchy AMD Chiplet Design Patent with Active Cache Hierarchy AMD Chiplet Design Patent with Active Cache Hierarchy AMD Chiplet Design Patent with Active Cache Hierarchy

AMD Files Patent for Chiplet Machine Learning Accelerator to be Paired With GPU, Cache Chiplets

AMD has filed a patent whereby they describe a MLA (Machine Learning Accelerator) chiplet design that can then be paired with a GPU unit (such as RDNA 3) and a cache unit (likely a GPU-excised version of AMD's Infinity Cache design debuted with RDNA 2) to create what AMD is calling an "APD" (Accelerated Processing Device). The design would thus enable AMD to create a chiplet-based machine learning accelerator whose sole function would be to accelerate machine learning - specifically, matrix multiplication. This would enable capabilities not unlike those available through NVIDIA's Tensor cores.

This could give AMD a modular way to add machine-learning capabilities to several of their designs through the inclusion of such a chiplet, and might be AMD's way of achieving hardware acceleration of a DLSS-like feature. This would avoid the shortcomings associated with implementing it in the GPU package itself - an increase in overall die area, with thus increased cost and reduced yields, while at the same time enabling AMD to deploy it in other products other than GPU packages. The patent describes the possibility of different manufacturing technologies being employed in the chiplet-based design - harkening back to the I/O modules in Ryzen CPUs, manufactured via a 12 nm process, and not the 7 nm one used for the core chiplets. The patent also describes acceleration of cache-requests from the GPU die to the cache chiplet, and on-the-fly usage of it as actual cache, or as directly-addressable memory.

AMD is Allegedly Preparing Navi 31 GPU with Dual 80 CU Chiplet Design

AMD is about to enter the world of chiplets with its upcoming GPUs, just like it has been doing so with the Zen generation of processors. Having launched a Radeon RX 6000 series lineup based on Navi 21 and Navi 22, the company is seemingly not stopping there. To remain competitive, it needs to be in the constant process of innovation and development, which is reportedly true once again. According to the current rumors, AMD is working on an RDNA 3 GPU design based on chiplets. The chiplet design is supposed to feature two 80 Compute Unit (CU) dies, just like the ones found inside the Radeon RX 6900 XT graphics card.

Having two 80 CU dies would bring the total core number to exactly 10240 cores (two times 5120 cores on Navi 21 die). Combined with the RDNA 3 architecture, which brings better perf-per-watt compared to the last generation uArch, Navi 31 GPU is going to be a compute monster. It isn't exactly clear whatever we are supposed to get this graphics card, however, it may be coming at the end of this year or the beginning of the following year 2022.
Return to Keyword Browsing
Jul 5th, 2025 22:56 CDT change timezone

New Forum Posts

Popular Reviews

TPU on YouTube

Controversial News Posts