Monday, December 9th 2024
Intel 18A Yields Are Actually Okay, And The Math Checks Out
A few days ago, we published a report about Intel's 18A yields supposedly sitting at an abysmal 10%. This sparked quite a lot of discussion in the tech community, as well as responses from industry analysts and Intel's now-ex-CEO Pat Gelsinger. Today, we are diving into what is publicly known about Intel's 18A node and estimating what the yields of possible products could be, using tools such as the Die Yield Calculator from SemiAnalysis. First, we know that the defect density of the 18A node is 0.4 defects per cm². This figure dates from August, and the current number could be much lower, since semiconductor nodes tend to keep improving even after they are production-ready. To estimate yields, manufacturers feed figures like this 0.4 defect density into various yield models. Defect density, expressed in defects per square centimeter (def/cm²), measures manufacturing process quality by quantifying the average number of defects present in each unit area of a semiconductor wafer.
Measuring yields is a complex task. Manufacturers design smaller chips for mobile and bigger chips for HPC tasks, and the two will have different yields: bigger chips require more silicon area and are therefore more prone to defects. Smaller mobile chips occupy less area, so a given number of wafer defects still leaves more usable chips than wasted silicon. Stating that a node yields only x% usable chips is therefore only one side of the story if the size of the test chip is not known. For example, NVIDIA's H100 die measures 814 mm², a size that pushes modern manufacturing to its limits. A modern photomask, the pattern mask used to print a chip design onto the silicon wafer, covers only 858 mm² (26x33 mm); that is the hard limit before a design exceeds the reticle and needs to be split up. At that size, nodes yield far fewer usable chips than something like a 100 mm² mobile chip, where defects don't wreak havoc on the yield curve.

Next, there is the actual design of the chip. Each chip carries its own signature pattern to be etched onto the silicon wafer, and each design has its own problems and defect rates: one specific placement of wires and transistors can accumulate far more defects than another. Even when a chip is designed and ready, it sometimes gets small redesigns to improve final yields. Remember that customers like NVIDIA pay TSMC the same per wafer regardless of yield, so improving the design to raise yields saves NVIDIA silicon that would otherwise go to waste or trickle down to non-flagship SKUs.
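To make the die-size effect concrete, here is a quick sketch of our own (not the article's calculator) using the standard Poisson yield model, Y = exp(-A·D), comparing an H100-sized HPC die against a small mobile die at the reported 18A defect density:

```python
import math

def poisson_yield(area_mm2: float, d0_per_cm2: float) -> float:
    """Poisson yield model: Y = exp(-A * D), with area converted to cm^2."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

D0 = 0.4  # def/cm^2, Intel's reported 18A figure from August
print(f"814 mm^2 HPC die:    {poisson_yield(814, D0):.1%}")  # ≈ 3.9%
print(f"100 mm^2 mobile die: {poisson_yield(100, D0):.1%}")  # ≈ 67%
```

The same defect density that cripples a reticle-sized die still leaves roughly two-thirds of small mobile dies intact, which is why a bare "x% yield" number means little without the die area behind it.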
To calculate yields, manufacturers use various yield models: mathematical equations that help fabs understand and predict yield loss by translating defect density into yield predictions. While several models exist (including the Murphy, Exponential, and Seeds models), the fundamental Poisson yield model, Y = e^(-A·D), assumes randomly distributed point defects, where Y is yield, A is chip area, and D is defect density. It is derived from the Poisson probability distribution and calculates the likelihood of zero defects landing on a chip. However, the Poisson model is often considered pessimistic because real-world defects tend to cluster rather than distribute randomly, so actual yields typically come in higher than it predicts. Manufacturers pick a model by comparing the predictions of several against their actual fab data and selecting the one that best fits their specific process and conditions.

Using the older Intel 18A figure of D0 = 0.4, we must check a few different designs and compare their yields before drawing any conclusions. Today we are measuring yields with the SemiAnalysis Die Yield Calculator, which integrates all the common yield models: Murphy, Exponential, Seeds, and Poisson. At the EUV reticle size limit of 858 mm² and with the 18A node's 0.4 defect density applied, the most pessimistic estimate comes from the Poisson model at 3.23%, while the most optimistic (Seeds) model yields 22.56%. The default Murphy model yields only 7.95% usable chips, which works out to five good dies on a 300 mm wafer with 59 lost to defects. A while back, we covered a leak of "Panther Lake," Intel's next-generation Core Ultra 300 series CPUs. The leaked package thankfully included die sizes, which we can feed into the calculator, assuming Intel is manufacturing the Panther Lake compute tiles on the 18A node with the aforementioned D0 of 0.4.
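The reticle-limit numbers above can be reproduced in a few lines of Python. The three closed-form models below (Poisson, Murphy, Seeds) are standard textbook formulas; the function names are ours, and the printed values match the calculator's output for A = 858 mm², D0 = 0.4:

```python
import math

def poisson(area_mm2: float, d0: float) -> float:
    """Poisson model: Y = exp(-A*D), area converted from mm^2 to cm^2."""
    ad = (area_mm2 / 100.0) * d0
    return math.exp(-ad)

def murphy(area_mm2: float, d0: float) -> float:
    """Murphy's model: Y = ((1 - exp(-A*D)) / (A*D))^2."""
    ad = (area_mm2 / 100.0) * d0
    return ((1.0 - math.exp(-ad)) / ad) ** 2

def seeds(area_mm2: float, d0: float) -> float:
    """Seeds model: Y = 1 / (1 + A*D)."""
    ad = (area_mm2 / 100.0) * d0
    return 1.0 / (1.0 + ad)

A, D0 = 858.0, 0.4  # reticle-limit die, August 18A defect density
for name, model in [("Poisson", poisson), ("Murphy", murphy), ("Seeds", seeds)]:
    print(f"{name:>7}: {model(A, D0):.2%}")
# Poisson ≈ 3.23%, Murphy ≈ 7.95%, Seeds ≈ 22.56%, matching the article's figures
```

The spread between 3.23% and 22.56% for the same inputs illustrates why a single leaked yield percentage, with no model or die size attached, tells us very little on its own.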
Die number four, with the CPU and NPU on it, measures 8.004x14.288 mm (a 114.304 mm² silicon die) and gives a yield of 64.4% on the default Murphy model. Moore's model is the most pessimistic, with only about 50% usable dies. Die number five of Panther Lake, housing the Xe3 GPU, is even smaller at only 53.6 mm², yielding an impressive 81% usable dies; even under the most pessimistic assumptions, the yield only drops to about 60%.

If we assume that Intel has refined its 18A node further since August, we can conclude that even some larger designs approaching the EUV reticle limit of 858 mm² could hit the 50% yield mark. For reference, even if the industry leader TSMC achieves a 0.1 defect density, yields of 858 mm² chips only hover around the 50% mark across the available models. That is for a fully functioning silicon die, of course; dies that are not fully functional are later repurposed for lower-end SKUs. With modern Foveros and EMIB packaging, Intel could also use smaller dies and interconnect them on a larger interposer to act as a single uniform chip, saving costs and boosting yields. However, this is only part of the silicon manufacturing story, as manufacturers use other techniques to save costs as well.
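The Panther Lake tile yields, and the yield advantage of splitting one big die into chiplets, can be checked with Murphy's model directly. This is our own sketch (the 4x214.5 mm² chiplet split is a hypothetical illustration, not a leaked Intel configuration):

```python
import math

def murphy_yield(area_mm2: float, d0_per_cm2: float) -> float:
    """Murphy's yield model: Y = ((1 - exp(-A*D)) / (A*D))^2, area in cm^2."""
    ad = (area_mm2 / 100.0) * d0_per_cm2
    return ((1.0 - math.exp(-ad)) / ad) ** 2

D0 = 0.4  # August's reported 18A defect density
# Leaked Panther Lake tile sizes, assumed to be on 18A
print(f"CPU+NPU tile, 114.304 mm^2: {murphy_yield(114.304, D0):.1%}")  # ≈ 64.4%
print(f"Xe3 GPU tile, 53.6 mm^2:    {murphy_yield(53.6, D0):.1%}")     # ≈ 81.0%

# Hypothetical chiplet split: one reticle-limit die vs. four quarter-size dies
print(f"Monolithic 858 mm^2 die:    {murphy_yield(858.0, D0):.1%}")    # ≈ 8.0%
print(f"Each 214.5 mm^2 chiplet:    {murphy_yield(214.5, D0):.1%}")    # ≈ 45.1%
```

Per-die yield jumps from roughly 8% to roughly 45% just by quartering the die, which is the economic logic behind Foveros and EMIB: stitch small, high-yielding dies together instead of betting a whole reticle on zero defects.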
Broadcom's initially reported disappointment with the Intel 18A node predates PDK 1.0, which we assume is now in customers' hands so they can optimize their designs for version 1.0. Additionally, Intel should have enabled a few more optimizations since the initial defect density was reported in August. Indeed, the revamp of Intel Foundry has been difficult, and the ousting of CEO Pat Gelsinger may have been a premature move. The semiconductor manufacturing supply chain takes years to fix; it isn't exactly a kids' toy supply chain but rather the world's most advanced and complex industry. We assume the Intel 18A node is fully functional and that external customers will pick up Intel's foundry business significantly. Of course, in the beginning, Intel will be its own biggest customer until more fabless designers start rolling in.
38 Comments on Intel 18A Yields Are Actually Okay, And The Math Checks Out
I am not at all opposed to mocking Intel. And that post was old material; Intel has produced plenty of meme material since then (anyone watching GN would have a couple of ideas).
timesofindia.indiatimes.com/technology/tech-news/fired-intel-ceo-will-fast-this-week-for-100000-intel-employees-as-they-/articleshow/116126521.cms
diit-cz.translate.goog/clanek/vyteznost-v-procentech-nebo-denzita-defektu?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=cs&_x_tr_pto=wapp
Clock speeds, leak voltages and other parameters are usually handled and phrased separately from yield. The 10% yield number matches very well with defect rate Intel has publicly stated, if the die size is very large. Broadcom does have enormous dies for the chips like the one the news bit said was being evaluated. Lacking other details this is the easiest conclusion to come to.
Lunar Lake - same as Arrow Lake - came now, at a time when, by Intel's own claim, 18A is only just becoming ready. Given the delay between starting production and launching a product, neither 20A nor 18A was possible for either of those. The question for Lunar Lake and Arrow Lake might be what is going on with Intel 3, but that one most likely has a very simple answer - Xeons.
But if the yields were fine Intel would be using 18A instead of TSMC for Arrow Lake.
Remember that in August alongside news they did show Panther Lake with at least compute die supposedly produced on 18A. If I remember correctly Panther Lake is on roadmap for 2025H2. Production is only starting and it is quite a few quarters to go before we see a launched product.
Stakeholders don't give a **** about booting. They don't even know what the phrase "to boot" means.
All they want to know is information on their investments, meaning how many chips can be produced at what cost and what will they sell for.
Everything is about money, after all. Delivering on technology is not the end goal. The end goal is being financially successful.
With the pile of shit Intel is dealing with right now, I think their press release department would put out anything just to mitigate this desperate situation.
The huge layoffs and the firing of Gelsinger in the middle of the process are nothing but proof that Intel is facing serious trouble right now.
There's a saying: A drowning man will clutch at a straw.