Tuesday, October 8th 2024

NVIDIA cuLitho Computational Lithography Platform is Moving to Production at TSMC

TSMC, the world leader in semiconductor manufacturing, is moving to production with NVIDIA's computational lithography platform, called cuLitho, to accelerate manufacturing and push the limits of physics for the next generation of advanced semiconductor chips. A critical step in the manufacture of computer chips, computational lithography is involved in the transfer of circuitry onto silicon. It requires complex computation - involving electromagnetic physics, photochemistry, computational geometry, iterative optimization and distributed computing. A typical foundry dedicates massive data centers for this computation, and yet this step has traditionally been a bottleneck in bringing new technology nodes and computer architectures to market.

Computational lithography is also the most compute-intensive workload in the entire semiconductor design and manufacturing process. It consumes tens of billions of hours per year on CPUs in the leading-edge foundries. A typical mask set for a chip can take 30 million or more hours of CPU compute time, necessitating large data centers within semiconductor foundries. With accelerated computing, 350 NVIDIA H100 Tensor Core GPU-based systems can now replace 40,000 CPU systems, accelerating production time, while reducing costs, space and power.
NVIDIA cuLitho brings accelerated computing to the field of computational lithography. Moving cuLitho to production is enabling TSMC to accelerate the development of next-generation chip technology, just as current production processes are nearing the limits of what physics makes possible.
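
As a rough check on the figures above, the sketch below works out the consolidation ratio and an idealized turnaround time for one mask set. It assumes, purely for illustration, that each of the 40,000 CPU systems delivers one CPU-hour per wall-clock hour and that the work scales perfectly. Neither assumption comes from NVIDIA or TSMC, and the function names are made up for this example.

```python
# Figures quoted in the press release.
CPU_SYSTEMS = 40_000
GPU_SYSTEMS = 350
MASK_SET_CPU_HOURS = 30_000_000


def consolidation_ratio(cpu_systems, gpu_systems):
    """How many CPU systems each GPU system stands in for."""
    return cpu_systems / gpu_systems


def wall_clock_days(cpu_hours, parallel_systems):
    """Idealized turnaround in days.

    ASSUMPTION: one CPU-hour per system per hour and perfect scaling.
    """
    return cpu_hours / parallel_systems / 24


ratio = consolidation_ratio(CPU_SYSTEMS, GPU_SYSTEMS)
days = wall_clock_days(MASK_SET_CPU_HOURS, CPU_SYSTEMS)

print(f"Each GPU system replaces about {ratio:.0f} CPU systems")
print(f"One mask set on the full CPU fleet: about {days:.1f} days")
```

Under those assumptions a single mask set would occupy the entire CPU fleet for about a month, which gives a sense of why this step is described as a bottleneck.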

"Our work with NVIDIA to integrate GPU-accelerated computing in the TSMC workflow has resulted in great leaps in performance, dramatic throughput improvement, shortened cycle time and reduced power requirements," said Dr. C.C. Wei, CEO of TSMC, at the GTC conference earlier this year.

NVIDIA has also developed algorithms to apply generative AI to enhance the value of the cuLitho platform. A new generative AI workflow has been shown to deliver an additional 2x speedup on top of the accelerated processes enabled through cuLitho.

The application of generative AI enables creation of a near-perfect inverse mask or inverse solution to account for diffraction of light involved in computational lithography. The final mask is then derived by traditional and physically rigorous methods, speeding up the overall optical proximity correction process by 2x.
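
The idea of starting a rigorous solver from a near-perfect guess can be illustrated with a toy model. The sketch below is not cuLitho's algorithm: the three-tap blur standing in for the optics, the fixed-point refinement loop, and the "AI guess" (built here by simply placing the start 99% of the way to the known answer) are all assumptions chosen to keep the example small.

```python
def blur(mask):
    """Toy optical model: circular 3-tap blur with weights 0.2, 0.6, 0.2."""
    n = len(mask)
    return [0.2 * mask[i - 1] + 0.6 * mask[i] + 0.2 * mask[(i + 1) % n]
            for i in range(n)]


def refine(target, start, tol=1e-6, max_iters=1000):
    """Add the print error back onto the mask until the error is below tol.

    Returns (mask, iterations_used).
    """
    mask = list(start)
    for iteration in range(max_iters):
        error = [t - p for t, p in zip(target, blur(mask))]
        if max(abs(e) for e in error) < tol:
            return mask, iteration
        mask = [m + e for m, e in zip(mask, error)]
    return mask, max_iters


# A hypothetical 1-D mask and the pattern it would print under the toy model.
true_mask = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0]
target = blur(true_mask)

# Cold start: use the desired pattern itself as the first mask.
cold_start = target
# Warm start: stand-in for a learned guess that is already close to the answer.
warm_start = [m + 0.01 * (c - m) for m, c in zip(true_mask, cold_start)]

cold_mask, cold_iters = refine(target, cold_start)
warm_mask, warm_iters = refine(target, warm_start)

print(f"cold start: {cold_iters} iterations, warm start: {warm_iters} iterations")
```

Both runs end at the same mask to within the tolerance; the warm start only reduces how many refinement passes are needed, which is the sense in which a good initial guess can speed up a physically rigorous final step.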

The use of optical proximity correction in semiconductor lithography is now three decades old. While the field has benefited from numerous contributions over this period, rarely has it seen a transformation quite as rapid as the one provided by the twin technologies of accelerated computing and AI. These together allow for the more accurate simulation of physics and the realization of mathematical techniques that were once prohibitively resource-intensive.

This enormous speedup of computational lithography accelerates the creation of every single mask in the fab, which speeds the total cycle time for developing a new technology node. More importantly, it makes possible new calculations that were previously impractical.

For example, while inverse lithography techniques have been described in the scientific literature for two decades, an accurate realization at full chip scale has been largely precluded because the computation takes too long. With cuLitho, that's no longer the case. Leading-edge foundries will use it to ramp up inverse and curvilinear solutions that will help create the next generation of powerful semiconductors.
Source: NVIDIA

9 Comments on NVIDIA cuLitho Computational Lithography Platform is Moving to Production at TSMC

#1
Wirko
Wow.

(That's my comment to the PR, not to the first comment.)
#2
chodaboy19
Intel and Samsung foundries could also benefit...
#3
Wirko
I don't understand why it needs such a huge amount of computation. A single photomask is a one-bit very high resolution image, and processing involves some special kind of sharpening and blurring, right? But the image contains a huge number of repeating patterns (such as SRAM cells), so it should be possible to reuse the results very many times.
#4
OSdevr
Wirko wrote: "About the technology itself, I don't understand why it needs such a huge amount of computation. A single photomask is a one-bit very high resolution image, and processing involves some special kind of sharpening and blurring, right? But the image contains a huge number of repeating patterns (such as SRAM cells), so it should be possible to reuse the results very many times."
The problem is that lithography today works with features well below the wavelength of the light used to expose the photoresist. As the post suggests, you normally need to perform full electromagnetic wave simulations to get usable lithography masks. What's on the masks is very different from what's in the GDSII files that are given to the fab. And while repeating patterns like SRAM can in principle help, chips contain more than that. You need to generate very, very high resolution masks, and with multiple patterning you may need to generate multiple masks for every layer (of which there are quite a few these days).
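
One way to see why repeated cells help less than expected: the correction for a cell depends on what sits next to it (the "proximity" in optical proximity correction), so a cached result can only be reused when the neighbourhood matches as well. The sketch below is a made-up illustration; the cell names, the one-cell halo, and the `unique_jobs` helper are all hypothetical.

```python
def unique_jobs(layout, halo):
    """Count distinct correction jobs when the cache key is a cell plus
    `halo` neighbours on each side (None marks the edge of the layout)."""
    padded = [None] * halo + list(layout) + [None] * halo
    keys = {tuple(padded[i:i + 2 * halo + 1]) for i in range(len(layout))}
    return len(keys)


# Hypothetical row of cells: mostly SRAM, with one logic cell in the middle.
layout = ["EDGE", "SRAM", "SRAM", "SRAM", "SRAM", "LOGIC", "SRAM", "SRAM", "EDGE"]

jobs_without_context = unique_jobs(layout, halo=0)  # key = the cell alone
jobs_with_context = unique_jobs(layout, halo=1)     # key = cell + both neighbours

print(jobs_without_context)  # 3
print(jobs_with_context)     # 8
```

Ignoring context, nine cells collapse to three jobs; once each key includes its immediate neighbours, only two of the nine cells share a result. Real layouts are two-dimensional and the optical interaction range covers many features, so this toy only hints at the effect.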
#5
Minus Infinity
Wirko wrote: "About the technology itself, I don't understand why it needs such a huge amount of computation. A single photomask is a one-bit very high resolution image, and processing involves some special kind of sharpening and blurring, right? But the image contains a huge number of repeating patterns (such as SRAM cells), so it should be possible to reuse the results very many times."
How do you design the mask in the first place? And then you have to write the mask, which requires ludicrously complex optics that only get more ludicrous for High-NA EUV.

Anandtech had a nice little article on cuLitho last year. Here's a short excerpt:

Modern process technologies push wafer fab equipment to its limits and often require finer resolution than is physically possible, which is where computational lithography comes into play. The primary purpose of computational lithography is to enhance the achievable resolution in photolithography processes without modifying the tools. To do so, CL employs algorithms that simulate the production process, incorporating crucial data from ASML's equipment and shuttle (test) wafers. These simulations aid in refining the reticle (photomask) by deliberately altering the patterns to counteract the physical and chemical influences that arise throughout the lithography and patterning steps.
#6
Dr. Dro
Neo_Morpheus wrote: "I recall they did something similar with Hairworks and it was proven it was done on purpose since it would bog down other gpus but not theirs for no good reason. So maybe something similar now?"
This is an advanced chipmaking tool, despite the funny name. Sabotaging this in any way sabotages only themselves.

Hairworks was never intentionally designed to cripple Radeon cards; instead, it was the Radeons that had (and to an extent still have, if we're talking DirectX 11) poor instancing performance.

TressFX was then designed to largely solve the problem, although if you recall the few games that use it (such as Tomb Raider 2013) still perform relatively poorly on hardware like Tahiti and Hawaii with TressFX on.
#7
Neo_Morpheus
Dr. Dro wrote: "This is an advanced chipmaking tool, despite the funny name. Sabotaging this in any way sabotages only themselves.

"Hairworks was never intentionally designed to cripple Radeon cards; instead, it was the Radeons that had (and to an extent still have, if we're talking DirectX 11) poor instancing performance.

"TressFX was then designed to largely solve the problem, although if you recall the few games that use it (such as Tomb Raider 2013) still perform relatively poorly on hardware like Tahiti and Hawaii with TressFX on."
You are correct, it shouldn't apply in this scenario.

About Hairworks: knowing Ngreedia, we cannot say so firmly that they didn't have an ulterior motive. Please remember that since they bought Ageia and removed access to PhysX, they have been trying to get people locked into their hardware with such shenanigans.

Anyway, I did a quick search, and it was both refreshing and frustrating to be reminded of what I normally say here: back then, consumers and reviewers would call out such proprietary tech, whereas now we see blind worship of DLSS.

Here are a couple of links. I won't expand much, since this is definitely off topic:

witcher/comments/36jpe9
pcgaming/comments/36xyst
Funny how history seems to repeat itself: people used to say that Hairworks and even TressFX were just gimmicks not worth the performance hit. Sounds like RT to me today. :D :D
#8
Wirko
OSdevr wrote: "The problem is that lithography today is well below the wavelength of light used to expose the photoresist. As the post suggests, you normally need to perform full electromagnetic wave simulations to get usable lithography masks. What's on the masks is very different from what's on the GDSII files that are given to the fab. And while repeating patterns like SRAM can in principle help, chips contain more than that and you need to generate very, very high resolution masks and with multiple patterning you may need to generate multiple masks for every layer (of which there are quite a few these days)."
Photoshop with a specific sharpening plug-in would easily handle that on a single PC with a GPU ... for a single 10-megapixel image. That's after the simulations of physical and chemical effects are done, including pesky non-linear effects, which requires much more than a PC - but needs to be done only once.
But I understand this is on another scale. The real images here could be about 10 megapixels across in each direction, so a hundred terapixels or so, and there are a few masks for EUV plus a few tens for DUV. Waiting weeks for the computations alone is a lot of money wasted. Therefore a few hundred H100 accelerators are probably a good investment, all the more so if TSMC had a very inefficient system until now. 40,000 CPUs, huh? Does that mean processing on CPUs, not GPUs?
It's unclear what the role of AI is in this processing. Maybe it can identify many more repeating patterns than other methods, so it can reduce the amount of computation by 2x.
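
The scale estimate in this comment can be checked with a few lines of arithmetic. The sketch below takes the comment's own guesses (10 megapixels per side, one bit per pixel, a 10-megapixel reference image) as inputs; none of them are measured values from a real mask.

```python
# Inputs taken from the comment's estimate, not from any real mask data.
PIXELS_PER_SIDE = 10_000_000   # "10 megapixels big in each direction"
TILE_PIXELS = 10_000_000       # the single 10-megapixel image used for comparison

total_pixels = PIXELS_PER_SIDE ** 2
terapixels = total_pixels // 10**12

# One bit per pixel, eight bits per byte, decimal terabytes.
raw_terabytes = total_pixels / 8 / 1e12

# How many 10-megapixel images it would take to cover one mask.
tiles = total_pixels // TILE_PIXELS

print(f"{terapixels} terapixels, {raw_terabytes} TB uncompressed, {tiles:,} tiles")
```

So the "hundred terapixels" figure follows from the stated dimensions, and a single mask at that resolution would correspond to ten million images of the size a desktop tool handles comfortably.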