Wednesday, May 22nd 2024

Qualcomm's Success with Windows AI PC Drawing NVIDIA Back to the Client SoC Business

NVIDIA is eyeing a comeback to the client processor business, reveals a Bloomberg interview with the CEOs of NVIDIA and Dell. For NVIDIA, all it takes is a simple driver update that exposes every GeForce GPU with tensor cores as an NPU to Windows 11, with translation layers to get popular client AI apps to work with TensorRT. But that requires a discrete NVIDIA GPU. What about the vast market of Windows AI PCs powered by the likes of Qualcomm, Intel, and AMD, each of which sells 15 W-class processors with integrated NPUs capable of 50 AI TOPS, which is all that Copilot+ needs? NVIDIA has held an Arm license for decades and makes Arm-based CPUs to this day with NVIDIA Grace; however, that is a large server processor meant for its AI GPU servers.
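As a rough illustration of that translation-layer idea (a sketch, not NVIDIA's actual implementation): a client app that already runs its models through ONNX Runtime can be pointed at the GeForce's tensor cores simply by preferring the TensorRT and CUDA execution providers. The model file and input shape below are placeholders.

```python
# Hypothetical sketch: steer an existing ONNX model onto GeForce tensor cores.
# "model.onnx" and the 1x3x224x224 input are placeholders, not a real Copilot+ workload.
import numpy as np
import onnxruntime as ort

# Prefer TensorRT, fall back to plain CUDA, then CPU, skipping anything not installed.
wanted = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in wanted if p in ort.get_available_providers()]

session = ort.InferenceSession("model.onnx", providers=providers)

dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {session.get_inputs()[0].name: dummy_input})
print("Backend actually used:", session.get_providers()[0])
```

The heavy lifting happens inside the execution provider, which is why apps written against a generic runtime wouldn't care whether a physical NPU or a tensor-core GPU sits underneath.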

NVIDIA already made client processors under the Tegra brand targeting smartphones, a business it wound down last decade. It has since been making Drive PX processors for its automotive self-driving hardware division; and of course there's Grace. NVIDIA hinted that it might have a client CPU for the AI PC market in 2025. In the interview, Bloomberg asked NVIDIA CEO Jensen Huang a pointed question on whether NVIDIA has a place in the AI PC market. Dell CEO Michael Dell, who was also in the interview, interjected "come back next year," to which Jensen affirmed "exactly." Dell would be in a front-and-center position to know if NVIDIA is working on a new PC processor for launch in 2025, and Jensen's nod all but confirms it.
NVIDIA has both the talent and the IP to whip up a PC processor: its teams behind Grace and Drive can create the Arm CPU cores, NVIDIA is already the big daddy of consumer graphics and should have little problem with the iGPU, and the NPU shouldn't be hard to create, either. It wouldn't surprise us if the NPU on NVIDIA's chip isn't a physical component at all, but a virtual device that uses the iGPU's tensor cores as its hardware backend.

NVIDIA's journey to the AI PC has one little hurdle, and that is the exclusivity Qualcomm enjoys with Microsoft for the current crop of Windows-on-Arm notebooks, with its Snapdragon X series chips. NVIDIA would have to work with Microsoft to have the same market access as Qualcomm.

If all goes well, the NVIDIA PC processor powering AI PCs will launch in 2025.
Sources: Bloomberg (YouTube), Videocardz

34 Comments on Qualcomm's Success with Windows AI PC Drawing NVIDIA Back to the Client SoC Business

#26
Eternit
Anyway. ChatGPT is now dead, and so are Copilot and many services using the Bing/ChatGPT API.
#27
human_error
Nvidia could always make an NPU PCIe add-in card/chip that Dell or whoever else could add to their products. It would be decoupled from the GPU, so it could work on an Intel, AMD, or Arm-based system, and be substantially faster than the NPUs on the CPU package (or even let Dell use non-NPU CPU packages).
#28
hsew
human_error: Nvidia could always make an NPU PCIe add-in card/chip that Dell or whoever else could add to their products. It would be decoupled from the GPU, so it could work on an Intel, AMD, or Arm-based system, and be substantially faster than the NPUs on the CPU package (or even let Dell use non-NPU CPU packages).
If Nvidia did this, it would 1000% be a datacenter-only product.
#29
human_error
hsew: If Nvidia did this, it would 1000% be a datacenter-only product.
No need for basic NPUs in data centers - that's where they'll sell you a GPU farm. This could be a low-power, low-cost, high-volume part for laptops/handhelds. If they make the NPU API accessible and match it to their GPU APIs, it could be a way of building a new ecosystem for them, similar to CUDA.
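For what it's worth, the app side already works roughly that way today: software written against ONNX Runtime lists the accelerators it can use, and the runtime picks whatever is present. A rough sketch under that assumption; the NVIDIA NPU provider name below is made up, while the QNN (Qualcomm NPU), DirectML, CUDA, and CPU providers are real ONNX Runtime backends, and the model name is a placeholder.

```python
# Sketch: vendor-neutral accelerator selection, the way a CUDA-like NPU ecosystem
# could surface to apps. "NvNpuExecutionProvider" is hypothetical; the others exist.
import onnxruntime as ort

preferred = [
    "NvNpuExecutionProvider",   # hypothetical NVIDIA NPU backend
    "QNNExecutionProvider",     # Qualcomm Hexagon NPU (Snapdragon X)
    "DmlExecutionProvider",     # DirectML: any DX12 GPU, GeForce included
    "CUDAExecutionProvider",    # NVIDIA dGPU via CUDA
    "CPUExecutionProvider",     # always available
]

available = ort.get_available_providers()
providers = [p for p in preferred if p in available] or ["CPUExecutionProvider"]

session = ort.InferenceSession("assistant_model.onnx", providers=providers)
print("Running on:", session.get_providers()[0])
```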
#30
hsew
human_error: No need for basic NPUs in data centers - that's where they'll sell you a GPU farm. This could be a low-power, low-cost, high-volume part for laptops/handhelds. If they make the NPU API accessible and match it to their GPU APIs, it could be a way of building a new ecosystem for them, similar to CUDA.
Nah, the consumer space already knows Nvidia as the gaming graphics company (GeForce/Now), whereas the datacenter knows them for compute/CUDA.

Besides, NPUs are only breaking into the client space as an SoC/GPU integration (keyword). Given that 99.9999% of today's client devices are primarily SoC with an optional dGPU, selling yet another discrete, dis-integrated component to a consumer is just not a bet I'd make as Nvidia.

In short, I'm not saying Nvidia would do a discrete NPU when they already have GPUs, but if they did, it would almost certainly NOT be a consumer product.
#31
R0H1T
human_error: low-cost, high-volume part for laptops/handhelds.
When was the last time you saw that from Green goblins? Switch doesn't count.
#32
ikjadoon
Darmok N Jalad: Won't things like GPUs need to have ARM-specific drivers? We have one ARM-based desktop with standard PCIe expansion slots that I know of, the Mac Pro. Unlike the x86 Mac Pro that it replaced, it doesn't support standard GPUs, and many other kinds of PCIe cards are not compatible on the ARM Mac versus the x86 Mac. I don't know the ins and outs of hardware-level drivers, but wouldn't WoA desktops have a similar problem?

And yeah, I don't know that NVIDIA needs to go full-custom. They could pull the architecture off the shelf and probably get more out of it by using advanced nodes like Apple does. It sure seems like they could easily answer Snapdragon if they wanted to, and now there's a window of opportunity for such devices. It makes me wonder if MS hasn't already asked NVIDIA, and NVIDIA wasn't interested. Or maybe MS didn't want to deal with NVIDIA, I dunno.
Exactly: NVIDIA will need to produce / develop / test / ship ARM64 WoA drivers for their GPUs, which they have never done. Presumably, if they are making "AI PCs" as Jensen alludes to, they'll need to port their drivers to WoA.

Apps can be emulated, but drivers really do need to be native. NVIDIA has GPU drivers for Linux on Arm, but not Windows on Arm.

Many Arm-based systems have PCIe (e.g., in the datacenter), so it's not a hardware limitation (PCIe is much more abstracted from the CPU ISA). The Ampere Altra desktop is also Arm-based with PCIe expansion. Interestingly, this may be the system Linus Torvalds now uses.

//
R0H1T: First of all, that's just an estimate; it's also missing FP numbers, so it's barely half the story.

Meanwhile in the real world we have ~


www.phoronix.com/review/nvidia-gh200-amd-threadripper
openbenchmarking.org/result/2402191-NE-GH200THRE98&sgm=1&ppd_U3lzdGVtNzYgVGhlbGlvIE1ham9yIHI1IC0gVGhyZWFkcmlwcGVyIDc5ODBY=15076&ppd_SFAgWjYgRzUgQSAtIFRocmVhZHJpcHBlciBQUk8gNzk5NVdY=30041&ppd_R1BUc2hvcC5haSAtIE5WSURJQSBHSDIwMA=42500&ppt=D&sor

It's easy to forget how bandwidth-starved regular Zen4 chips are; I think I saw that analysis on Chips & Cheese. With more memory channels &/or higher-speed memory they easily pull way past Grace Hopper & Emerald (Sapphire?) Rapids as well. This is why Strix Point & Halo would be interesting to watch, & whether AMD can at least feed Zen5 better on desktop/mobile platforms!
It's easy to forget that most SPEC testing is an "estimate". ;) We shouldn't worry: plenty of non-SPEC benchmarks are far less reliable than a well-done SPEC estimate. Very few people submit their benchmark + methodology for independent validation to get a validated SPEC score.

You seem to not understand the actual parameters of "the real world": first, Grace uses Cortex-X3-based (Neoverse V2) cores, so this comparison is moot: NVIDIA is rumored to use the Cortex-X5. Second, much of Phoronix's testing is heavily nT, so the significantly-higher-core-count 7995WX (96 cores) is also rather irrelevant, especially with the next point. Third, the 7980X and 7995WX have 350W TDPs (and consume about that); without actual & comparable data on the GH200, this is not an interesting comparison when power draw is a key limiting factor in consumer SoCs. Fourth, Phoronix notes many times that some of the Linux benchmarks in these tests weren't optimized for AArch64 yet, so it's not much to stand on.

In the end, it's a nonsense comparison: the rumor isn't that NVIDIA is trying to replace Zen4 workstations with GH200. NVIDIA is claimed to be making consumer APUs for Windows on Arm. Linux perf, enterprise workloads, developer workloads, scientific workloads, 300W+ TDP perf, nT performance beyond 8-12 cores: all irrelevant here. SPEC was a much better estimate, even with only int, IMO.

But if we want to measure current Arm uArches vs Zen4 on 100% native code, fp & int, phones vs desktops, etc., Geekbench is the last man standing. The Cortex-X4 does fine and it's more than enough for Windows on Arm & consumer workloads, even if it's a generation behind what NVIDIA will ship: it is only available on phones, so you won't get much reliable cross-platform data.

1T Cortex-X4 smartphone: 2,287 pts - 100%
1T 7995WX workstation: 2,720 pts (or 2,702 pts) - 119%
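(Ratio math for anyone checking the percentages; a trivial sketch using only the two scores quoted above:)

```python
# Relative 1T performance from the two Geekbench scores quoted above.
cortex_x4_phone = 2287
trx_7995wx = 2720

print(f"7995WX vs Cortex-X4 (1T): {trx_7995wx / cortex_x4_phone:.0%}")  # ~119%
print(f"Cortex-X4 vs 7995WX (1T): {cortex_x4_phone / trx_7995wx:.0%}")  # ~84%
```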

It's a good thing AMD uses Geekbench for CPU perf estimates on consumer workloads, so I can happily avoid all the usual disclaimers. We'll have to see how Cortex-X5 lands, but I don't think NVIDIA's value prop. depends on "fastest 1T CPU ever for WoA": it just needs to be good enough versus Intel & AMD in 2025.

//

TL;DR: We were discussing uArches for a future NVIDIA SoC on Windows for consumers, which Phoronix is miles away from capturing.
#33
R0H1T
ikjadoon: Grace uses Cortex-X3-based (Neoverse V2) cores, so this comparison is moot: NVIDIA is rumored to use the Cortex-X5.
Right, but as it stands now Zen4 is way ahead in those workloads. I wouldn't put it beyond the realm of possibility that Nvidia can catch up, but they'll also be competing against Zen5 by then; it's a moving target.
ikjadoon: Phoronix's testing is heavily nT, so the significantly-higher-core-count 7995WX (96 cores) is also rather irrelevant
I'm not really bothered by that, since both AMD & Intel use basically "factory OC" to get great ST numbers ~ which of course makes their efficiency look bad, like in that Phoronix test. For any consumer platform Nvidia will not only have to do something about the opposition's massive clock advantage but also their massive core advantage. I'm willing to bet they won't sell 12-16 cores cheaper than AMD at the start.
ikjadoon: Third, the 7980X and 7995WX have 350W TDPs (and consume about that); without actual & comparable data on the GH200, this is not an interesting comparison when power draw is a key limiting factor in consumer SoCs.
Check my last point about OC. AMD's most "efficient" chips right now are either Zen4c or have X3D cache.
ikjadoon: Fourth, Phoronix notes many times that some of the Linux benchmarks in these tests weren't optimized for AArch64 yet, so it's not much to stand on.
The workloads can be optimized further on Intel & AMD, so it's a two-way street, although for Arm the gains should generally be higher.
#34
Random_User
TristanX: If NV joins the PC CPU pack, then Intel and AMD are in serious trouble
Looks like it. They've tried CPUs before. And since they couldn't get an x86 licence, they've seen the potential in Qualcomm's accomplishments and might want to give it another try. So nothing prevents nVidia from entering the desktop and mobile CPU race and steadily growing their share of that market, especially with the gazillions they've got from the AI surge. Making products that suit this trend helps them even more. A CPU is a CPU, after all. There's no law that says it has to be x86 only.

This seems to be serious, even if it doesn't appear so for now. nVidia might want to outplay both Intel and AMD at their own duopoly game. Time will tell.
Bwaze: But do they really want to? The market is:

"Windows AI PCs powered by the likes of Qualcomm, Intel, and AMD, who each sell 15 W-class processors with integrated NPUs capable of 50 AI TOPS"

But Nvidia clearly outgrew catering to lowly penny-pinching peasants - they practically don't offer low-end GPUs, and with every generation they delay their lower-end offerings more and more. And we can understand why - their servers and AI mainframes are what's driving the stellar growth, not home users. Why would they all of a sudden want to deal with a market that requires low margins and vast volume?
This might not be about volume at first. Knowing nVidia, they will put strong marketing behind their products so that the reputation precedes them, creating steady ground to establish themselves. That's how nVidia beat Radeon with clearly inferior products back in the day.

They might create the image even without actual product stock at first, which might shift attention away from x86 desktop dominance. Even if this turns out to be a flop, it might shake up the current desktop situation, as nVidia now controls the flow of the industry, and the mindshare. Who cares if the actual product is shit, if the brand's recognition and stock price are spilling over the edge.
hsew: Umm no. Tegra powers the Nintendo Switch (and likely Switch 2). Selling 100M+ units is not a failure…
Indeed. This might actually be the most probable beginning, as nVidia's Arm chips don't seem powerful enough for the desktop yet. Portable/mobile/handheld might be the best start, and everything the Arm cores can't handle may be covered by nVidia's proprietary GPU compute power, provided they manage to scale it to the portable form factor. And if it acquires enough success in handhelds, with the experience gained, this might transfer to the desktop as well. And then...
Eternit: There is a big assumption that Qualcomm will succeed with Windows AI PCs. For now it is just big hype, and no one knows how many units will be sold, and even if it is a success for this generation, whether it will continue with future generations.
I think this is the obvious direction for MS Windows. They made it look like a mobile OS, and run on mobile as well. They've got a huge repository of Linux-oriented stuff, and they even made Windows a bit compatible with Linux, as it's the core of most Arm-based OSes. Why does it matter? Because MS goes after user data and wants everyone dependent on their cloud services. The best way to do that is to move everyone to a client mobile device. And while x86 has made some moves into the portable market, Arm seems likely to find success on the desktop much sooner, especially if Windows gets native support for Arm CPUs.
londiste: Switch is also from 2017. With a SoC from 2015 :D

They don't need to. They are relying on ARM for CPU cores for now.
Also, Denver was pretty good back when Nvidia was trying to cook their own. Pretty sure they have the know-how.
Tegra did not fail spectacularly; it kind of slid out of our view. They pivoted from consumer stuff to automotive and industrial. Most likely due to profit margins.
Indeed. If it's out of view, that doesn't mean it's been abandoned. nVidia is after the data center/cloud business. They are both the HW maker for these and the provider of cloud services. And having a portable client device, locked ("certified for best experience") into their GeForce Now infrastructure, seems logical. It doesn't have to be a powerful desktop either, as many of the tasks run on it don't require a huge powerhouse. An office/entertainment PC can basically run on Arm; only heavy workloads and gaming require more.