Monday, March 3rd 2025

NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support
PassMark Software has identified the root cause behind unexpectedly low compute performance in NVIDIA's new GeForce RTX 5090, RTX 5080, and RTX 5070 Ti GPUs. The culprit: NVIDIA has silently discontinued support for 32-bit OpenCL and CUDA in its "Blackwell" architecture, causing compatibility issues with existing benchmarking tools and applications. The issue manifested when PassMark's DirectCompute benchmark returned the error code "CL_OUT_OF_RESOURCES (-5)" on RTX 5000 series cards. After investigation, developers confirmed that while the benchmark's primary application has been 64-bit for years, several compute sub-benchmarks still utilize 32-bit code that previously functioned correctly on RTX 4000 and earlier GPUs. This architectural change wasn't clearly documented by NVIDIA, whose developer website continues to display 32-bit code samples and documentation despite the removal of actual support.
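For illustration, here is a minimal sketch, not PassMark's actual code, of where such a failure would surface in an OpenCL host program; the function and variable names are placeholders, and the error constant comes from the standard OpenCL headers:

    /* Hypothetical sketch: how a 32-bit OpenCL binary surfaces the
     * failure described above. The call site is an assumption. */
    #include <CL/cl.h>
    #include <stdio.h>

    void run_sub_benchmark(cl_command_queue queue, cl_kernel kernel,
                           size_t global_size)
    {
        cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                            &global_size, NULL,
                                            0, NULL, NULL);
        if (err == CL_OUT_OF_RESOURCES) {
            /* -5: the code PassMark reported. On RTX 50 cards this can
             * mean the 32-bit path no longer exists, not that the GPU
             * is actually out of memory. */
            fprintf(stderr, "CL_OUT_OF_RESOURCES (%d)\n", err);
        }
    }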
The impact extends beyond benchmarking software. Applications built on legacy CUDA infrastructure, including technologies like PhysX, will see significant performance degradation as computational tasks fall back to CPU processing instead of using the GPU's parallel architecture. On the RTX 40 series and earlier hardware these 32-bit workloads still run on the GPU, but the RTX 50 series handles them exclusively through the CPU, resulting in substantially lower performance. PassMark is currently working to port the affected OpenCL code to 64-bit, allowing proper testing of the new GPUs' compute capabilities. However, they warn that many existing applications containing 32-bit OpenCL components may never function properly on RTX 50 series cards without source code modifications. The benchmark developer also notes this change doesn't fully explain the poor DirectX 9 performance, suggesting additional architectural changes may affect legacy rendering pathways. PassMark updated its software today, but legacy benchmarks could still suffer. Below is an older benchmark run without the latest PassMark V11.1 build 1004 patches, showing just how much the newest generation suffers without proper software support.
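As a hypothetical illustration of why a 64-bit rebuild can require source changes, consider a pattern common in 32-bit-era code (none of this is PassMark's code):

    /* Hypothetical 32-bit-era pattern: stashing a pointer in a 32-bit
     * integer. Fine on a 32-bit build; truncates addresses on 64-bit. */
    #include <stdint.h>

    unsigned int stash_broken(void *p)
    {
        return (unsigned int)(uintptr_t)p; /* loses the upper 32 bits */
    }

    /* Pointer-width-safe replacement for both targets. */
    uintptr_t stash_fixed(void *p)
    {
        return (uintptr_t)p;
    }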
Sources:
PassMark on X, via Tom's Hardware
Comments
Everything that worked before still works, just not with newer drivers. It's really nothing new; this is how Nvidia has always handled deprecation: they park stuff in a legacy branch, and the branch still receives security and other fixes for years to come, just not as often.
Edit: If anything, the news is that software that pretends to be 64-bit is still 32-bit under the hood.
It may just be trimming down their stack, nothing more complicated than that. We are not specifically talking about the GPU itself here. If you write CUDA code on any device, even a Windows QCOM laptop, and try to build it with the latest NVCC, it won't cross-compile 32-bit software at all. Keep in mind that this "compiled" code usually turns into PTX, which is Nvidia's IR (similar to LLVM's IR) and is not directly executed by the GPU.
If you have software that was built with a previous version, back when 32-bit still worked, the runtime won't execute it if you're targeting Blackwell. Now I may be wrong, but AFAIK this is a limitation within the runtime SDK even before it makes any call to the GPU, since the runtime is the part responsible for turning the above PTX into actual code the GPU can run; so it's mostly a total lack of support from the software stack itself.
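A quick sketch of the compiler half of this, assuming a recent CUDA 12.x toolkit; the file name and kernel are made up for the example:

    /* saxpy.cu -- a trivial kernel, only here to exercise the compiler. */
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    /* 64-bit compilation to PTX (the IR mentioned above) works:
     *     nvcc -m64 -ptx saxpy.cu
     * 32-bit compilation is rejected outright; recent toolkits abort
     * with a fatal error saying 32-bit compilation is unsupported:
     *     nvcc -m32 -ptx saxpy.cu
     */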
Reminder that everything has to be turned into a compatible binary, even a regular shader. So why is there a compiler that can turn regular shaders into 32-bit machine code (presumably; maybe it promotes everything to 64-bit, but I doubt it) but can't do the same for 32-bit PTX? Perhaps some incompatible instructions can be issued through 32-bit PTX that can't be issued through shaders, hence the lack of support, rather than "they couldn't be bothered," which seems implausible.
I mean, nothing really encourages development of 32-bit applications in general these days, so why not drop support for literally everything 32-bit running on the GPU? Why just 32-bit PTX?
PassMark should update their benchmark suite to be fully optimized for 64-bit environments and drop 32-bit support from their application entirely, since 32-bit Windows was dropped more than half a decade ago.
Just capitalize on all the negative press (not just this software stuff). That's it
From design to manufacture, a new generation of GPUs takes A LOT OF TIME to make, so I would ASSUME nVidia knew about whatever it is that prevents them from enabling 32-bit support A LOT EARLIER than when they disclosed the information to software companies: WHY WAIT UNTIL NOW to do it?
Windows 95 - first full 32 bit OS.
Windows 98 - 32 bit only
Windows 2000 - 32 bit only
Windows XP - 2001 - 32 bit and 64 bit
Windows Vista - 32 bit and 64 bit
Windows 7 - 32 bit and 64 bit
Windows 10 - 2015 - 32 bit and 64 bit
Windows 11 - 64 bit only
We've been living in a mixed 32-bit and 64-bit Windows world for 23 full years. I can absolutely see people who want to push technology ever forward saying the 32-bit instructions should have been obsoleted 8 years ago...but at that time there were plenty of people still on 32-bit systems who couldn't even use your 64-bit stuff...and that's why people basically assumed that, without good reason, these things would not disappear. Heck, simply translating the 32-bit instructions to run in 64-bit at basically half the efficiency (given half the bits would be filler) makes sense, given that there are still people who can and do rock Windows XP...let alone those rocking 7 and 10, both of which had very popular 32-bit versions.
So it's amusing to me to see people want to fight about a benchmark having issues because Nvidia cut off code...when most of the time we hear about Nvidia and AMD optimizing for certain loads and benchmarks. It means Nvidia either didn't understand what damage this would do or didn't care. If the former, it's weird to conceive of them optimizing for said benchmark. If the latter, both AMD and Intel will get artificially higher relative scores because they just didn't stop doing older things. That's...well, I take it as amusement, given they don't otherwise seem to be having a good time after basically treating gamers as second-class consumers, because AI is splashing so much cash that they literally won't care about us until the AI bubble collapses. Payback's pretty frustrating, especially if it's something you never concern yourself with. Hopefully Nvidia's decision to fully drop support gets at least partially reversed into a translation layer, so this isn't a uniform damaging of performance in older games for no reason.
No developer can really pull the "we didn't know it was coming" card, IMHO. Missed the news?
And I hate having to repeat myself, but...
HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA.....GOTCHA Suckas !
In the same vein, we can't even be sure whether it's the actual hardware that doesn't support this, or just the software stack. That question doesn't really arise with CUDA since, as far as I know, the PTX is actually JITed into the actual SASS used by the underlying hardware. IIRC, this also applies to most (if not all) other APIs on top of GPUs.
The above is a bit pedantic, so feel free to ignore it if what you meant was something broader. Can it? I know that on Linux you can't do so with Vulkan; not sure about Windows.
Nonetheless, in case it can, it corroborates what I said: the support was dropped from the software stack. The Vulkan/DirectX/whatever API you're running still supports JITing 32-bit code into the proper SASS. The CUDA driver, however, has no such support for Blackwell specifically. The GPU itself is not even aware of that.
Reminder that for each API, you have a specific driver that handles submissions to the GPU.
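Roughly, here's where that lowering happens for CUDA, sketched with the driver API; the PTX string is elided, and on Blackwell it is this load step, not the silicon, that would refuse 32-bit PTX:

    /* Sketch: the driver JITs PTX into device-specific SASS at module
     * load time. Error handling trimmed for brevity. */
    #include <cuda.h>
    #include <stdio.h>

    int main(void)
    {
        const char *ptx = "..."; /* PTX produced earlier by nvcc (elided) */
        CUdevice dev;
        CUcontext ctx;
        CUmodule mod;

        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);

        /* This call is the JIT: PTX -> SASS for this specific GPU.
         * Dropping 32-bit PTX support here never touches the hardware. */
        CUresult rc = cuModuleLoadDataEx(&mod, ptx, 0, NULL, NULL);
        if (rc != CUDA_SUCCESS) {
            const char *msg;
            cuGetErrorString(rc, &msg);
            fprintf(stderr, "module load failed: %s\n", msg);
        }
        return 0;
    }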
With the above said, there are other runtimes/compilers that let you emit 32-bit code and will support it in some way. With open-source drivers, nothing stops you from emitting SPIR-V (not PTX) any way you want and JITing it in the appropriate manner. Idk, but I believe they are slowly moving toward that. But they have been saying so for years; that's what a deprecation notice is. Once something is deprecated, killing it can happen at any time, and Nvidia announced it as soon as the new CUDA version supporting Blackwell was released.
If you were to do the same and give the maximum benefit of the doubt, 20+ years when you could buy a 64-bit or a 32-bit OS means that some people had the choice for their entire computing lives...until very recently. And we're not talking toddlers; we're talking people old enough to vote and drive (or drink and vote, depending on your region). That's kind of a silly long time to maintain something you always said was no longer supported, but kept supporting. I can't make judgements, but it seems a bit silly to gimp performance eight years later...almost like somebody finally decided to do some spring cleaning of code "that nobody used, obviously." Then they found out somebody did still use it...which makes them either incompetent enough not to understand, incompetent enough not to have cleaned it up until now, or simply silly enough to add more fuel to the 20-car pile-up that Blackwell has been.
I...like to think this was a low-level developer finally cleaning out old code, looking to boost efficiency at no cost, whose dream of fixing things backfired because they didn't understand that the reason nobody had bothered to fix it was that the fix would hurt more than letting that stupid black box of code fester at virtually negligible overhead...because it's a beautiful tribute to "don't fix what ain't broken," an axiom more engineers should live by.
You want to play old PhysX games? Keep an older PC around to play games from 15-20+ years ago, or stick to the 40 series. Life goes on. I'm not complaining that they stopped manufacturing PS2s...
developer.download.nvidia.com/compute/cuda/6_0/rel/docs/CUDA_Toolkit_Release_Notes.pdf