Monday, March 3rd 2025

NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support
PassMark Software has identified the root cause behind unexpectedly low compute performance in NVIDIA's new GeForce RTX 5090, RTX 5080, and RTX 5070 Ti GPUs. The culprit: NVIDIA has silently discontinued support for 32-bit OpenCL and CUDA in its "Blackwell" architecture, causing compatibility issues with existing benchmarking tools and applications. The issue manifested when PassMark's DirectCompute benchmark returned the error code "CL_OUT_OF_RESOURCES (-5)" on RTX 5000 series cards. After investigation, developers confirmed that while the benchmark's primary application has been 64-bit for years, several compute sub-benchmarks still utilize 32-bit code that previously functioned correctly on RTX 4000 and earlier GPUs. This architectural change wasn't clearly documented by NVIDIA, whose developer website continues to display 32-bit code samples and documentation despite the removal of actual support.
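For developers hitting the same wall, below is a minimal, hypothetical C sketch of the kind of up-front check a host application can make before assuming GPU compute is available: report the build width of the process and the raw OpenCL status code by name, since a 32-bit build is exactly the case "Blackwell" drivers no longer serve.
[code]
#include <stdio.h>
#include <CL/cl.h>

int main( void )
{
    cl_uint num_platforms = 0;
    cl_int  err = clGetPlatformIDs( 0, NULL, &num_platforms );

    /* In a 32-bit build ( sizeof(void*) == 4 ) enumeration can already
       come back empty or fail on drivers that dropped 32-bit support. */
    printf( "Process width: %u-bit, platforms found: %u, status: %d\n",
            ( unsigned )( sizeof( void * ) * 8 ), num_platforms, err );

    if( err == CL_OUT_OF_RESOURCES )  /* -5, the code PassMark reported */
        printf( "CL_OUT_OF_RESOURCES: device resources unavailable to this process\n" );
    return 0;
}
[/code]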
The impact extends beyond benchmarking software. Applications built on legacy CUDA infrastructure, including technologies like PhysX, will experience significant performance degradation as computational tasks fall back to CPU processing rather than utilizing the GPU's parallel architecture. While this fallback mechanism allows older applications to run on the RTX 40 series and prior hardware, the RTX 5000 series handles these tasks exclusively through the CPU, resulting in substantially lower performance. PassMark is currently working to port the affected OpenCL code to 64-bit, allowing proper testing of the new GPUs' compute capabilities. However, they warn that many existing applications containing 32-bit OpenCL components may never function properly on RTX 5000 series cards without source code modifications. The benchmark developer also notes this change doesn't fully explain poor DirectX9 performance, suggesting additional architectural changes may affect legacy rendering pathways. PassMark updated its software today, but legacy benchmarks could still suffer. Below is an older benchmark run without the latest PassMark V11.1 build 1004 patches, showing just how much the newest generation suffers without proper software support.
Sources:
PassMark on X, via Tom's Hardware
74 Comments on NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support
Until then, keep up the good fight.
It's also pretty much the only game in town on AMD, and they do have market share. So, not dead.
If you are asking me, I've toyed with it yes. Can't really discuss what I use it for (my job is quite sensitive these days).
Of course it could, and that is why any processing (benchmarking, in this case) needs to be done after a set of verifications.
In the OpenCL programming world that set of verifications needs to be completed during initialization. This is how it looks in my code:
[code]
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_MAX_COMPUTE_UNITS,
                        sizeof( CLuint ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_MAX_COMPUTE_UNITS : %12u\n"), ( CLuint )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_MAX_MEM_ALLOC_SIZE: %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_GLOBAL_MEM_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_GLOBAL_MEM_SIZE   : %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_LOCAL_MEM_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_LOCAL_MEM_SIZE    : %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
[/code]
In my OpenCL code, as soon as these steps are completed, some run-time values are updated, and only after that does processing continue.
It is very important to pay attention to all memory-size-related values, because they are different for the 32-bit and 64-bit OpenCL drivers of the same OpenCL platform!
The 32-bit memory-related values are usually lower than the 64-bit values for the platform.
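For reference, the same checks look like this against the plain OpenCL C API (clGetDeviceInfo directly, no wrapper macros); CL_DEVICE_ADDRESS_BITS tells you whether the driver exposes a 32-bit or a 64-bit address space. The helper name here is just for illustration:
[code]
#include <stdio.h>
#include <CL/cl.h>

/* Print the values that differ between 32-bit and 64-bit OpenCL drivers. */
static void print_device_limits( cl_device_id dev )
{
    cl_uint  bits = 0;
    cl_ulong max_alloc = 0, global_mem = 0, local_mem = 0;

    clGetDeviceInfo( dev, CL_DEVICE_ADDRESS_BITS,       sizeof( bits ),       &bits,       NULL );
    clGetDeviceInfo( dev, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof( max_alloc ),  &max_alloc,  NULL );
    clGetDeviceInfo( dev, CL_DEVICE_GLOBAL_MEM_SIZE,    sizeof( global_mem ), &global_mem, NULL );
    clGetDeviceInfo( dev, CL_DEVICE_LOCAL_MEM_SIZE,     sizeof( local_mem ),  &local_mem,  NULL );

    printf( "CL_DEVICE_ADDRESS_BITS       : %u\n",   bits );
    printf( "CL_DEVICE_MAX_MEM_ALLOC_SIZE : %llu\n", ( unsigned long long )max_alloc );
    printf( "CL_DEVICE_GLOBAL_MEM_SIZE    : %llu\n", ( unsigned long long )global_mem );
    printf( "CL_DEVICE_LOCAL_MEM_SIZE     : %llu\n", ( unsigned long long )local_mem );
}
[/code]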
As I've already mentioned, OpenCL device initialization is a multi-step process, and memory is allocated only after all of these steps have completed successfully:
[code]
...
iOk = OclGetPlatformIDs( 0, RTnull, &uiNumOfPlatforms );
if( iOk != CL_SUCCESS )
    break;
if( uiNumOfPlatforms > _RTNUMBER_OF_PLATFORMS )
    break;
iOk = OclGetPlatformIDs( uiNumOfPlatforms, &clPlatformId[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
iOk = OclGetPlatformInfo( clPlatformId[ iPlatformId ], CL_PLATFORM_NAME, 64, &g_szPlatformName[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
OclPrintf2( OTU("\tPlatform Name : %s\n"), &g_szPlatformName[0] );
iOk = OclGetDeviceIDs( clPlatformId[ iPlatformId ], iDeviceType, 1, &clDeviceId, RTnull );
if( iOk != CL_SUCCESS )
{
    OclPrintf2( OTU("\tDevice of selected type is Not supported: %d\n"), iOk );
    break;
}
iOk = OclGetDeviceInfo( clDeviceId, CL_DEVICE_NAME, 64, &g_szDeviceName[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
// Skip any leading spaces in the returned device name
RTint n = 0;
while( g_szDeviceName[n] == ' ' )
    n += 1;
OclPrintf2( OTU("\tDevice Name : %s\n"), &g_szDeviceName[n] );
clContext = OclCreateContext( RTnull, 1, &clDeviceId, RTnull, RTnull, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clContext == RTnull )
    break;
CLCommandQueueProperties clQueueProps = 0;
clQueueProps |= CL_QUEUE_PROFILING_ENABLE;
clCommandQueue = OclCreateCommandQueue( clContext, clDeviceId, clQueueProps, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clCommandQueue == RTnull )
    break;
clProgram = OclCreateProgramWithSource( clContext, 1, &szKernelFunction02I, RTnull, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clProgram == RTnull )
    break;
iOk = OclBuildProgram( clProgram, 1, &clDeviceId, RTnull, RTnull, RTnull );
if( iOk != CL_SUCCESS )
    break;
clKernel = OclCreateKernel( clProgram, "KernelMemSetI", &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clKernel == RTnull )
    break;
if( iDataSetSize == 0 )
    break;
piDataSet1 = ( CLint * )CrtMalloc( iDataSetSize * sizeof( CLint ) );
if( piDataSet1 == RTnull )
    break;
// Zero-initialize the host-side data set ( note the [i] index )
for( i = 0; i < iDataSetSize; i += 1 )
    piDataSet1[i] = 0;
...
[/code]
I remember that the error CL_OUT_OF_RESOURCES ( -5 ) was always related to an attempt to allocate device memory exceeding the CL_DEVICE_MAX_MEM_ALLOC_SIZE, CL_DEVICE_GLOBAL_MEM_SIZE, or CL_DEVICE_LOCAL_MEM_SIZE values.
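A minimal sketch of the kind of guard I mean, using the plain OpenCL C API (the create_buffer_checked helper name and its error convention are just for illustration):
[code]
#include <stdio.h>
#include <CL/cl.h>

/* Hypothetical helper: create a buffer only after verifying the request
   against the device limits, refusing it up front instead of waiting for
   the driver to fail. Returning CL_OUT_OF_RESOURCES here mirrors the
   error code discussed in this thread. */
cl_mem create_buffer_checked( cl_context ctx, cl_device_id dev,
                              size_t bytes, cl_int *err )
{
    cl_ulong max_alloc = 0, global_mem = 0;

    clGetDeviceInfo( dev, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                     sizeof( max_alloc ), &max_alloc, NULL );
    clGetDeviceInfo( dev, CL_DEVICE_GLOBAL_MEM_SIZE,
                     sizeof( global_mem ), &global_mem, NULL );

    if( ( cl_ulong )bytes > max_alloc || ( cl_ulong )bytes > global_mem )
    {
        fprintf( stderr, "Request of %zu bytes exceeds device limits "
                         "(max alloc %llu, global %llu)\n",
                 bytes, ( unsigned long long )max_alloc,
                 ( unsigned long long )global_mem );
        *err = CL_OUT_OF_RESOURCES;
        return NULL;
    }
    return clCreateBuffer( ctx, CL_MEM_READ_WRITE, bytes, NULL, err );
}
[/code]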
Not to mention people still play older games and use older software.
Really, it's Nvidia who thought this out poorly.
They cut it off based on hardware generation and seem not to have provided any warning that it was going to happen. The difference is that Windows still has the WoW64 subsystem; Nvidia needs to provide a translation layer, which you did mention at the end of your post.
Latest Nvidia certified driver
Latest Intel certified driver
What exactly DO you think people are developing with on AMD cards? Because if there's an alternative, maybe you could actually teach me something (it may be that, by virtue of me being on Gentoo Linux now, the open-source Mesa stack props it up, come to think of it).
But still, they recommend using HIP instead of OpenCL for ROCm.
Tech. Just when you think you've learned something cool, it's obsolete, lol.
The end of a code block is marked with [/code]; the start of the code block is the same tag, without the slash, for example [code].
I'm not sure that OpenCL is being used properly with all those break statements.
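(For what it's worth, those breaks look like the common do { ... } while(0) error-handling idiom in the C host code, not OpenCL C itself; a minimal sketch of the pattern, with stub steps standing in for the OclGetPlatformIDs()/OclCreateContext() calls above:)
[code]
#include <stdio.h>

/* Stub steps standing in for the real initialization calls. */
static int step_one( void ) { return 0; }
static int step_two( void ) { return 0; }

int main( void )
{
    int ok = -1;
    do
    {
        if( step_one() != 0 )  /* e.g. enumerate platforms */
            break;
        if( step_two() != 0 )  /* e.g. create the context  */
            break;
        ok = 0;                /* all steps succeeded      */
    } while( 0 );              /* every break lands here   */

    printf( ok == 0 ? "initialized\n" : "initialization failed\n" );
    return ok;
}
[/code]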
I hardly see any additional information in posting a huge wall of code without using the code blocks. Where is the additional information, without an explanation of what is being shown?
Finding the user- or AI-generated comments in such a wall of code is not easy, and grasping why it was posted in the first place is not easy even for me, with decent C knowledge. I think Nvidia is against broad support for CUDA elsewhere. I remember some software which could run CUDA on other hardware, and I think there were legal issues from NVIDIA; I think I read about it here in the past months.
Just as a starting point.
www.tomshardware.com/pc-components/gpus/amd-asks-developer-to-take-down-open-source-zluda-dev-vows-to-rebuild-his-project
When someone knows the details better, please fill in the details.
www.tomshardware.com/pc-components/gpus/nvidia-bans-using-translation-layers-for-cuda-software-to-run-on-other-chips-new-restriction-apparently-targets-zluda-and-some-chinese-gpu-makers
Note: A few minutes ago I read some more comments on my Android tablet. Those are now on the well-deserved ignore list. Just don't. Stick to the topic, please.
--
You buy an Nvidia graphics card and then you expect very old software to be supported? You can't be serious! A closed-source binary Windows blob.
>>...I'm not sure if that opencl language is proper used with all those /break statements....
Do Not go personal and do Not teach experienced Software Engineers how to implement some functionality.