Monday, March 3rd 2025

NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support

PassMark Software has identified the root cause behind unexpectedly low compute performance in NVIDIA's new GeForce RTX 5090, RTX 5080, and RTX 5070 Ti GPUs. The culprit: NVIDIA has silently discontinued support for 32-bit OpenCL and CUDA in its "Blackwell" architecture, causing compatibility issues with existing benchmarking tools and applications. The issue manifested when PassMark's DirectCompute benchmark returned the error code "CL_OUT_OF_RESOURCES (-5)" on RTX 5000 series cards. After investigation, developers confirmed that while the benchmark's primary application has been 64-bit for years, several compute sub-benchmarks still utilize 32-bit code that previously functioned correctly on RTX 4000 and earlier GPUs. This architectural change wasn't clearly documented by NVIDIA, whose developer website continues to display 32-bit code samples and documentation despite the removal of actual support.
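For developers wondering whether their own binaries are exposed to this, the first question is simply whether a given component is built as 32-bit. The following is a minimal sketch in C, assuming pointer width is an adequate proxy for the build target. The names `pointer_bits`, `compute_path_supported`, and `report_build_width` are hypothetical and are not part of PassMark's code or of any NVIDIA API.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Width of a pointer in bits for this build: 32 in a 32-bit process,
   64 in a 64-bit process. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Hypothetical policy check: returns 1 if a GPU compute path should be
   attempted, 0 if the build is 32-bit and the GPU only accepts 64-bit
   OpenCL/CUDA clients (as reported for the RTX 50 series). */
int compute_path_supported(unsigned bits, int gpu_requires_64bit)
{
    assert(bits == 32u || bits == 64u);
    return !(gpu_requires_64bit && bits < 64u);
}

/* Prints a one-line diagnostic describing the current build. */
void report_build_width(void)
{
    printf("This binary is a %u-bit build\n", pointer_bits());
}
```

A check like this only tells an application what it was compiled as. Whether the installed driver actually exposes a 32-bit compute runtime still has to be discovered at run time, for example by handling the error returned during context or queue creation.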

The impact extends beyond benchmarking software. Applications built on legacy 32-bit CUDA infrastructure, including technologies like PhysX, will experience significant performance degradation as computational tasks fall back to CPU processing rather than utilizing the GPU's parallel architecture. This fallback keeps older applications running, but whereas the RTX 40 series and prior hardware execute these tasks on the GPU, the RTX 5000 series handles them exclusively on the CPU, resulting in substantially lower performance. PassMark is currently working to port the affected OpenCL code to 64-bit, allowing proper testing of the new GPUs' compute capabilities. However, they warn that many existing applications containing 32-bit OpenCL components may never function properly on RTX 5000 series cards without source code modifications. The benchmark developer also notes this change doesn't fully explain poor DirectX9 performance, suggesting additional architectural changes may affect legacy rendering pathways. PassMark updated its software today, but legacy benchmarks could still suffer. Below is an older benchmark run without the latest PassMark V11.1 build 1004 patches, showing just how much the newest generation suffers without proper software support.
Sources: PassMark on X, via Tom's Hardware

74 Comments on NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support

#51
Visible Noise
bonehead123 wrote: Anutha day, anutha big, fat F.U. from nGreediya to consumer GPU buyers everywhere...:mad:..:eek:..:twitch:

And, I hate having to repeat my self, but..

HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA.....GOTCHA Suckas !
Username fits.
Hecate91 wrote: This whataboutism isn't mature either, I'm so tired of seeing "but but other brand doesn't have the feature" because fans refuse to criticize their beloved brand no matter how shitty they're being to consumers and developers.
Poor AMD. Maybe someday they will get ROCm running on Windows, right.

Until then, keep up the good fight.
Posted on Reply
#52
R-T-B
Visible Noise wrote: OpenCL?
OpenCL is a minority player for sure but it's far from dead.
Posted on Reply
#53
Visible Noise
R-T-B wrote: OpenCL is a minority player for sure but it's far from dead.
It's been gone for a decade. Do you have any apps that use it?
Posted on Reply
#54
R-T-B
Visible Noise wrote: It's been gone for a decade. Do you have any apps that use it?
It was huge in the crypto boom which was less than a decade ago.

It's also pretty much the only game in town on AMD, and they do have market share. So, not dead.

If you are asking me, I've toyed with it yes. Can't really discuss what I use it for (my job is quite sensitive these days).
Posted on Reply
#55
Visible Noise
R-T-B wrote: It's also pretty much the only game in town on AMD, and they do have market share. So, not dead.
Did you miss my post showing that AMD dropped support after 2015?
Posted on Reply
#56
Dr. Dro
OpenCL still runs on AMD (and NV), it just hasn't been updated in forever iirc
Posted on Reply
#57
igormp
R-T-B wrote: It was huge in the crypto boom which was less than a decade ago.

It's also pretty much the only game in town on AMD, and they do have market share. So, not dead.

If you are asking me, I've toyed with it yes. Can't really discuss what I use it for (my job is quite sensitive these days).
AMD nowadays favor their ROCm stack instead of opencl.
Posted on Reply
#58
ScaLibBDP
Vya Domus wrote: That error can come up for many different reasons, not to mention that if this was the case it would also crash on other cards which do have support for 32bit.
>>...That error can come up for many different reasons...

Of course it could, and that is why any processing, benchmarking in this case, needs to be done after a set of verifications.

In OpenCL programming, the set of verifications needs to be completed during initialization, and this is how it looks in my code:
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_MAX_COMPUTE_UNITS,
                        sizeof( CLuint ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_MAX_COMPUTE_UNITS : %12u\n"), ( CLuint )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_MAX_MEM_ALLOC_SIZE: %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_GLOBAL_MEM_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_GLOBAL_MEM_SIZE : %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
iOk = OclGetDeviceInfo( clDeviceId[ uiCurrentDevice ], CL_DEVICE_LOCAL_MEM_SIZE,
                        sizeof( CLulong ), ( CLvoid * )&ulPropValue, &uiRetValue );
OclPrintf2( OTU("\t\tCL_DEVICE_LOCAL_MEM_SIZE : %12.0f bytes\n"), ( CLfloat )ulPropValue );
...
In my OpenCL code, as soon as these steps are completed, some run-time values are updated, and only after that does processing continue.

It is very important to pay attention to all memory size related values because they are different for 32-bit and 64-bit OpenCL drivers for an OpenCL platform!

32-bit memory related values are usually lower than 64-bit values for the OpenCL platform.
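A minimal sketch of how such a pre-allocation check could look in plain C, assuming the limits have already been queried from the device. The `device_limits` struct, the `alloc_check` enum, and `check_buffer_request` are hypothetical names for illustration only and are not taken from the code in this post or from the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for limits previously read from the device. */
typedef struct {
    uint64_t max_mem_alloc_size; /* value of CL_DEVICE_MAX_MEM_ALLOC_SIZE */
    uint64_t global_mem_size;    /* value of CL_DEVICE_GLOBAL_MEM_SIZE */
} device_limits;

typedef enum {
    ALLOC_OK = 0,
    ALLOC_ZERO_SIZE,
    ALLOC_OVERFLOW,
    ALLOC_EXCEEDS_MAX_ALLOC,
    ALLOC_EXCEEDS_GLOBAL_MEM
} alloc_check;

/* Validates a buffer request of elem_count elements of elem_size bytes
   against the device limits before any device allocation is attempted. */
alloc_check check_buffer_request(const device_limits *lim,
                                 uint64_t elem_count, uint64_t elem_size)
{
    assert(lim != NULL);

    if (elem_count == 0 || elem_size == 0)
        return ALLOC_ZERO_SIZE;

    /* Reject requests whose byte size would wrap around 64 bits. */
    if (elem_count > UINT64_MAX / elem_size)
        return ALLOC_OVERFLOW;

    uint64_t bytes = elem_count * elem_size;

    if (bytes > lim->max_mem_alloc_size)
        return ALLOC_EXCEEDS_MAX_ALLOC;
    if (bytes > lim->global_mem_size)
        return ALLOC_EXCEEDS_GLOBAL_MEM;

    return ALLOC_OK;
}
```

Because the limits are passed in as data, the same check works unchanged whether the values came from a 32-bit or a 64-bit driver.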

As I've already mentioned, OpenCL device initialization is a multi-step process, and memory is allocated only after all these steps have completed successfully:
...
iOk = OclGetPlatformIDs( 0, RTnull, &uiNumOfPlatforms );
if( iOk != CL_SUCCESS )
    break;
if( uiNumOfPlatforms > _RTNUMBER_OF_PLATFORMS )
    break;

iOk = OclGetPlatformIDs( uiNumOfPlatforms, &clPlatformId[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
iOk = OclGetPlatformInfo( clPlatformId[ iPlatformId ], CL_PLATFORM_NAME, 64, &g_szPlatformName[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
OclPrintf2( OTU("\tPlatform Name : %s\n"), &g_szPlatformName[0] );

iOk = OclGetDeviceIDs( clPlatformId[ iPlatformId ], iDeviceType, 1, &clDeviceId, RTnull );
if( iOk != CL_SUCCESS )
{
    OclPrintf2( OTU("\tDevice of selected type is Not supported: %d\n"), iOk );
    break;
}
iOk = OclGetDeviceInfo( clDeviceId, CL_DEVICE_NAME, 64, &g_szDeviceName[0], RTnull );
if( iOk != CL_SUCCESS )
    break;
RTint n = 0;
while( g_szDeviceName[n] == ' ' )
    n += 1;
OclPrintf2( OTU("\tDevice Name : %s\n"), &g_szDeviceName[n] );

clContext = OclCreateContext( RTnull, 1, &clDeviceId, RTnull, RTnull, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clContext == RTnull )
    break;

CLCommandQueueProperties clQueueProps = 0;
clQueueProps |= CL_QUEUE_PROFILING_ENABLE;

clCommandQueue = OclCreateCommandQueue( clContext, clDeviceId, clQueueProps, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clCommandQueue == RTnull )
    break;

clProgram = OclCreateProgramWithSource( clContext, 1, &szKernelFunction02I, RTnull, &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clProgram == RTnull )
    break;

iOk = OclBuildProgram( clProgram, 1, &clDeviceId, RTnull, RTnull, RTnull );
if( iOk != CL_SUCCESS )
    break;

clKernel = OclCreateKernel( clProgram, "KernelMemSetI", &iOk );
if( iOk != CL_SUCCESS )
    break;
if( clKernel == RTnull )
    break;

if( iDataSetSize == 0 )
    break;

piDataSet1 = ( CLint * )CrtMalloc( iDataSetSize * sizeof( CLint ) );
if( piDataSet1 == RTnull )
    break;

for( i = 0; i < iDataSetSize; i += 1 )
    piDataSet1[ i ] = 0;
...

I remember that error CL_OUT_OF_RESOURCES ( -5 ) was always related to an attempt to allocate the device memory that exceeds numbers for CL_DEVICE_MAX_MEM_ALLOC_SIZE, or CL_DEVICE_GLOBAL_MEM_SIZE, or CL_DEVICE_LOCAL_MEM_SIZE params.
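When chasing an error like this, it helps to log the symbolic name instead of the bare number. Below is a minimal sketch of a lookup helper in C. The numeric values are the ones defined in the Khronos `CL/cl.h` header and are only redefined here so that the snippet compiles without the OpenCL SDK. `ocl_error_name` and `ocl_error_is` are hypothetical helper names, not OpenCL API functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fallback definitions, used only when CL/cl.h has not been included. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                        0
#define CL_DEVICE_NOT_FOUND               (-1)
#define CL_DEVICE_NOT_AVAILABLE           (-2)
#define CL_COMPILER_NOT_AVAILABLE         (-3)
#define CL_MEM_OBJECT_ALLOCATION_FAILURE  (-4)
#define CL_OUT_OF_RESOURCES               (-5)
#define CL_OUT_OF_HOST_MEMORY             (-6)
#define CL_BUILD_PROGRAM_FAILURE          (-11)
#define CL_INVALID_VALUE                  (-30)
#define CL_INVALID_BUFFER_SIZE            (-61)
#endif

/* Maps a subset of OpenCL status codes to their symbolic names. */
const char *ocl_error_name(int code)
{
    switch (code) {
    case CL_SUCCESS:                       return "CL_SUCCESS";
    case CL_DEVICE_NOT_FOUND:              return "CL_DEVICE_NOT_FOUND";
    case CL_DEVICE_NOT_AVAILABLE:          return "CL_DEVICE_NOT_AVAILABLE";
    case CL_COMPILER_NOT_AVAILABLE:        return "CL_COMPILER_NOT_AVAILABLE";
    case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";
    case CL_BUILD_PROGRAM_FAILURE:         return "CL_BUILD_PROGRAM_FAILURE";
    case CL_INVALID_VALUE:                 return "CL_INVALID_VALUE";
    case CL_INVALID_BUFFER_SIZE:           return "CL_INVALID_BUFFER_SIZE";
    default:                               return "UNKNOWN_CL_ERROR";
    }
}

/* Returns 1 if the status code corresponds to the given symbolic name. */
int ocl_error_is(int code, const char *name)
{
    assert(name != NULL);
    return strcmp(ocl_error_name(code), name) == 0;
}
```

Logging `ocl_error_name( iOk )` next to each failed step makes it easier to tell a genuine allocation problem apart from a driver that refuses the client altogether.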
Posted on Reply
#59
chrcoluk
Visible Noise wrote: This says more about Passmark than Nvidia. 32 bit code was deprecated seven years ago by Nvidia, and apparently Passmark didn’t know their own code base well enough to realize they were still running 32 bit code.

Did they say when they are going to fix their software ?

Edit: OpenCL? Even more irrelevant. Is anyone updating their OpenCL drivers? OpenCL was dead a decade ago.
Did PassMark develop all the individual tools? Regardless, 32-bit is still present today and used by many developers. Personally I wish that wasn't the case and all currently developed software was 64-bit, but no, we still have 32-bit, including games.
Not to mention people still play older games and use older software.
Really it's Nvidia who thought this out poorly.
They cut it off based on hardware generation, and seemed to have not provided any warning it was going to happen.
lilhasselhoffer wrote: I think looking at the history this is funny.

Windows 95 - first full 32 bit OS.
Windows 98 - 32 bit only
Windows 2000 - 32 bit only
Windows XP - 2001 - 32 bit and 64 bit
Windows Vista - 32 bit and 64 bit
Windows 7 - 32 bit and 64 bit
Windows 10 - 2024 - 32 bit and 64 bit
Windows 11 - 64 bit only
The difference is that Windows still has the WoW64 subsystem. Nvidia needs to provide a translation layer, which you did mention at the end of your post.
Posted on Reply
#60
Visible Noise
Dr. Dro wrote: OpenCL still runs on AMD (and NV), it just hasn't been updated in forever iirc
Latest AMD certified driver

Latest Nvidia certified driver

Latest Intel certified driver
Posted on Reply
#61
R-T-B
Visible Noise wrote: Did you miss my post showing that AMD dropped support after 2015?
Then how the hell am I using it? Color me amused.

What exactly DO you think people are developing with on AMD cards? Because if there's an alternative maybe you could actually teach me something (it may be that by virtue of me being on Gentoo Linux now, OSS Mesa props it up, come to think of it).
Posted on Reply
#62
Visible Noise
chrcoluk wrote: Did PassMark develop all the individual tools?
Yes
chrcoluk wrote: They cut it off based on hardware generation, and seemed to have not provided any warning it was going to happen.
Only eight years of warning.
R-T-B wrote: Then how the hell am I using it? Color me amused.
Supported and exists are different things. What COTS application is using it today?
Posted on Reply
#63
R-T-B
igormp wrote: AMD nowadays favor their ROCm stack instead of opencl.
Oh I knew that, but doesn't it implement opencl as well? Or is that mesa?
Visible Noise wrote: What COTS application is using it today?
Fair point. My custom habits don't dictate the industry. Have a like for an excellent point. Sometimes I get lost in my own little world lol.
Posted on Reply
#64
igormp
R-T-B wrote: Oh I knew that, but doesn't it implement opencl as well? Or is that mesa?
Both do, you have the ROCm driver for opencl and also the mesa ones (plural, there are many drivers within mesa for you to choose from)
But still, they recommend using hip instead of opencl for ROCm.
Posted on Reply
#65
R-T-B
igormp wrote: Both do, you have the ROCm driver for opencl and also the mesa ones (plural, there are many drivers within mesa for you to choose from)
But still, they recommend using hip instead of opencl for ROCm.
Pretty sure just ROCm but mesa is installed of course (amdgpu needs it). Will have to double check my use flags this evening.

Tech. Just when you think you learned something cool, it's obsolete lol.
Posted on Reply
#66
Visible Noise
R-T-B wrote: Then how the hell am I using it? Color me amused.

What exactly DO you think people are developing with on AMD cards? Because if there's an alternative maybe you could actually teach me something (it may be that by virtue of me being on Gentoo Linux now, OSS Mesa props it up, come to think of it).
You use ROCm to (kinda) get OpenCL 2.0 on AMD. Note there is no OpenCL 3.0 or later support on AMD.

Posted on Reply
#67
R-T-B
Visible Noise wrote: You use ROCm to (kinda) get OpenCL 2.0 on AMD. Note there is no OpenCL 3.0 or later support on AMD.

Yeah, I am understanding that now. Apologies for the confusion and even, to a degree, the misinformation I was stating.
Posted on Reply
#68
Visible Noise
R-T-B wrote: Yeah, I am understanding that now. Apologies for the confusion and even, to a degree, the misinformation I was stating.
No worries, it’s my job to stay current on technology. I live this stuff everyday. I’ve lost count of the number of application frameworks that have come and gone in the last 40 years.
Posted on Reply
#69
R-T-B
Visible Noise wrote: No worries, it’s my job to stay current on technology. I live this stuff everyday. I’ve lost count of the number of application frameworks that have come and gone in the last 40 years.
Same but my speciality is more security than anything else these days. Thanks for keeping me sharp in my blindspots.
Posted on Reply
#70
_roman_
ScaLibBDP wrote: In OpenCL progr
You may use code tags. Same syntax as in other forums

End of the code block [/code]
Start of the code block is without /

example
fancy code easy to read

I'm not sure if that opencl language is proper used with all those /break statements.

I hardly see any additional information in posting a huge wall of code without using the code blocks. Where is the additional information? Without explanation of what is being shown?

Finding the user's or maybe AI-generated comments in such a text wall of code is not easy. Grasping why it was posted in the first place is not that easy for me, even with decent C knowledge.
Hecate91 wrote: This whataboutism isn't mature either, I'm so tired of seeing "but but other brand doesn't have the feature" because fans refuse to criticize their beloved brand no matter how shitty they're being to consumers and developers.
I think nvidia is against it to broadly support CUDA. I remember some software which can do cuda. I think there are legal issues from NVIDIA. I think I read it here in past months.

Just as a starting point.
www.tomshardware.com/pc-components/gpus/amd-asks-developer-to-take-down-open-source-zluda-dev-vows-to-rebuild-his-project

If someone knows the details better, please fill in the details.

www.tomshardware.com/pc-components/gpus/nvidia-bans-using-translation-layers-for-cuda-software-to-run-on-other-chips-new-restriction-apparently-targets-zluda-and-some-chinese-gpu-makers

Note: I could just read a few minutes ago more comments on my android tablet. Here they are on the well deserved ignore list. Just don't. Stick to the topic please

--

You may buy an Nvidia graphics card and then you may expect very old software to be supported? You can not be serious! Closed - source - binary - windows - blob
Posted on Reply
#71
ScaLibBDP
@_roman_
>>...I'm not sure if that opencl language is proper used with all those /break statements....

Do Not go personal and do Not teach experienced Software Engineers how to implement some functionality.
Posted on Reply
#73
Visible Noise
eidairaman1 wrote: Shit product
Thank you for your contribution to the discussion. It’s truly insightful and informative.
Posted on Reply