Monday, February 12th 2024
AMD Develops ROCm-based Solution to Run Unmodified NVIDIA CUDA Binaries on AMD Graphics
AMD has quietly funded an effort over the past two years to enable binary compatibility for NVIDIA CUDA applications on their ROCm stack. This allows CUDA software to run on AMD Radeon GPUs without adapting the source code. The project responsible is ZLUDA, which was initially developed to provide CUDA support on Intel graphics. The developer behind ZLUDA, Andrzej Janik, was contracted by AMD in 2022 to adapt his project for use on Radeon GPUs with HIP/ROCm. He spent two years bringing functional CUDA support to AMD's platform, allowing many real-world CUDA workloads to run without modification. AMD decided not to productize this effort for unknown reasons but did open-source it once funding ended, per their agreement. Over at Phoronix, the implementation was put through a wide variety of benchmarks.
Benchmarks found that proprietary CUDA renderers and software worked on Radeon GPUs out-of-the-box with the drop-in ZLUDA library replacements. CUDA-optimized Blender 4.0 rendering now runs faster on AMD Radeon GPUs than the native ROCm/HIP port, reducing render times by around 10-20%, depending on the scene. The implementation is surprisingly robust, considering it was a single-developer project. There are some limitations, however: OptiX and inline PTX assembly are not yet fully supported. Overall, though, testing showed very promising results. In Geekbench, CUDA-optimized binaries produce up to 75% better results than the generic OpenCL runtime. With the ZLUDA libraries handling API translation, unmodified CUDA binaries can now run directly on top of ROCm and Radeon GPUs. Strangely, the ZLUDA port targets AMD ROCm 5.7, not the newest 6.x versions. Only time will tell if AMD continues investing in this approach to simplify porting of CUDA software. However, the open-sourced project now enables anyone to contribute and help improve compatibility. For a complete review, check out Phoronix tests.
Sources:
Phoronix, ZLUDA
54 Comments on AMD Develops ROCm-based Solution to Run Unmodified NVIDIA's CUDA Binaries on AMD Graphics
I mean, it's a fantastic ecosystem in terms of software - look at the malware that was resident for years in Google's app store. Not a thing like that found on Apple's.
What does PhysX have to do with CUDA?
Since Radeon cards are now capable of running CUDA and PhysX code, the GPU-Z application must show the CUDA and PhysX checkboxes checked for AMD graphics cards as well.
If the GPU-Z application does not show these checkboxes marked for AMD cards, it will be showing wrong information.
So yes, you're joking.
It would be reasonable for AMD to quietly pour some money into this guy to keep ZLUDA development going for "educational" purposes, but they certainly can't go full swing on it; it's also reasonable that they want to push their own solution.
My point stands: there's no need to push for any hardware-dependent feature when a hardware-agnostic competitor feature is available, minor differences in quality aside.
But NV keeps its features closed for a reason, and the most important question here is: are the features NV offers genuinely hardware-dependent, or is it just a marketing scheme to charge more for them? I'm sure NV could have gone the same route AMD did, but they didn't. Do you see my point now? It seems you're focusing on what you'd like things to be, not on what they are. I'm not saying I don't support what you said.
One can argue that DLSS is better, but it's not so much better that paying extra for it would be a viable option, imo.
The other thing is that all upscaling is basically a crutch to help out with performance issues, unless you're targeting 4K + Ultra graphics. It shouldn't be advertised as something extra to pay for. What I mean is that whether Nvidia artificially locked DLSS to RTX GPUs, or you really need Tensor cores to run it, doesn't change the fact that you need an RTX card for DLSS, but you don't need a Radeon for FSR.
I very much agree with your notion that one should pay for performance, and not for features that we never even asked for in the first place.
On topic: CUDA is possibly a different kettle of fish, as some people use it for work, so paying for it kind of makes sense. I, on the other hand, don't work on my GPU, so I couldn't care less.