NVIDIA Explains How CUDA Libraries Bolster Cybersecurity With AI

T0@st · 2025-02-28T17:12:22+0000

Traditional cybersecurity measures are proving insufficient for addressing emerging cyber threats such as malware, ransomware, phishing and data access attacks. Moreover, future quantum computers pose a security risk to today's data through "harvest now, decrypt later" attack strategies. Cybersecurity technology powered by NVIDIA accelerated computing and high-speed networking is transforming the way organizations protect their data, systems and operations. These advanced technologies not only enhance security but also drive operational efficiency, scalability and business growth.

Accelerated AI-Powered Cybersecurity
Modern cybersecurity relies heavily on AI for predictive analytics and automated threat mitigation. NVIDIA GPUs are essential for training and deploying AI models due to their exceptional computational power.

NVIDIA GPUs offer:

Faster AI model training: GPUs reduce the time required to train machine learning models for tasks like fraud detection or phishing prevention.
Real-time inference: AI models running on GPUs can analyze network traffic in real time to identify zero-day vulnerabilities or advanced persistent threats.
Automation at scale: Businesses can automate repetitive security tasks such as log analysis or vulnerability scanning, freeing up human resources for strategic initiatives. For example, AI-driven intrusion detection systems powered by NVIDIA GPUs can analyze billions of events per second to detect anomalies that traditional systems might miss. Learn more about NVIDIA AI cybersecurity solutions.

NVIDIA Editor's note: this is the next topic in our new CUDA Accelerated news series, which showcases the latest software libraries, NVIDIA NIM microservices and tools that help developers, software makers and enterprises use GPUs to accelerate their applications.

Real-Time Threat Detection and Response
GPUs excel at parallel processing, making them ideal for handling the massive computational demands of real-time cybersecurity tasks such as intrusion detection, malware analysis and anomaly detection. By combining them with high-performance networking software frameworks like NVIDIA DOCA and NVIDIA Morpheus, businesses can:

Detect threats faster: GPUs process large datasets in real time, enabling immediate identification of suspicious activities.
Respond proactively: High-speed networking ensures rapid communication between systems, allowing for swift containment of threats.
Minimize downtime: Faster response times reduce the impact of cyberattacks on business operations.

This capability is particularly beneficial for industries like finance and healthcare, where even a few seconds of downtime can result in significant losses or risks to public safety.

Scalability for Growing Infrastructure Cybersecurity Needs
As businesses grow and adopt more connected devices and cloud-based services, the volume of network traffic increases exponentially. Traditional CPU-based systems often struggle to keep up with these demands. GPUs and high-speed networking software provide massive scalability, capable of handling large-scale data processing effortlessly, either on premises or in the cloud.

For example, NVIDIA's cybersecurity solutions can help future-proof cybersecurity technologies and improve cost efficiency via centralized control.

Enhanced Data Security Across Distributed Environments
With remote work becoming the norm, businesses must secure sensitive data across a growing number of distributed locations. Distributed computing systems enhance the overall resilience of cybersecurity infrastructure by providing redundancy and fault tolerance, reduced downtime and data protection for continuous operation and minimum interruption, even during cyber attacks.

NVIDIA's high-speed data management and networking software paired with GPU-powered cybersecurity solutions offers consistent protection with automated updates, improved encryption and isolated threat zones. This is especially crucial for industries handling sensitive customer data, such as retail or e-commerce, where breaches can severely damage brand reputation. Learn more about NVIDIA's GPU cloud computing technologies.

Improved Regulatory Compliance
Regulatory frameworks such as GDPR, HIPAA, PCI DSS and SOC 2 require businesses to implement stringent security measures. GPU-powered cybersecurity solutions and high-speed networking software make compliance easier by ensuring data integrity, providing audit trails and reducing risk exposure.

Accelerating Post-Quantum Cryptography
Sufficiently large quantum computers can crack the Rivest-Shamir-Adleman (RSA) encryption algorithm underpinning today's data security solutions. Even though such devices have not yet been built, governing agencies around the world are recommending the use of post-quantum cryptography (PQC) algorithms to protect against attackers that might hoard sensitive data for decryption in the future.

PQC algorithms are based on mathematical operations more sophisticated than RSA, which are expected to be secure against attacks even by future quantum computers. The National Institute of Standards and Technology (NIST) has standardized a number of PQC algorithms and recommended that organizations should begin phasing out existing encryption methods by 2030—and transition entirely to PQC by 2035.

Widespread adoption of PQC requires ready access to highly performant and flexible implementations of these complex algorithms. NVIDIA cuPQC accelerates the most popular PQC algorithms, granting enterprises high throughputs of sensitive data to remain secure now and in the future.

Essentiality of Investing in Modern Cybersecurity Infrastructure
The integration of GPU-powered cybersecurity technology with high-speed networking software represents a paradigm shift in how businesses approach digital protection. By adopting these advanced solutions, businesses can stay ahead of evolving cyber threats while unlocking new opportunities for growth in an increasingly digital economy. Whether for safeguarding sensitive customer data or ensuring uninterrupted operations across global networks, investing in modern cybersecurity infrastructure is no longer optional but essential.

NVIDIA provides over 400 libraries for a variety of use cases, including building cybersecurity infrastructure. New updates continue to be added to the CUDA platform roadmap.

GPUs can't simply accelerate software written for general-purpose CPUs. Specialized algorithm software libraries, solvers and tools are needed to accelerate specific workloads, especially on computationally intensive distributed computing architectures. Strategically tighter integration between CPUs, GPUs and networking helps provide the right platform focus for future applications and business benefits.

Learn more about NVIDIA CUDA libraries and microservices for AI.

View at TechPowerUp Main Site | Source

evernessince · 2025-02-28T17:50:48+0000

I would absolutely avoid integrating CUDA libraries in anything as critical as Cyber-security after Nvidia pulled the rug on all 32-bit CUDA applications, libraries, plugins, etc. on 5000 series and later without providing a fallback. Absolutely destroys trust for companies developing long term solutions.

igormp · 2025-02-28T18:25:06+0000

evernessince said:
I would absolutely avoid integrating CUDA libraries in anything as critical as Cyber-security after Nvidia pulled the rug on all 32-bit CUDA applications, libraries, plugins, etc. on 5000 series and later without providing a fallback. Absolutely destroys trust for companies developing long term solutions.

32-bit CUDA support has been announced as deprecated since around cuda 9, over 7 years ago.
5000 series are pretty irrelevant in this regard, and Ada-based products and older should still have a long support window.

evernessince · 2025-02-28T18:53:11+0000

igormp said:
32-bit CUDA support has been announced as deprecated since around cuda 9, over 7 years ago.

The 5000 series are the first Nvidia cards unable to run 32-bit CUDA code since 32-bit support was added. The fact of the matter is they just recently removed a feature from the GPU without providing any fallback. Even Microsoft knows better than this, hence why you can't still run 32-bit windows apps despite the OS not natively supporting them since windows 10.

igormp said:
5000 series are pretty irrelevant in this regard, and Ada-based products and older should still have a long support window.

Given that the 5000 series is the first generation unable to run 32-bit CUDA code, I'd say it's 100% relevant. You seem to be under the impression that the last change to 32-bit CUDA status was years ago but as pointed out, that's not the case.

It's common good practice to provide some sort of fallback to ensure that you aren't pulling the rug out from any of your users. Aside from gaming, certain industries such as medical imaging, finance, and manufacturing use specialized 32-bit CUDA software that has been stable for years. They may not have the budget, time, or ability to switch to new software. This could put them in a crunch where they are unable to obtain hardware to replace potential failures (as limiting to older hardware only means they can only acquire from a constantly shrinking pool) or cost them a lot of capital in replacing the system unnecessarily. In some industries, it's illogical to replace if it's doing it's job as replacement can often be a massive product and introduce bugs to critical systems.

In addition, it breaks dependencies chains where you have one or more Library or Plugin that's 32-bit CUDA. Loosing PhysX support in some games is a lucky example where the only thing dropping 32-bit CUDA support did was forgo PhysX hardware acceleration but in other applications it may result in the entire application breaking or impact every feature down the dependency chain that relies on 32-bit CUDA.

unwind-protect · 2025-02-28T19:00:22+0000

Real-time inference: AI models running on GPUs can analyze network traffic in real time to identify zero-day vulnerabilities

How do you train ML models on exploits for holes you don't know about yet?

igormp · 2025-02-28T19:14:18+0000

evernessince said:
The 5000 series are the first Nvidia cards unable to run 32-bit CUDA code since 32-bit support was added. The fact of the matter is they just recently removed a feature from the GPU without providing any fallback. Even Microsoft knows better than this, hence why you can't still run 32-bit windows apps despite the OS not natively supporting them since windows 10.

That "feature" has been deprecated for quite some time, as I said already. Within CUDA 9 it was already not recommended to build 32-bit stuff.
If for some reason you have stuff that relies on CUDA 9 and 32-bit, the fact that 5000 series do not support 32-bit is actually irrelevant, because the newest GPU you can get CUDA9 stuff running is Volta. It simply won't work with Turing and newer.

And by the point your newer GPU with the newer current compute capability is not supported by your previous CUDA version, you'll likely need to rebuild it against a newer CUDA version, which also makes it a non-issue to jump into 64-bit.

evernessince said:
Given that the 5000 series is the first generation unable to run 32-bit CUDA code, I'd say it's 100% relevant. You seem to be under the impression that the last change to 32-bit CUDA status was years ago but as pointed out, that's not the case.

You seem to be forgetting about support contracts, and all the intertwining between CUDA version, driver version and your GPU's compute capability.

evernessince said:
Aside from gaming, certain industries such as medical imaging, finance, and manufacturing use specialized 32-bit CUDA software that has been stable for years.

Yes, and those will be running the appropriate GPU with the CC specific to that CUDA version. Even if it were a 64-bit software, it likely would not work on a newer GPU. That has been the case since... always.
If for some reason they were to update the GPUs, they would also need to update their software stack, period.

evernessince said:
(as limiting to older hardware only means they can only acquire from a constantly shrinking pool) or cost them a lot of capital in replacing the system unnecessarily.

Those are the only alternatives, and why support contracts are a thing.

unwind-protect said:
How do you train ML models on exploits for holes you don't know about yet?

You analyze on patterns, even a new exploit often has similar usage/packet patterns found in other, previous exploits.
You can also look for outliers, basically seeing if something deviates from the norm.

evernessince · 2025-02-28T19:55:51+0000

unwind-protect said:
How do you train ML models on exploits for holes you don't know about yet?

Inference is running an already trained AI model so in this case they are not talking about training.

Things covered by this article would be running the AI to detect known threats / exploits in addition to analyzing traffic for IDS and IPS. It's pretty much replacing existing algorithms that do that with AI.

igormp said:
That "feature" has been deprecated for quite some time, as I said already. Within CUDA 9 it was already not recommended to build 32-bit stuff.

You don't seem to be in tune with how software development works, we've already seen examples of modern software still using 32-bit CUDA code. Case in point, it's the reason the 4090 is faster than the 5090 in PassMark Direct Compute:

igormp said:
If for some reason you have stuff that relies on CUDA 9 and 32-bit, the fact that 5000 series do not support 32-bit is actually irrelevant, because the newest GPU you can get CUDA9 stuff running is Volta. It simply won't work with Turing and newer.

No, you can in fact run 32-bit code on up to the 4000 series. Have you not been following this issue at all? There are dozen of examples of 32-bit PhsyX running on the 4000 series.

igormp said:
And by the point your newer GPU with the newer current compute capability is not supported by your previous CUDA version, you'll likely need to rebuild it against a newer CUDA version, which also makes it a non-issue to jump into 64-bit.

Ok so you clearly have not developed software at a professional capacity if you think jumping from 32-bit to 64-bit is easy. Only in rare cases would it be trivial to recompile your code with minimal work for 64-bit CUDA. First off, you are making the assumption that every bit of code, every library, and every plugin that utilizes 32-bit CUDA; the devs have the source code for. That is exceedingly rare. You are also assuming that the toolchains being used also support 64 bit CUDA. Their build pipelines, testing frameworks, and deployment strategies need to accommodate 64-bit binaries.

Last but not least, the code needs to be combed through and tested for the following issues (and this is not an all-inclusive list):

Pointer size differences, Data Structure Alignment, Casting Issues, Kernel optimization changes, Register pressure (64 bit operations tend to consume more registers), Mixed precision issues (if you mix different data precisions, behavior may change in a 64-bit environment), etc.

All the fun little bugs that cause big headaches, particularly if we are talking about well aged software. Software of which may very well have been written by now retired wizards (aged developers) using a variety of programming languages or approaches whose knowledge has been lost to the ages, thus further complicating anything short of starting from scratch.

There's really no telling how many bespoke applications are out there that rely one way or another on code that has now been deprecated. It's beyond naive and harmful that you think it's a simple matter to update the code in these instances, especially considering we are talking about the medical, science and engineering fields that use CUDA.

Hence why fallbacks are typically provided for deprecated APIs, middleware, etc. That's just standard practice. Imagine if Microsoft had just told people to take a hike or to reprogram every 32-bit app when it officially dropped 32-bit support instead of what they did in including an emulation layer. They would have hemorrhaged businesses and customers.

igormp · 2025-02-28T20:27:49+0000

evernessince said:
You don't seem to be in tune with how software development works

You shouldn't be projecting this way.

evernessince said:
we've already seen examples of modern software still using 32-bit CUDA code.

Mind saying one that's NOT Physx?

evernessince said:
Case in point, it's the reason the 4090 is faster than the 5090 in PassMark Direct Compute:

That has NOTHING to do with CUDA.

evernessince said:
No, you can in fact run 32-bit code on up to the 4000 series. Have you not been following this issue at all? There are dozen of examples of 32-bit PhsyX running on the 4000 series.

Is all your complaints about Physx? I have no idea how it interacts with CUDA, but again, take a look at the support matrix for CUDA and see for yourself.
4000 series (Ada, CC 8.9) requires a minimum CUDA version of 11.0. Previous versions of CUDA won't even run.

evernessince said:
Ok so you clearly have not developed software at a professional capacity if you think jumping from 32-bit to 64-bit is easy. Only in rare cases would it be trivial to recompile your code with minimal work for 64-bit CUDA. First off, you are making the assumption that every bit of code, every library, and every plugin that utilizes 32-bit CUDA; the devs have the source code for. That is exceedingly rare. You are also assuming that the toolchains being used also support 64 bit CUDA. Their build pipelines, testing frameworks, and deployment strategies need to accommodate 64-bit binaries.

I take that you have no experience whatsoever dealing with anything-CUDA, so let me make it clear to you:
- Each µArch from Nvidia comes with a compute capability number (which you can check here).
- Each compute capability has a specific Driver and CUDA version that supports it. Older version won't run at all.
- Each CUDA version has a range of supported driver versions, and vice-versa. Newer drivers won't work with older CUDA versions outside of this range, nor will new CUDA versions work with older drivers outside of this range.

That means that, if you want to run your software with a new Nvidia product, you have to build it with a newer CUDA version that has support for it, period.
As an example, Blackwell 2.0 (RTX 5000) was only supported from CUDA 12.8 onwards, so any application built with previous version will not work at all with it, no matter if 64 or 32-bit.
The 4090 will also only work with cuda 11.8 onwards. CUDA 10 applications won't work at all with it, no matter if 32-bit or 64-bit either.
You can read more about it in the following links:

1. Why CUDA Compatibility — CUDA Compatibility

docs.nvidia.com

Which Compute Capability is supported by which CUDA versions?

What are compute capabilities supported by each of: CUDA 5.5? CUDA 6.0? CUDA 6.5?

stackoverflow.com

Going back to what I was saying: if you get a new GPU model for your older CUDA-based application, you WILL need to rebuild it with a newer CUDA version, at which point moving to 64-bit is trivial compared to actually changing CUDA versions.
If you are not able to do the above, then simply do not use a new GPU and find an older model. It has been like that since always.

evernessince said:
Last but not least, the code needs to be combed through and tested for the following issues (and this is not an all-inclusive list):

Pointer size differences, Data Structure Alignment, Casting Issues, Kernel optimization changes, Register pressure (64 bit operations tend to consume more registers), Mixed precision issues (if you mix different data precisions, behavior may change in a 64-bit environment), etc.

All the fun little bugs that cause big headaches, particularly if we are talking about well aged software. Software of which may very well have been written by now retired wizards (aged developers) using a variety of programming languages or approaches whose knowledge has been lost to the ages, thus further complicating anything short of starting from scratch.

There's really no telling how many bespoke applications are out there that rely one way or another on code that has now been deprecated. It's beyond naive and harmful that you think it's a simple matter to update the code in these instances, especially considering we are talking about the medical, science and engineering fields that use CUDA.

Congrats on saying the most generic stuff known, but still shows that you've never had any hands-on experience maintaining CUDA software, or any other evergreen project.

evernessince said:
Hence why fallbacks are typically provided for deprecated APIs, middleware, etc. That's just standard practice. Imagine if Microsoft had just told people to take a hike or to reprogram every 32-bit app when it officially dropped 32-bit support instead of what they did in including an emulation layer. They would have hemorrhaged businesses and customers.

ABIs and backwards compatibility break all the time. If you're mostly a MS developer I can see what you're getting at, but tons of other software deals with breaking changes on a daily basis.
Each python minor version introduces breaking changes. Most machine learning frameworks introduce breaking changes. Heck, glibc recently broke their API and caused lots of applications to fail, I had to submit patches all over the place to make up for that.
Even the Linux ABI breaks once in a while.

If you want new hardware, then it's reasonable to assume you'll be doing the required effort to update your code base as well.

unwind-protect · 2025-02-28T20:33:40+0000

evernessince said:
Inference is running an already trained AI model so in this case they are not talking about training.

Things covered by this article would be running the AI to detect known threats / exploits in addition to analyzing traffic for IDS and IPS. It's pretty much replacing existing algorithms that do that with AI.

I *am* talking about training. I am asking how NVidia trains for as-of-now unknown exploits.

igormp · 2025-02-28T20:37:19+0000

unwind-protect said:
I *am* talking about training. I am asking how NVidia trains for as-of-now unknown exploits.

Nvidia doesn't train anything specific for that, the whole article is how they can help others train their models to do so.
Anyhow, the keyword you're looking for is "anomaly detection". Many research papers in this area.

evernessince · 2025-02-28T21:56:29+0000

igormp said:
take a look at the support matrix for CUDA and see for yourself.
4000 series (Ada, CC 8.9) requires a minimum CUDA version of 11.0. Previous versions of CUDA won't even run.

Previous versions of the Nvidia toolkit and other Nvidia provided tools won't run but 32-bit CUDA code will: https://nvidia.custhelp.com/app/answers/detail/a_id/5615/~/support-plan-for-32-bit-cuda

You are conflating running the development toolkit / other tools provided by Nvidia with the ability to run 32-bit CUDA code. While 32-bit CUDA was deprecated in CUDA 9, it doesn't mean that 32-bit applications or libraries are automatically incompatible with modern GPUs. Many of these older applications can still run in a 64-bit environment, as the 64-bit CUDA runtime supports backward compatibility for running 32-bit code.

igormp said:
I take that you have no experience whatsoever dealing with anything-CUDA, so let me make it clear to you:
- Each µArch from Nvidia comes with a compute capability number (which you can check here).
- Each compute capability has a specific Driver and CUDA version that supports it. Older version won't run at all.
- Each CUDA version has a range of supported driver versions, and vice-versa. Newer drivers won't work with older CUDA versions outside of this range, nor will new CUDA versions work with older drivers outside of this range.

That means that, if you want to run your software with a new Nvidia product, you have to build it with a newer CUDA version that has support for it, period.
As an example, Blackwell 2.0 (RTX 5000) was only supported from CUDA 12.8 onwards, so any application built with previous version will not work at all with it, no matter if 64 or 32-bit.
The 4090 will also only work with cuda 11.8 onwards. CUDA 10 applications won't work at all with it, no matter if 32-bit or 64-bit either.
You can read more about it in the following links:

1. Why CUDA Compatibility — CUDA Compatibility

docs.nvidia.com

Which Compute Capability is supported by which CUDA versions?

What are compute capabilities supported by each of: CUDA 5.5? CUDA 6.0? CUDA 6.5?

stackoverflow.com

Read above, again you are confusing the ability to run 32-bit application vs the ability to run the tools provided by Nvidia.

igormp said:
Going back to what I was saying: if you get a new GPU model for your older CUDA-based application, you WILL need to rebuild it with a newer CUDA version, at which point moving to 64-bit is trivial compared to actually changing CUDA versions.
If you are not able to do the above, then simply do not use a new GPU and find an older model. It has been like that since always.

I've already pointed out the issues with this but you are incapable or intentionally not reading.

igormp said:
ABIs and backwards compatibility break all the time. If you're mostly a MS developer I can see what you're getting at, but tons of other software deals with breaking changes on a daily basis.
Each python minor version introduces breaking changes. Most machine learning frameworks introduce breaking changes. Heck, glibc recently broke their API and caused lots of applications to fail, I had to submit patches all over the place to make up for that.
Even the Linux ABI breaks once in a while.

Machine Learning is the worst example to try to prove your point. It's a cluster and frequently causes issues and errors.

The AI market is rapidly evolving, so it's a necessary evil to enable that but it's absolutely not the way you'd want all software backwards compatability to work, especially when you are talking about software that has to remain in service for a long time or that is critical.

You seem to be arguing from a very specific niche where that isn't a problem but for the vast majority of the market compatibility is extraordinarily important.

igormp said:
You shouldn't be projecting this way.

Mind saying one that's NOT Physx?

That has NOTHING to do with CUDA.

I just provided you one in the last comment, the PassMark Devs put out a statement pointing out that in fact the reason 5090 gets a lower score is because a portion of the test uses 32-bit CUDA on Nvidia cards: "We found out a few hours ago that nvidia removed OpenCL 32bit support. Seems it depended on CUDA 32bit"

https://www.reddit.com/r/hardware/comments/1iz8v8p
Circles right back around to my earlier point that removing the ability for cards to run 32-bit CUDA code breaks dependency chains, among other potential issues. I've no idea why you are clinging so tightly to the idea that 32-bit code cannot be run on the 3000 and 4000 series. Both Nvidia and Passmark prove that it can in addition to every game that integrates 32-bit PhysX. It was only until the recently released 5000 series release, which dropped it, did they have to make changes.

igormp said:
Is all your complaints about Physx? I have no idea how it interacts with CUDA, but again,

Only about 5% of the content of my comments even talks about it.

igormp · 2025-02-28T22:58:40+0000

evernessince said:
Previous versions of the Nvidia toolkit and other Nvidia provided tools won't run but 32-bit CUDA code will: https://nvidia.custhelp.com/app/answers/detail/a_id/5615/~/support-plan-for-32-bit-cuda

I'm talking about CUDA versions. The 32-bit idea becomes moot when the entire version is not supported.

evernessince said:
You are conflating running the development toolkit / other tools provided by Nvidia with the ability to run 32-bit CUDA code.

And you seem to be totally ignoring what I'm saying about the runtime versions. The CUDA runtime version has direct relationship with the toolkit version.

evernessince said:
While 32-bit CUDA was deprecated in CUDA 9, it doesn't mean that 32-bit applications or libraries are automatically incompatible with modern GPUs.

32-bit per se, no. But it means that no software since CUDA 9 should be built with 32-bit. Developers doing so should be aware they were using something that would soon not work.
And, as I said before, a CUDA 9 application will not run on something like a 4090, no matter of 32 or 64-bit, because that's how it works. So yes, they are incompatible with modern GPUs.

evernessince said:
Many of these older applications can still run in a 64-bit environment, as the 64-bit CUDA runtime supports backward compatibility for running 32-bit code.

On windows*. On linux it has been deprecated for quite some time as well.
And those still require the correct CUDA runtime version, as I've said many times already.

evernessince said:
Read above, again you are confusing the ability to run 32-bit application vs the ability to run the tools provided by Nvidia.

It's not ability to run the tools, the runtime is tied to the tooling version.
If I build something with CUDA 12, it won't run at all with a 780ti.
If something was built with CUDA 10, it won't run at all with a 4090.

evernessince said:
I've already pointed out the issues with this but you are incapable or intentionally not reading.

I could say the same back at you.

evernessince said:
Machine Learning is the worst example to try to prove your point. It's a cluster and frequently causes issues and errors.

I totally agree with that, but that was not the only example I gave.
Or are you going to say that something like linux's ABI or glibc are also bad examples?

evernessince said:
You seem to be arguing from a very specific niche where that isn't a problem but for the vast majority of the market compatibility is extraordinarily important.

Yes, and for those you also keep the same hardware for as long as possible.

evernessince said:
I just provided you one in the last comment, the PassMark Devs put out a statement pointing out that in fact the reason 5090 gets a lower score is because a portion of the test uses 32-bit CUDA on Nvidia cards: "We found out a few hours ago that nvidia removed OpenCL 32bit support. Seems it depended on CUDA 32bit"

I wasn't aware of that OpenCL issue on top of 32-bit, that does bring bigger issues, thanks for the link.

evernessince said:
I've no idea why you are clinging so tightly to the idea that 32-bit code cannot be run on the 3000 and 4000 series.

I have never said that. I'm trying to make it clear to you that 32-bit code is just a minor detail within the overall CUDA ecosystem.
Your GPU arch is tied to some specific CUDA versions, and those break way more often than one having to worry about 32 vs 64-bit.

evernessince said:
Only about 5% of the content of my comments even talks about it.

Your entire talk seemed to start-off from the Physx issue, but now you showed that passmark was a concrete example.
However, this brings another issue that OpenCL was never properly supported for Nvidia devices, but that's an entirely different issue.

Still, would you be able to list any other CUDA (actual CUDA, not something like OpenCL or VK on top of CUDA) 32-bit application that's still in use? I would be really curious to see any modern-ish application still targeting 32-bit CUDA, given it has been deprecated for almost 10 years now.

Nonetheless, my point still stands, new nvidia hardware is not backwards compatible CUDA-wise. For each new generation of products, you NEED to have the updated CUDA runtimes and the applications have to support it as well. To your initial point:

evernessince said:
I would absolutely avoid integrating CUDA libraries in anything as critical as Cyber-security after Nvidia pulled the rug on all 32-bit CUDA applications, libraries, plugins, etc. on 5000 series and later without providing a fallback. Absolutely destroys trust for companies developing long term solutions.

There's no fallback nor there has ever been when it comes to CUDA. As a simple example, most tools had to be updated with CUDA 12.8 to support blackwell:

NVIDIA GeForce RTX 5070 Ti Content Creation Review

NVIDIA's GeForce RTX 5070 Ti may be the budget king of the 50-series or may end up severeley over-priced.. How does it perform in content creation applications?

www.pugetsystems.com

In terms of applications, the new NVIDIA Blackwell cards have some lingering compatibility issues at present as we await developers’ integration of the new CUDA 12.8 and TensorRT 10.8 toolkits. As a result, the RTX 50-series of graphics cards is not supported in Redshift (Cinebench) or Octanebench—though the latest version of Octane renderer does support them—and has performance issues in V-Ray CUDA rendering.

So the fact that 32-bit support has been dropped is moot. Your CUDA application would need to be rebuilt against the newest version to support the newest product.

Processor	Ryzen 7800X3D
Motherboard	ASRock X670E Taichi
Cooling	Noctua NH-D15 Chromax
Memory	32GB DDR5 6000 CL30
Video Card(s)	MSI RTX 4090 Trio
Storage	P5800X 1.6TB 4x 15.36TB Micron 9300 Pro 4x WD Black 8TB M.2
Display(s)	Acer Predator XB3 27" 240 Hz
Case	Thermaltake Core X9
Audio Device(s)	JDS Element IV, DCA Aeon II
Power Supply	Seasonic Prime Titanium 850w
Mouse	PMM P-305
Keyboard	Wooting HE60
VR HMD	Valve Index
Software	Win 10

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

Processor	Ryzen 7800X3D
Motherboard	ASRock X670E Taichi
Cooling	Noctua NH-D15 Chromax
Memory	32GB DDR5 6000 CL30
Video Card(s)	MSI RTX 4090 Trio
Storage	P5800X 1.6TB 4x 15.36TB Micron 9300 Pro 4x WD Black 8TB M.2
Display(s)	Acer Predator XB3 27" 240 Hz
Case	Thermaltake Core X9
Audio Device(s)	JDS Element IV, DCA Aeon II
Power Supply	Seasonic Prime Titanium 850w
Mouse	PMM P-305
Keyboard	Wooting HE60
VR HMD	Valve Index
Software	Win 10

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

Processor	Ryzen 7800X3D
Motherboard	ASRock X670E Taichi
Cooling	Noctua NH-D15 Chromax
Memory	32GB DDR5 6000 CL30
Video Card(s)	MSI RTX 4090 Trio
Storage	P5800X 1.6TB 4x 15.36TB Micron 9300 Pro 4x WD Black 8TB M.2
Display(s)	Acer Predator XB3 27" 240 Hz
Case	Thermaltake Core X9
Audio Device(s)	JDS Element IV, DCA Aeon II
Power Supply	Seasonic Prime Titanium 850w
Mouse	PMM P-305
Keyboard	Wooting HE60
VR HMD	Valve Index
Software	Win 10

NVIDIA Explains How CUDA Libraries Bolster Cybersecurity With AI

News Editor