I really doubt it; things like MI300 look to be much faster in general-purpose compute and likely a lot cheaper. If demand for ML drops off a cliff you don't want these on your hands, since it will take ages to break even on them.
Nvidia really doesn't treat these as anything more than ML accelerators, despite them technically still being "GPUs"; their FP64/FP16 performance is far inferior to MI300's, for example.
AMD is making good hardware, but as usual the question is whether AMD's software can keep up.
For the most part, people don't want to port off CUDA for the minor gains AMD's hardware offers. The HPC / supercomputer folks probably aren't even using ROCm directly for the most part; they're writing programs at a higher level and relying on a small team of specialists to port pieces of their kernels to ROCm one step at a time (see the sketch below for how small each of those steps typically is). That structure is only possible because the national labs have much more $$$ to afford specialist programmers like that.
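To give a rough idea of what one of those porting steps looks like, here's a minimal sketch of a toy saxpy kernel written against AMD's HIP runtime (the names saxpy, dx, dy, etc. are made up for illustration, not from any real codebase). The kernel body and launch syntax are identical to the CUDA version; only the runtime calls change their prefix, which is roughly what the hipify tools automate:

```cpp
#include <hip/hip_runtime.h>
#include <vector>
#include <cstdio>

// Kernel body is unchanged from a CUDA __global__ function.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // same index math as CUDA
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    float *dx = nullptr, *dy = nullptr;

    hipMalloc(&dx, bytes);                                    // cudaMalloc  -> hipMalloc
    hipMalloc(&dy, bytes);
    hipMemcpy(dx, hx.data(), bytes, hipMemcpyHostToDevice);   // cudaMemcpy  -> hipMemcpy
    hipMemcpy(dy, hy.data(), bytes, hipMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);         // same <<<>>> launch syntax
    hipDeviceSynchronize();                                   // cudaDeviceSynchronize -> hip...

    hipMemcpy(hy.data(), dy, bytes, hipMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);                             // expect 4.0

    hipFree(dx);
    hipFree(dy);
    return 0;
}
```

The mechanical rename is the easy part; the hard part the specialists handle is re-tuning block sizes, memory layout, and library calls for CDNA hardware.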
I think AMD is making good progress. They've noticed Nvidia lagging on traditional SIMD compute and have carved out a niche for themselves there. But in practice Nvidia still "wins" because of the overall software package.