Monday, July 3rd 2023
AMD "Vega" Architecture Gets No More ROCm Updates After Release 5.6
AMD's "Vega" graphics architecture powering graphics cards such as the Radeon VII, Radeon PRO VII, sees a discontinuation of maintenance with ROCm GPU programming software stack. The release notes of ROCm 5.6 states that the AMD Instinct MI50 accelerator, Radeon VII client graphics card, and Radeon PRO VII pro-vis graphics card, collectively referred to as "gfx906," will reach EOM (end of maintenance) starting Q3-2023, which aligns with the release of ROCm 5.7. Developer "EwoutH" on GitHub, who discovered this, remarks gfx906 is barely 5 years old, with the Radeon PRO VII and Instinct MI50 accelerator currently being sold in the market. The most recent AMD product powered by "Vega" has to be the "Cezanne" desktop processor, which uses an iGPU based on the architecture. This chip was released in Q2-2021.
Source: EwoutH (ROCm GitHub)
Vega had a good run, I honestly don't see a problem with this. It's just that it should have died off long ago, instead of being continuously shoehorned into new products in the past 3 years.
And here we are. That's also why I think they are much more focused and better positioned strategically today; the whole business is chiplet-focused now, moving toward unification rather than trying weird shit left and right. It's also why I don't mind them not pushing the RT button too hard. Less might be more.
UPDATE 1-AMD's AI chips could match Nvidia's offerings, software firm says
So maybe they are fixing their software problems. Dropping the older architectures is probably meant to make their job easier, but it sends the wrong message to professionals. Long-term support should also be a priority.
I think it's a bit of a system that feeds itself. If the ROCm ecosystem were more popular, AMD would have been incentivized to make their GPUs train faster. If you want to train a model, you are better off with an RTX 3070 Ti than an RX 7900 XTX at this point.
The only thing I believe AMD's RDNA2-3 cards are decent at is inference.
In any case, AMD GPUs do find their way into supercomputers that are also meant for AI and ML. That probably means something. Also, Nvidia is having so much difficulty fulfilling orders that I believe I read about a six-month waiting list. If AMD's options can offer 80% of the performance at 80% of the price, I would expect many to turn to AMD solutions instead of waiting six months. And there is a paragraph in the above article that does seem to imply that something changed about AMD's software.
You're right about the 7030 chips though
It does still illustrate the utter disaster that is the 7000 mobile naming scheme. AMD seriously wants people to view Mendocino, Barcelo, Rembrandt, Phoenix and Dragon Range as equals in terms of technology :roll: As far as I can tell, ROCm support on APUs (even if they are "Vega") is kinda pepega, and a clear answer/proper documentation is scarce. Still, why not? I can think of plenty of people very interested in running stuff like Stable Diffusion - it doesn't mean they have the funds to drop on a high-end GPU.
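Since APU support is this murky, the quickest sanity check is often just asking the framework itself. A minimal sketch, assuming a PyTorch install (ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API, and set `torch.version.hip`); the function name is my own:

```python
def gpu_backend() -> str:
    """Report which GPU backend, if any, this PyTorch build can use."""
    try:
        import torch
    except ImportError:
        # No PyTorch at all - nothing to probe
        return "cpu (torch not installed)"
    if torch.cuda.is_available():
        # ROCm builds of PyTorch set torch.version.hip; CUDA builds leave it None
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    return "cpu"

print(gpu_backend())
```

If this prints "cpu" on a "Vega" APU even with a ROCm build installed, that's usually the unsupported-gfx-target problem the documentation is so vague about.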
I also understand your point about hobbyists getting used to CUDA and then probably doing some studies on CUDA to get jobs. But again, the market Nvidia, AMD, and everyone else is targeting isn't about "what was your hobby, what did you learn in university?". If that were the case, then EVERYTHING other than CUDA would be DOA.
Again, if it were CUDA and only CUDA, EVERYTHING else would have been DOA. Not just anything AMD, but anything Intel, anything Google, anything Amazon, anything Tenstorrent, anything Apple, anything Microsoft, anything other than CUDA. Am I right? Am I wrong?
Now I do agree that companies and universities with limited resources for probably limited projects - and by limited I mean projects that would still be huge in my eyes, or in the eyes of some other individual throwing random thoughts in a forum - will just go out and buy a few RTX 4090s. Granted. But again, I doubt Nvidia's latest amazing success is based on GeForce cards.
My point was that no pro uses Cezanne for AI work, and even if that APU had an Nvidia GPU it would still be irrelevant; it's a consumer part.
ROCm is irrelevant to consumer parts, so there's no need to mention or discuss them in this thread.
What's next, should we talk about AMD driver issues on consumer parts here too? Someone will, no doubt.
Of course you can have your own implementation from scratch (possibly even more performant than CUDA, if you're in a specific niche), but at this point, the entry barrier is quite high.
Mind you, I'm not saying CUDA is better (I haven't used it). But I know at least two guys who tried to dabble in AI/ML using AMD/OpenCL and they both said "screw it" in the end and went CUDA. One of them was doing it for a hobby, the other one for his PhD. TL;DR CUDA is everywhere and it sells hardware while AMD keeps finding ways to shoot themselves in the foot.
This economy is crazy. Startups that want to build and train models in-house will often buy 15-25 high-end GPUs and put them in racks or rigs to get their initial versions ready.
It's an exaggeration, of course. But it shows why you can sell even when there's a considerable gap between you and competition.
We have already seen that with most of the popular tools available to developers, you're better off getting two RTX 4090s in a machine than four RX 7900 XTXs, or whatever the Radeon Instinct equivalent is. The situation is extremely skewed towards NVIDIA in the ML ecosystem today. At this point, I'm pretty sure that adding two zeroes won't make AMD's ML sales reach NVIDIA's. It really is quite an astronomical difference.
I don't even want to open up on the GTCs, the courses, and the academies that exist on NVIDIA's side to enrich the CUDA-based ML world today. This is a losing game for anyone else in this industry so far, Intel and their less-than-great solutions included. The DGX servers NVIDIA offers are just the cherry on top.
In any case, MosaicML seems to be a company that has worked with Nvidia for a long time and is only now coming out with a press release saying "Hey, you know something? AMD's options can be a real alternative NOW." Maybe, being in the business, you need to update your info.
If such things weren't important to them, they wouldn't offer as many software tools or grow as large a community around accessible and affordable hardware for such clients. They would force them to buy server-grade hardware only and unlock those features there. They wouldn't sell you on Xavier NX / Orin products that you can buy for a couple hundred dollars and develop on, down to the hardware-integration level with carrier boards.
These products exist especially for startups and small businesses. Here's our little Xavier that we built a board to accommodate. Very cute.
People really stay under large rocks. Time to lift them up; you missed how NVIDIA has exponentially grown its ML outreach since 2017.
Anyway, you're looking at this the wrong way. The fact that CUDA doesn't command 100% market share is not a guarantee that ROCm is just as serviceable. Case in point: this year OpenAI announced they will buy en masse from Nvidia. Have they, or any of their competitors, announced something similar for AMD hardware? Another case in point: open up a cloud admin interface and try to create an OpenCL/ROCm instance. Everybody offers CUDA, but otoh, I can't name a provider that also offers ROCm (I'm sure there are some, I just don't recall who).