Monday, July 3rd 2023
AMD "Vega" Architecture Gets No More ROCm Updates After Release 5.6
AMD's "Vega" graphics architecture, which powers graphics cards such as the Radeon VII and Radeon PRO VII, is losing maintenance in the ROCm GPU programming software stack. The release notes of ROCm 5.6 state that the AMD Instinct MI50 accelerator, Radeon VII client graphics card, and Radeon PRO VII pro-vis graphics card, collectively referred to as "gfx906," will reach EOM (end of maintenance) starting Q3-2023, which aligns with the release of ROCm 5.7. Developer "EwoutH" on GitHub, who discovered this, remarks that gfx906 is barely 5 years old, and that the Radeon PRO VII and Instinct MI50 accelerator are still being sold. The most recent AMD product powered by "Vega" has to be the "Cezanne" desktop processor, which uses an iGPU based on the architecture. This chip was released in Q2-2021.
Source:
EwoutH (ROCm GitHub)
42 Comments on AMD "Vega" Architecture Gets No More ROCm Updates After Release 5.6
The market will not suffer one supplier.
See Tenstorrent. Nvidia may have a stranglehold, but others will come along to relieve that; Tesla didn't stick with Nvidia for long.
El Capitan and Frontier also exist.
As do other AI systems with no sign of CUDA or AMD on them.
Seriously. AI and ML are on a totally different level today. Nvidia has been building the compute ecosystem since the introduction of their first GPU, and later with CUDA and everything else. Today someone with huge resources can go with any hardware they choose. And everyone is targeting that market. CUDA is the de facto option for individuals, but I was never talking about individuals. You keep pointing out to me what someone can do with a GeForce card or a $200 board. And you still haven't commented on MosaicML's announcement, despite being in the business, as you say. Maybe you are just tech support in that business? Nothing to do with AI and ML programming?
PS We are in the same timezone. ;) You are looking at it the wrong way. I didn't say that CUDA will magically be replaced. It just seems they managed to fix a couple of things in the last few months. The main reason for getting serious about it is probably the $11 billion Nvidia announced for this quarter; another reason is this:
Lisa Su Reaffirms Commitment To Improving AMD ROCm Support, Engaging The Community - Phoronix
OpenAI is the best recent example.
This is exactly why governments and market regulators are fighting giants like Google and MS on noncompetitive practices.
It's the same strategy that is used in education. Why do you think Apple and Google deliver stuff to be used in schools? It's simple: they want that Microsoft pie, where people grow up with Windows and then land in Enterprise Windows at their day job. People get what they know because the barrier to entry is lower. Despite that, yes, we have macOS alongside Windows just like we have CUDA alongside whatever else. Ah yes, they are going to engage the community again. Lovely, but pointless.
This is the AMD vibe all over again. It's like the company works like a bad employee: the manager says to work better, the bad employee puts full focus into the next work week, and then he's back to his old ways. It echoes everywhere; RDNA2 > 3 is another such example. It could've been so much more.
If I were to be mean, I would highlight how Ms Su "reaffirms commitment" and a month later ROCm announces more hardware will go unsupported. But I won't do that. This guy/gal gets it. :rockout:
To both of you. The future isn't always a continuation of the past. Sometimes change happens. And we are not talking about individuals, again. Those who get bought by bigger corporations will have to play under different rules. If the bigger corporation that bought them says "We need a supercomputer and we need it now to use your solutions" and Nvidia replies "You will have to wait 6 months and pay X", they will have a problem. They can just wait 6 months, or check for alternatives. MosaicML did just that. They checked whether AMD can be an alternative today, because in the recent past it wasn't, or at least it was a problematic alternative. So if they go to AMD (or Intel or someone else) and that someone tells them "We can offer you 80% of the performance at 3/4 of X today, and we guarantee that the software you need is provided by us", they might go in that direction. They only need third-party verification that the software provided will not be garbage.
As for the bad employee, if the good employee next to him gets a 50% raise for being a good employee, that could be a good reason to turn into a good employee and stay focused for more than a week.
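To put rough numbers on the "80% of the performance at 3/4 of X" offer above, here is a minimal sketch. It assumes performance per unit of cost is the only thing being compared (it ignores delivery time, software quality, and power), and `perf_per_cost` is a hypothetical helper name used only for illustration.

```python
def perf_per_cost(relative_perf: float, relative_cost: float) -> float:
    """Performance delivered per unit of cost, relative to a baseline offer."""
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return relative_perf / relative_cost


# Baseline offer: 100% of the performance at price X.
baseline = perf_per_cost(1.0, 1.0)

# Alternative offer from the comment: 80% of the performance at 3/4 of X.
alternative = perf_per_cost(0.80, 0.75)

# Relative advantage of the alternative in performance per unit of cost.
advantage = alternative / baseline - 1
print(f"{advantage:.1%}")  # prints 6.7%
```

Under those assumptions the alternative is only about 6.7% better per unit of cost, so in this example the bigger draw would be availability today rather than waiting 6 months.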
Explain to me why any business would invest thousands in dedicated machines when they can get access to more capable hardware in the cloud at a fraction of the cost per month, unless they just didn't know any better.
This is also, by the way, why both Nvidia and AMD stopped making their highest-end compute-oriented hardware available to consumers. For example, the last GPU truly dedicated to compute that Nvidia sold to consumers was the Titan V. It makes no sense from a financial point of view. The hardware has diverged as well: Hopper and CDNA are clearly dedicated to compute and ML and will probably never make their way even into professional products. The gap between what you can do with consumer hardware and with those parts will widen even more with time.
As far as businesses are concerned, a 5000 dollar machine containing two RTX 4090 cards (say 3200 dollars; the rest is relatively simple, with an 8-12 core CPU and a modest amount of RAM) is worth its weight in pure gold for model training. You get your ROI quite quickly. Semantics. Are you terminally online to argue with people in forums? Grow up and have fruitful discussions like an adult. Throughout my 6 years of working in this field I have seen at least 5 nearby businesses build training machines as a first response force for their ideas, for the reasons stated above. It's a figure of speech. It refers to what businesses are currently doing, like the ones I've described above. I'm not a businessman. I'm a system integration and product engineer.
Lol, talk about being terminally online and not having fruitful discussions like an adult.
Not sure how consequential this really is, but seeing support of any kind being dropped for a still-actively-sold architecture is troubling.
ROCm is partly coming to Windows for the 7900 XTX/XT and workstation cards this year... sometime. It was originally in the ROCm 5.6 alpha but didn't make it into this release.
I am disappointed that the MI50/MI60 is getting its support dropped. I found it odd that they dropped the MI25 at ROCm 4.5 and left the other Vegas enabled.
That said, they want a unified architecture for compute, and only tensor/matrix math enabled cards will exist moving forward.
I will have to do some benchmarks comparing ResNet-50 on the MI60 vs the MI100 to show the matrix acceleration.