Monday, July 8th 2024
AMD is Becoming a Software Company. Here's the Plan
Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable, to share their vision for the future of the company and to get our feedback. On site were prominent members of AMD leadership, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making big changes to how it approaches technology, shifting its focus from hardware development to software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.
The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.
Fast Forward to Barcelona
We walked into the room in Barcelona expecting the usual fluff talk about how the AI PC is the next big thing, and how it's all hands on deck to capture market share—we've heard that before, from pretty much everyone at Computex Taiwan. Well, we did get a substantial talk on how AMD's new Ryzen AI 300 series "Strix Point" processors are the tip of the spear for the company's AI PC push, and how it thinks it has a winning combination of hardware to see it through; but what we didn't expect was a glimpse into how much AMD is changing internally to stay competitive in this new world, which led to a stunning disclosure.
AMD has "tripled our software engineering, and are going all-in on the software." This not only means bring in more people, but also allow people to change roles: "we moved some of our best people in the organization to support" these teams. When this transformation is completed, the company will more closely resemble contemporaries in the industry such as Intel and NVIDIA. AMD commented that in the past they were "silicon first, then we thought about SDKs, toolchains and then the ISVs (software development companies)." They continued "Our shift in strategy is to talk to ISVs first...to understand what the developers want enabled." which is a fundamental change to how new processors are created. I really like this quote: "the old AMD would just chase speeds and feeds. The new AMD is going to be AI software first, we know how to do silicon"—and I agree that this is the right path forward.
AMD of the Old: Why Hardware-first is Bad for a Hardware Company in IT
AMD's hardware-first approach to tech has met with limited market success. Despite having a CPU microarchitecture that at least matches Intel's, the company barely commands a quarter of the market (server and client processors combined); and despite its gaming GPUs being contemporary, it barely has a sixth of that market. This is not for a lack of performance—AMD makes some very powerful CPUs and GPUs that keep competitors on their toes. The number-one problem has been comparatively weak engagement with the software vendor ecosystem: the first-party technologies that let developers make the best use of the hardware's exclusive and unique capabilities, such as APIs, developer tools, resources, developer networking, and optimization support.
For example, Radeon GPUs had tessellation capabilities at least two generations ahead of NVIDIA, yet developers only exploited them after Microsoft standardized the feature in the DirectX 11 API; the same happened with Mantle and DirectX 12. In both cases, the X-factor NVIDIA enjoys is its software-first approach, the way it engages with developers, and, more importantly, its install base (over 75% of the discrete GPU market share). There have been several such examples of AMD silicon packing exotic accelerators across its hardware stack that haven't been properly exploited by the software community. The reason is usually the same—AMD has been a hardware-first company.
Why is Tesla a hotter stock than General Motors? Because General Motors is an automobile company that happens to use some technology in its vehicles, whereas Tesla is a tech company that happens to know automobile engineering. Tesla vehicles are software-defined devices that can transport you around. Tesla's approach to transportation has been to understand what consumers want or might need from a technology standpoint, and then to build the hardware to achieve it. In the end, you know Tesla for its savvy cars, much in the same way that you know NVIDIA for GPUs that "just work," and like Tesla, NVIDIA's revenues are overwhelmingly made up of hardware sales—despite it being software-first, or experience-first. Another example is Apple, which has built a huge ecosystem of software and services designed to work extremely well together, but which also locks people into its "walled garden," enabling huge profits for the company in the process.
NVIDIA's Weakness
This is not to say that AMD has neglected software—far from it: the company has played nice guy by keeping much of its software base open source, through initiatives such as GPUOpen and ROCm, which are great resources for software developers, and we definitely love the support for open source. It's just that AMD has not treated software as its main product: the thing that makes people buy its hardware and brings in the revenue. AMD is aware of this and wants "to create a unified architecture across our CPU and RDNA, which will let us simplify [the] software." This sounds similar in approach to Intel's oneAPI, which makes a lot of sense, but it will be a challenging project. NVIDIA's advantage here is that it has just one kind of accelerator—the GPU, which runs CUDA—a single API for all developers to learn, which lets them solve a huge range of computing challenges on hardware ranging from $200 to $30,000.
On the other hand, this is also a weakness of NVIDIA, and an advantage for AMD. AMD has a rich IP portfolio of compute solutions, ranging from classic CPUs and GPUs to FPGAs and XDNA AI engines (through the Xilinx acquisition); now it just needs to bring them together by exposing a unified computing interface that makes it easy to strategically shift workloads between these core types, to maximize performance, cost, efficiency, or some combination of the three. Such a capability would let the company sell customers a single-product combined accelerator system comprised of components like a CPU, GPU and specialized FPGA(s)—similar to how you buy an iPhone, not a screen, processor, 5G modem and battery to combine on your own.
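To make that idea a bit more concrete, here is a purely hypothetical sketch in Python of what a unified, workload-aware dispatch layer could look like. No such AMD API exists today; every name here (Device, best_device, run) is made up for illustration, and the routing heuristics are toy assumptions.

```python
from enum import Enum, auto

class Device(Enum):
    CPU = auto()    # general-purpose Zen cores
    GPU = auto()    # RDNA/CDNA compute units
    NPU = auto()    # XDNA AI engines / FPGA fabric

def best_device(workload: str, batch_size: int) -> Device:
    """Toy heuristic: pick the core type a unified scheduler might favour."""
    if workload == "matmul" and batch_size > 64:
        return Device.GPU          # large parallel math favours the GPU
    if workload == "inference" and batch_size == 1:
        return Device.NPU          # low-latency, low-power single requests
    return Device.CPU              # everything else stays on the CPU

def run(workload: str, batch_size: int) -> None:
    device = best_device(workload, batch_size)
    print(f"{workload} (batch {batch_size}) -> {device.name}")

# The same call site, transparently routed to different silicon.
run("matmul", 256)     # -> GPU
run("inference", 1)    # -> NPU
run("parsing", 8)      # -> CPU
```

The point of the sketch is only that the developer writes against one interface while the runtime decides which core type actually executes the work, which is what a CUDA-like single entry point across CPU, GPU and FPGA/NPU would buy AMD.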
Enabling Software Developers
If you've sat through NVIDIA's GTC sessions like we have, barely 5-10% of the showtime is spent talking about NVIDIA hardware (their latest AI GPUs or accelerators up and down the stack); most of the talk is about first-party software solutions—problems to solve, solutions, software, APIs, developer tools, collaboration tools, bare-metal system software, and only then the hardware. AMD has started its journey toward exactly this.
They are now talking to the major software companies, like Microsoft, Adobe and OpenAI, to learn what their plans are and what they need from a future hardware generation. AMD's roadmaps now show the company's plans several years into the future, so partners can see what AMD is creating and build software products that better utilize these new features.
Market Research
We got a detailed presentation from market research firm IDC, which AMD contracted to study the short- and medium-term future of AI PCs, and the notion that PCs with native AI acceleration will bring a disruptive change to computing. This has happened before, when bricks became iPhones, when networks became the Internet, and when text-based prompts gave way to GUI interfaces. To be honest, generative AI has taken on a life of its own and is playing a crucial role in mainstreaming this new tech, but the current implementation relies on cloud-based acceleration. Running everything in the cloud comes with huge power usage and requires expensive NVIDIA GPUs. Are people willing to buy a whole new device just to bring some of this acceleration on-device for privacy and latency? That remains to be seen. Even with 50 TOPS, the NPUs of AMD "Strix" and Intel "Lunar Lake" won't exactly zip through image generation, but they do make text-based LLMs viable, along with certain audiovisual effects such as Teams webcam background replacement, noise suppression, and even live translation.
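For a sense of what running such a workload on local hardware instead of the cloud looks like to a developer today, here is a minimal sketch using ONNX Runtime in Python. The model path is a placeholder, and the NPU/GPU execution-provider names are assumptions that depend on how onnxruntime was built and installed; the code checks what is actually available and falls back to the CPU.

```python
import onnxruntime as ort

# Which execution providers this build exposes (CPU, DirectML, a vendor NPU
# provider, etc.) depends entirely on how onnxruntime was installed.
available = ort.get_available_providers()
print("Available providers:", available)

# Placeholder path: substitute any ONNX model of your own.
model_path = "model.onnx"

# Provider names below are assumptions (commonly cited for Ryzen AI and
# DirectML); we only request the ones this installation actually reports.
preferred = ["VitisAIExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in preferred if p in available]

session = ort.InferenceSession(model_path, providers=providers)
print("Session is running on:", session.get_providers()[0])
```

The heavy lifting of mapping the model onto the NPU happens inside the execution provider, which is exactly the kind of first-party software plumbing the article argues AMD now has to own.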
AMD is aware of the challenges, especially after Intel (Meteor Lake) and Microsoft (Copilot) spammed us with "AI" everywhere while huge chunks of the user base fail to see a convincing argument. Privacy and security are on AMD's radar, and you need to "demonstrate that you actually increase productivity per workload. If you're asking people to spend more money [... you need to prove that] you can save hours per week....that could be worth the investment, [but] will require a massive education of the end-users." There is also a goal to give special love to "build the most innovative and disruptive form factors" for notebooks, so that people go "wow, there's something new here." Specifically in the laptop space, they are watching Qualcomm's Windows on Arm initiative very closely and want to make sure to "launch a product only when it's ready," and to also "address price-points below $1000."
Where Does AMD Begin in 2024?
What's the first stop in AMD's journey? It's to ensure that it's able to grow its market-share both on the client side with AI PCs, and on the data-center side, with its AI GPUs. For AI PCs, the company believes it has a winning product with the Ryzen AI 300 series "Strix Point" mobile processors, which it thinks are in a good position to ramp through 2024. What definitely helps is the fact that "Strix Point" is based on a relatively mature TSMC 4 nm foundry node, with which it can secure volumes; compared to Intel's "Lunar Lake" and upcoming "Arrow Lake," which are both expected to use TSMC's 3 nm foundry node. ASUS already announced a mid-July media event where it plans to launch dozens of AI PCs, all of which are powered by Ryzen AI 300 series chips, and meet Microsoft Copilot+ requirements. Over on the data-center side, AMD's MI300X accelerator is receiving spillover demand from competing NVIDIA H100 GPUs, and the company plans to continue investing in the software side of this solution, to bring in large orders from leading AI cloud-compute providers running popular AI applications.
The improvements to the software ecosystem will take some time; AMD is looking at a three to five year timeframe, and to support that, it has greatly increased its software engineering headcount, as mentioned before. They have also accelerated their hardware development: "we are going to launch a new [Radeon] Instinct product every 12 months," which is a difficult cadence to sustain, but it helps the company react more quickly to changes in the software market and its demands. On the CPU side, the company "now has two CPU teams, one does n+1 [next generation] the other n+2 [two generations ahead]," which reminds us a bit of Intel's tick-tock strategy, although that was more focused on silicon manufacturing. When asked about Moore's Law and its demise, the company commented that it is exploring "AI in chip design to go beyond place and route," and that "yesterday's war is more rasterization, more ray tracing, more bandwidth." The challenges of the coming generations are not only hardware; software support and nurturing relationships with software developers play a crucial role. AMD even thinks that the eternal tug-of-war between CPU and GPU could shift in the future: "we can't think of AI as a checkbox/gimmick feature like USB—AI could become the hero."
What's heartening though is that AMD has made the bold move of mobilizing resources toward hiring software talent over acquiring another hardware company like it usually does when its wallet is full—this will pay off in the coming years.
139 Comments on AMD is Becoming a Software Company. Here's the Plan
Both vendors will still have bugs, but no need to perpetuate the myth.
Still, then you think of marketing and you wonder why AMD's logo doesn't briefly appear when you boot up your Deck, right?
AMD should have been all over this. They're still not presenting themselves as the gaming hardware company, I don't get it at all. They have everything except Intel-based PCs on lockdown! They didn't follow up properly on the RT push - not even by saying loudly and repeatedly that it ain't ready and they're biding their time. They just don't say anything; there was this one burp many moons ago about them waiting for RT to appear in the midrange... and here we are. They're grossly behind. You'd think they'd have used that time to get up to speed, or otherwise have a marketing story for why it's not required.
None of that. Now they're becoming a software company. And they'll make lots of software, I have no doubt. But then there's selling it.
I also wonder if they'll start leveraging their Xilinx IP to provide dedicated RT/AI capabilities that can be repurposed when not in use for gaming or rendering.
You know what it reminds me of? Volkswagen 'switching to EVs'. They've produced some okay EVs but nothing special in any way, while the competition races past them left and right, both in the West and the East. They're still making ICEs, still spending tons of resources on them, still not really changing anything and thinking they can do EVs 'on the side'. It really spells a lack of commitment, and I think AMD is guilty of the same thing. It's the story of their GPU architectures' lives too. Good start, sometimes a good follow-up, and then things stall again. The only space where they keep their (renewed...) commitment is CPU.
Software-wise, it's Vulkan and DirectX 12 Ultimate for up-to-date graphics programming... and of course CUDA. Intel has oneAPI going, but who knows how long before Intel cancels it or changes methodologies?
AMD's ROCm, as bad as it is compared to NVIDIA's stack, continues to improve. Further investment into ROCm pushes AMD ahead of everyone else. NVIDIA hardware prices continue to grow far beyond the value of the hardware itself; the only thing holding AMD back is the software.
The "AI stuff" for coding isn't even that much of an effort. Port some GPU kernels to TensorFlow or whatever else is popular in the AI Field. The bulk of the infrastructure to get ROCm loaded and running is kernel device drivers, the llvm clang compiler, a whole slew of infrastructure libraries and compilers + assemblers + linkers for AMD's CDNA and RDNA GPU instruction sets.
This is a great move by AMD. If they can unify their hardware through software, this would be a major advantage over the other companies. More importantly, they need to execute on releasing products and focus on being first to market for once.
AMD is mostly a CPU company (by revenue) with a sizable GPU side project, and is only worth billions. They're absolutely the underdogs in this scenario. AMD's GPUs are the 2nd best in the world, but 2nd best is still much, much smaller than NVIDIA in the grand scheme of things.
"Multibillion" is surprisingly passe in today's economy. We're in the age of Trillions.
I haven't had AMD driver issues in ~6 years but tuning for stability has been a nightmare.
Gotta get things exactly right, which is also insane to think about.
When AMD starts to hyperfocus on the software side of things, the rest will come very quickly.
Everything we do in this computer space has us marked as users first. We need new software.
Hell, the local Ngreedia fanbois love to go after anyone who dares to state this or to say "I support FSR for the greater good of sustaining an open PC gaming platform." Please see above. This is really sad coming from a staff member who should know better. The "bad drivers" claim is a lie; please stop spreading FUD. Also, ChimeraOS is a really nice experience.
And actually, the devs of both distros are working together to make both better. I agree with you that they should do more, but I think one of the reasons why everyone loves to work with AMD is that they don't pull things like this.
But yes, they should at least show their logo somewhere.
I haven't had any major issues that made me say, "you know what, I need to switch."
You could also hold your opinion but also acknowledge the strides AMD has taken in regards to their driver quality. The two are not mutually exclusive.
I didn't say TPU hates AMD and that is exactly what you are making it seem like.
Some people way smarter than I have said that the anticheat software should be server side, not client side.
We have enough to worry about in our systems without adding a program with absolute access to them.
I do believe that AMD is late on this by many years, but they can still turn it around. I know that AMD wants to start at the high end, but it would be nice if ROCm trickled down further than just the 7900 XTX. I read higher up that tinkerers don't dabble in high-end hardware, and I agree. If AMD wants to help "open source," they need to extend this down to the lower models so others can play with the APIs for cheap.
On the topic of marketing, like others have been stating, if they can get Sony, Microsoft and Steam to splash the AMD logo on screen even for 3 seconds on boot, it would help exponentially. This could help get rid of the stigma that's been around for ages and may bring in some more sales, which in turn adds up to being able to hire more software engineers in the long run.
Yes, I know, that's completely BS on AMD's part, but some have indeed gotten ROCm to work on GPUs other than the 7900 XTX.
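For anyone curious, the community-reported trick usually boils down to one environment variable; the sketch below (Python, assuming a ROCm build of PyTorch) shows the idea. The override values are assumptions that depend on your GPU family, and this is an unsupported workaround, not an official AMD path.

```python
import os

# Community-reported workaround: spoof the GPU's ISA version so the ROCm
# runtime treats an officially unsupported card as a supported one.
# "11.0.0" is commonly used for RDNA3 parts and "10.3.0" for RDNA2; match
# the value to your own GPU family, and set it before the HIP runtime
# initializes (i.e. before importing torch).
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "11.0.0")

import torch  # noqa: E402  (import deliberately placed after the env override)

print("GPU visible to ROCm:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```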
I'm not trying to make it seem like anything, I pointed out the facts of the discussion thus far. If you think that your own words reflect badly on TPU, there's an easy solution. Condition your statements as personal ones or cool the rhetoric.
I'm hoping you take one of those options but given the malicious use of the lol emoji I'm not holding my breath.
www.reuters.com/technology/french-antitrust-regulators-preparing-nvidia-charges-sources-say-2024-07-01/
www.heise.de/news/Frankreich-will-gegen-Nvidias-KI-Dominanz-vorgehen-9790187.html
If you want to attach "nuance" and made up rules to my original comment, that's on you.
"you think staff views should always align with TPU reviews" is simply a scarecrow argument on your part, not what I said. You are trying to strip my argument of context because you know such a statement doesn't stand when taken in whole.
"I was laughing inside" reacting with an emoji is an externalization.
Now back to computers: I like how they moved toward more IPC.
Interestingly, a Northwood 2.4 I had acted more like an Athlon! 2.8 GHz performed well, despite people claiming I would need a lot more than 2.8 GHz!