AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

dragontamer5788 · Jun 26, 2024

Vya Domus said:
I really doubt it. Things like MI300 look to be much faster in general purpose compute and likely a lot cheaper. If demand for ML drops off a cliff you don't want these on your hands; it will take ages until you reach ROI.

Nvidia really doesn't treat these as anything more than ML accelerators despite them still being "GPUs" technically, they have far inferior FP64/FP16 performance compared to MI300 for example.

AMD is making good hardware, but as usual the question is if AMD's software can keep up.

For the most part, people don't want to port off CUDA for minor gains that AMD's hardware represents. The HPC / Supercomputer guys probably aren't even using ROCm for the most part, but are instead writing programs at higher-levels and relying upon a smaller team of specialists to port just elements of their kernel to ROCm one step at a time. (A structure only possible because National Labs have much more $$$ to afford specialist programmers like this).

I think AMD is making good progress. They've found that NVidia is lagging on traditional SIMD compute and have carved out a niche for themselves. But NVidia still "wins" because of the overall software package in practice.

Vya Domus · Jun 26, 2024

dragontamer5788 said:
but as usual the question is if AMD's software can keep up.

Is there any actual example of an open source piece of software that runs much faster on equivalent Nvidia hardware simply because it's using CUDA? HIP is basically a straight-up copy of CUDA. It doesn't have some features like dynamic parallelism, but there is no obvious reason that I can see for software just being intrinsically worse all of the time.

phints · Jun 26, 2024

X to doubt.

dragontamer5788 · Jun 26, 2024

Vya Domus said:
Is there any actual example of an open source piece of software that runs much faster on equivalent Nvidia hardware simply because it's using CUDA? HIP is basically a straight-up copy of CUDA. It doesn't have some features like dynamic parallelism, but there is no obvious reason that I can see for software just being intrinsically worse all of the time.

It's not about "running faster", it's about "running at all".

AMD only officially supports HIP / ROCm on their most expensive MI GPUs, which are in the $5,000+ tier. There used to be cards like Rx 580 that worked for a while, but then the latest HIP / ROCm drops support and bugs start to creep in. So now what? You either throw away the Rx 580 and upgrade to the Vega (or whatever HIP supports), only to find out that Vega64 loses support and it's time to upgrade to Rx 7800. Etc. etc.

NVidia's software support simply lasts long enough for your projects to actually work. Case in point: try to run Blender's HIP on an Rx 580 or Vega.

Then, try CUDA on an NVidia 1080 Ti. Which still works.

----------

Even MI level chips, like the MI60, lose support faster than NVidia chips.

Hakker · Jun 26, 2024

Specialized ASICs will always take over. The easiest way to look at it is Bitcoin. First it was all CPU, then GPUs came into play for great gains, until ASICs came along and absolutely shattered GPU mining. The same goes for the current A.I. rage. Full ASIC solutions will take it over. At the end of the day Nvidia's products are still general purpose solutions.

Firedrops · Jun 26, 2024

AI hardware is where battery tech was 5 years ago - every 2 weeks someone will claim a 50 bajillion percent increase coming Soon™.

Vya Domus · Jun 26, 2024

dragontamer5788 said:
It's not about "running faster", it's about "running at all".

But this isn't a matter of software being worse, it's a matter of software simply not being developed because Nvidia has a monopoly in these industries.

dragontamer5788 said:
AMD only officially supports HIP / ROCm on their most expensive MI GPUs, which are in the $5,000+ tier.

We weren't talking about consumer GPUs here.

ScaLibBDP · Jun 26, 2024

>>...ASIC outperform Hopper by 20x and Blackwell by 10x...

Over the last couple of years I have heard news like this from time to time. It is very impressive. However, experts always look at benchmarks, and in most cases companies do Not release benchmarks, since they would show the Real Performance (!) of the hardware, which could be very different from internal in-house evaluations!

Next, if you're interested in seeing more Hardware News like this, take a look at The Linley Group's YouTube channel (www.youtube.com).

The channel has 109 videos ( I watched All of them! ), it is in a frozen state ( last video was uploaded 3 years ago ), and almost every second company was making statements that "...We Made It Better Than NVIDIA!..".

I didn't follow these companies since it would be a waste of time for me but I think that most of them do Not do well. At the same time NVIDIA made $22.6 billion in data center revenue alone in the last quarter ( ended on April 28th of 2024 ), and NVIDIA makes more and more money.

dragontamer5788 · Jun 26, 2024

Vya Domus said:
But this isn't a matter of software being worse, it's a matter of software simply not being developed because Nvidia has a monopoly in these industries.

We weren't talking about consumer GPUs here.

NVidia is ahead in things like Blender rendering because NVidia has the money to pay for software devs.

As anyone knows in any computing project: it's the software that's expensive. The hardware is whatever today. Even at NVidia prices, the software is where the bulk of the costs are going.

AMD has always made fine hardware. It's just the software support that's lacking. And yes, making sure that MI60 works for more than 5 years is important.

In this thread, people are talking about how T100 or other older NVidia cards can be used instead of H100 or other more recent chips.

Do you see anyone, anywhere, ever saying the same about MI25? MI60? MI100?

I get that AMD doesn't have the resources to keep software support on all of their GPUs. But... This crap is important to the people spending $100,000,000+ on software development on GPU platforms. You can't just cut support like AMD does every few years and expect a community to grow.

Eventually, AMD will make enough money to make a stable software platform for its GPUs. ROCm is better but people are still nervous about getting burned by previous losses of software support.

Vya Domus · Jun 26, 2024

dragontamer5788 said:
And yes, making sure that MI60 works for more than 5 years is important.

The first GPUs with CUDA support were dropped pretty soon as well, and no one is using MI60s at this point in time anyway, so no, I really don't think it's important at all.

dragontamer5788 said:
NVidia is ahead in things like Blender rendering because NVidia has the money to pay for software devs.

I don't know why Blender is faster on Nvidia GPUs and neither do you or anyone else, the CUDA backend isn't open source as far as I know so we don't know why it's faster, it could be hardware related. This is the point that I am making, you keep saying the software is worse but I can't see any conclusive evidence that really is the case, software isn't being developed on the AMD side of things for obvious market share reasons but this isn't proof the software is inferior.

dragontamer5788 · Jun 26, 2024

Vya Domus said:
The first GPUs with CUDA support were dropped pretty soon as well, and no one is using MI60s at this point in time anyway, so no, I really don't think it's important at all.

Uhhhh... MI60 was released in 2018. That's two years after the P100 and the NVidia 1080 came out (2016), both of which have support for CUDA today (and even Blender rendering).

It says a lot about AMD's software support that AMD cannot support a professional level $5000+ card like MI60 as long as Nvidia can support a consumer card like the GTX 1080.

Vya Domus said:
I don't know why Blender is faster on Nvidia GPUs and neither do you or anyone else, the CUDA backend isn't open source as far as I know so we don't know why it's faster, it could be hardware related. This is the point that I am making, you keep saying the software is worse but I can't see any conclusive evidence that really is the case, software isn't being developed on the AMD side of things for obvious market share reasons but this isn't proof the software is inferior.

blender/intern/cycles/kernel/device/gpu/kernel.h at main · blender/blender · GitHub

This is the CUDA / HIP / OneAPI source code to the Blender Cycles renderer kernel. You can see that it's largely the same code between AMD, NVidia and Intel.

Note: AMD's original contributions to Blender were the OpenCL kernels, which, if you haven't noticed, had been completely thrown away by the Blender team by Blender 4.0. The NVidia CUDA code, meanwhile, has remained and serves as the basis for HIP and OneAPI today.

I dare you to pretend that I don't know much about Blender's kernels or GPU code. I'm not a professional or anything, but I did spend some time studying this code to build up my hobby GPU skills. I've been reading this code and following its development for years at this point (albeit at a hobby level, but it's seriously one of the best demonstrations of how GPU code evolves over time in a real project). The good, the bad, the lessons learned... the Blender team has experienced it and they've experienced it in public.

AMD's code and optimizations were thrown out with OpenCL. That's the problem, the code reached a dead end and couldn't be built on top of anymore. Blender's CUDA in contrast has over a decade of growth and stability.
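
To make the "largely the same code" point concrete, here is a minimal sketch of how one kernel source can target both CUDA and HIP. It is not taken from Blender: the macro GPU_DEVICE and the functions saxpy_element, saxpy_kernel and saxpy_host are hypothetical names made up for illustration. It assumes the compiler defines __CUDACC__ (nvcc) or __HIPCC__ (hipcc); on a plain C++ compiler the GPU-only parts compile away and only the host reference loop is left.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical portability macro: expands to the GPU qualifiers when built
// with nvcc or hipcc, and to nothing on an ordinary host compiler.
#if defined(__CUDACC__) || defined(__HIPCC__)
#  define GPU_DEVICE __host__ __device__
#else
#  define GPU_DEVICE
#endif

// The shared per-element math. This one function is the "same code"
// that both vendors' toolchains (and the CPU fallback) compile.
GPU_DEVICE inline float saxpy_element(float a, float x, float y) {
    return a * x + y;
}

#if defined(__CUDACC__) || defined(__HIPCC__)
// Kernel entry point. The launch-index built-ins used here are spelled
// the same way in CUDA and HIP, so no per-vendor branch is needed.
__global__ void saxpy_kernel(float a, const float* x, float* y, std::size_t n) {
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = saxpy_element(a, x[i], y[i]);
    }
}
#endif

// Host reference path: same math, plain loop, no GPU required.
inline void saxpy_host(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = saxpy_element(a, x[i], y[i]);
    }
}
```

Calling saxpy_host(2.0f, x, y) with x = {1, 2, 3} and y = {10, 20, 30} leaves y as {12, 24, 36}. The point of the sketch is only that the math lives in one place; what differs between vendors is the toolchain and the runtime around it, which is where the support question comes in.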

Vya Domus · Jun 27, 2024

dragontamer5788 said:
MI60 was released in 2018. That's two years after the P100 and the NVidia 1080 came out (2016), both of which have support for CUDA today (and even Blender rendering).

That's not the point. It's unrealistic to expect that a piece of software would continue to support hardware released so early in its lifecycle; CUDA was already a thing for like 8 years when P100 was released. Plus I am pretty sure you can still compile code that would run on an MI60 even if it's not officially supported anymore.

dragontamer5788 said:
This is the CUDA / HIP / OneAPI source code to the Blender Cycles renderer kernel. You can see that it's largely the same code between AMD, NVidia and Intel.

If this is the case this doesn't support your argument, it means the software side is fine seeing as the code is the same everywhere, it's the hardware that makes the difference after all.

KLMR · Jun 27, 2024

A render and a chart, that's what you need to "emerge".
