Intel Launches Gaudi 3 AI Accelerator: 70% Faster Training, 50% Faster Inference Compared to NVIDIA H100, Promises Better Efficiency Too

AleksandarK · Apr 9, 2024

During the Vision 2024 event, Intel announced its latest Gaudi 3 AI accelerator, promising significant improvements over its predecessor. Intel claims the Gaudi 3 offers up to 70% faster training, 50% faster inference, and 40% better efficiency than NVIDIA's H100 processors. The new AI accelerator is offered as a PCIe Gen 5 dual-slot add-in card with a 600 W TDP or as an OAM module with a 900 W TDP. The PCIe card has the same peak 1,835 TeraFLOPS of FP8 performance as the OAM module despite its 300 W lower TDP. That lower TDP will likely result in lower sustained performance, but the identical peak figure confirms that the same silicon is used, just tuned to a lower frequency. The PCIe version works as a group of four per system, while the OAM HL-325L modules can be run in an eight-accelerator configuration per server. Built on TSMC's N5 (5 nm) node, the AI accelerator features 64 Tensor Cores, delivering double the FP8 and quadruple the FP16 performance of the previous-generation Gaudi 2.

The Gaudi 3 AI chip comes with 128 GB of HBM2E with 3.7 TB/s of bandwidth and twenty-four 200 Gbps Ethernet NICs, with dual 400 Gbps NICs used for scale-out. All of that is laid out on the 10 tiles that make up the Gaudi 3 accelerator. There is 96 MB of SRAM split between two compute tiles, which acts as a low-level cache that bridges data communication between the Tensor Cores and HBM memory. Intel also announced support for the new performance-boosting standardized MXFP4 data format and is developing an AI NIC ASIC for Ultra Ethernet Consortium-compliant networking. The Gaudi 3 supports clusters of up to 8,192 cards, built from 1,024 nodes with eight accelerators each. It is on track for volume production in Q3, offering a cost-effective alternative to NVIDIA accelerators with the additional promise of a more open ecosystem. More information and a deeper dive can be found in the Gaudi 3 Whitepaper.
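
As a quick sanity check on the scale-out figures, the sketch below multiplies out the per-card numbers quoted in the article (eight accelerators per node, 1,024 nodes, 128 GB of HBM2E and 1,835 TFLOPS of peak FP8 per card). The helper name cluster_totals is made up for illustration, and the result is a theoretical peak that assumes perfect scaling, which no real cluster achieves.

```python
# Figures below are the ones quoted in the article; the helper itself is illustrative.
NODES = 1024
ACCELERATORS_PER_NODE = 8
HBM_PER_CARD_GB = 128
PEAK_FP8_TFLOPS_PER_CARD = 1835


def cluster_totals(nodes, per_node, hbm_gb, fp8_tflops):
    """Return card count, total HBM, and theoretical peak FP8 for a cluster."""
    cards = nodes * per_node
    return {
        "cards": cards,
        "hbm_tb": cards * hbm_gb / 1024,  # 1 TB taken as 1024 GB
        "peak_fp8_exaflops": cards * fp8_tflops / 1_000_000,  # 1 EFLOPS = 1e6 TFLOPS
    }


totals = cluster_totals(
    NODES, ACCELERATORS_PER_NODE, HBM_PER_CARD_GB, PEAK_FP8_TFLOPS_PER_CARD
)
print(totals)
```

The card count comes out to the 8,192 stated above; the aggregate memory and FP8 numbers are simple products and say nothing about sustained throughput.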


Space Lynx · Apr 9, 2024

Shares a production line on TSMC? lol bad move Intel. Nvidia already bought all the production time from TSMC. Vaporware.

thesmokingman · Apr 9, 2024

Pat is begging Wall Street to believe... lmao.

ScaLibBDP · Apr 9, 2024

Just to note: Intel compares the performance of its latest hardware against the already outdated NVIDIA H100 line of accelerators.

Minus Infinity · Apr 10, 2024

Hardly a single person in the AI field believes Intel can be trusted to support the hardware long-term, and then there is also the question of their software, SYCL, which no one uses. ROCm, on the other hand, is well liked. Nearly all analysts say only AMD is a competitor to Nvidia.

Scrizz · Apr 10, 2024

Minus Infinity said:
Hardly a single person in the AI field believes Intel can be trusted to support the hardware long-term, and then there is also the question of their software, SYCL, which no one uses. ROCm, on the other hand, is well liked. Nearly all analysts say only AMD is a competitor to Nvidia.

Analysts lmao

Tomorrow · Apr 10, 2024

Space Lynx said:
Shares a production line on TSMC? lol bad move Intel. Nvidia already bought all the production time from TSMC. Vaporware.

Intel will use 5nm. Nvidia Blackwell will use N4P.

stimpy88 · Apr 10, 2024

If only there was an open-source alternative to CUDA. nGreedia would be back to "for the gamers" in a heartbeat!

Space Lynx · Apr 10, 2024

Tomorrow said:
Intel will use 5nm. Nvidia Blackwell will use N4P.

won't AI still be using 5nm node though? and that node is sold out until late 2025 last I read.

Tomorrow · Apr 10, 2024

Space Lynx said:
won't AI still be using 5nm node though? and that node is sold out until late 2025 last I read.

Both nodes are confirmed, and though they are both technically "5nm class", I doubt that Intel will have capacity issues at TSMC.
I mean, it also depends on demand. I doubt that the demand for Intel's products will rise that sharply.

Actually, Intel using 5nm is a sign that they have not managed to secure more advanced nodes from TSMC. Nvidia already confirmed that they will use N4P, and I suspect AMD will use the same or similar. A top-tier AI product going into volume production in Q3 2024 is generally expected to be made on 4nm or even 3nm already, not 5nm.

Soul_ · Apr 10, 2024

Space Lynx said:
won't AI still be using 5nm node though? and that node is sold out until late 2025 last I read.

And you think Intel just started their supply chain activities? Sold out to who? I think the answer is in your question itself.

Fouquin · Apr 10, 2024

People don't understand how important those efficiency numbers are over H100. Right now it's not always a matter of how fast the core is, it's a matter of how many can be running at once. Even if Intel's chip was slower, if they offered more performance per watt than NVIDIA or AMD they would be a better buy. In the US at least we are literally hitting the limit of how many of these massive AI clusters we can have operating. A single state's power grid can only sustain maybe 100,000 H100 systems before capacity is exceeded. This creates an incredibly massive network bottleneck as all these clusters have to be sharing the load across great distances to train new models. If Intel comes in and says, "Hey, we can put 180,000 units into your maximum power budget with higher performance," that's a big win for them.
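
The power-budget argument above can be sketched numerically. The snippet is a minimal illustration with made-up wattage and performance values (none of them are measured H100 or Gaudi 3 figures), and the function name fleet_under_power_cap is hypothetical. It shows how a slower but more frugal accelerator can still deliver more total throughput once the power budget, not the unit count, is the hard limit.

```python
def fleet_under_power_cap(budget_watts, unit_watts, unit_perf):
    """Return (units that fit in the budget, combined relative throughput)."""
    units = budget_watts // unit_watts  # whole units only
    return units, units * unit_perf


# Hypothetical numbers chosen only to illustrate the trade-off.
BUDGET_W = 1_000_000

# Faster per unit, but draws more power.
fast_hungry = fleet_under_power_cap(BUDGET_W, unit_watts=1000, unit_perf=1.0)

# 10% slower per unit, but draws 20% less power.
slow_frugal = fleet_under_power_cap(BUDGET_W, unit_watts=800, unit_perf=0.9)

print("fast/hungry:", fast_hungry)
print("slow/frugal:", slow_frugal)
```

With these assumed values the frugal part fits 1,250 units against 1,000 and ends up with the higher combined throughput, which is the point being made about performance per watt.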

The other factor is availability, which we have seen referenced a few times now in regard to NVIDIA. NVIDIA has been delaying orders, withholding systems, or outright limiting purchase quantities with many of their clients that can't wait for the long lead times on getting H100 systems in hand and running. Intel and AMD are in prime position to take those clients from NVIDIA, and if Intel shows they can outright beat the H100s that these clients have likely already been trying to buy, and can offer shorter delivery windows, they become the de facto choice for those clients.

Minus Infinity · Apr 11, 2024

Scrizz said:
Analysts lmao

As in people working in the field. You butt hurt Intel isn't a player?

Scrizz · Apr 11, 2024

Minus Infinity said:
As in people working in the field. You butt hurt Intel isn't a player?

I like how you turn this into a personal attack. I know what an analyst is... something you don't based on your reply of "As in people working in the field."
I never mentioned Intel and don't have a horse in the race. I merely laughed at using analysts as a source of truth. Analysts oftentimes have a vested interest in steering people's opinions (and money) in certain directions.

Ahhzz · Apr 11, 2024

This is not the thread, the section, or the forums for personal attacks. Keep it aimed at the topic and not each other. Only warning.
