Intel Claims "Ponte Vecchio" Will Trade Blows with NVIDIA Hopper in Most Compute Workloads

btarunr · Aug 24, 2022

With NVIDIA and AMD launching their next-generation HPC compute architectures, "Hopper" and CDNA2, it began to seem as though Intel's ambitious "Ponte Vecchio" accelerator, based on the Xe-HPC architecture, had missed the time-to-market bus. Intel doesn't think so, and in its Hot Chips 34 presentation, it disclosed some of the first detailed performance claims that, at least on paper, challenge the "Hopper" H100 accelerator's published compute performance numbers. This spring, at Intel's ISC'22 presentation, we already got some idea of how Ponte Vecchio would perform, but the company hadn't yet finalized the product's power and thermal characteristics, which are determined by its clock speed and boosting behavior. Team blue claims to have gotten over the final development hurdles, and is ready with some big numbers.

Intel claims that in classic FP32 (single-precision) and FP64 (double-precision) floating-point tests, its silicon is highly competitive with the H100 "Hopper." The company claims 52 TFLOP/s FP32 for "Ponte Vecchio," compared to 60 TFLOP/s for the H100, and a significantly higher 52 TFLOP/s FP64, compared to 30 TFLOP/s for the H100. This has to do with the SIMD units of the Xe-HPC architecture all being natively capable of double-precision floating-point operations, whereas NVIDIA's architecture typically relies on dedicated FP64 units within its streaming multiprocessors.

Where Intel claims dominance over NVIDIA is with XMX-Float, an architecture-specific, XMX-accelerated workload, where it scores 419 TFLOP/s. This test doesn't work on "Hopper," as it lacks the specialized hardware. Performance in XMX-accelerated half-precision tests such as Bfloat16 (BF16) and FP16 is sub-par, with Intel claiming 839 TFLOP/s, compared to 2 PFLOP/s for the NVIDIA chip. With 8-bit operations such as INT8, even with XMX acceleration, "Ponte Vecchio" scores 1.678 PFLOP/s, compared to 4 PFLOP/s for the NVIDIA chip.
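
To put the claimed figures side by side, here is a minimal Python sketch that computes "Ponte Vecchio" throughput as a fraction of the H100's for each number format. It uses only the vendor claims quoted above, converted to TFLOP/s; the dictionary layout and the function name `ratio_vs_h100` are our own illustrative choices, not anything published by Intel or NVIDIA. XMX-Float is left out because no H100 figure is given for it.

```python
# Peak throughput claims quoted in the article, in TFLOP/s.
# These are vendor claims, not measurements. The data layout and
# function name below are hypothetical, chosen only for illustration.
CLAIMS_TFLOPS = {
    "FP32": {"ponte_vecchio": 52.0, "h100": 60.0},
    "FP64": {"ponte_vecchio": 52.0, "h100": 30.0},
    "BF16/FP16": {"ponte_vecchio": 839.0, "h100": 2000.0},  # 2 PFLOP/s
    "INT8": {"ponte_vecchio": 1678.0, "h100": 4000.0},      # 1.678 vs 4 PFLOP/s
}


def ratio_vs_h100(claims):
    """Return the Ponte Vecchio claim divided by the H100 claim, per format."""
    return {
        fmt: values["ponte_vecchio"] / values["h100"]
        for fmt, values in claims.items()
    }


if __name__ == "__main__":
    for fmt, ratio in ratio_vs_h100(CLAIMS_TFLOPS).items():
        print(f"{fmt:>10}: {ratio:.2f}x of H100")
```

On these claims, "Ponte Vecchio" lands at roughly 0.87x the H100 in FP32, 1.73x in FP64, and about 0.42x in both the half-precision and INT8 cases.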

Whether Intel has "missed the bus" for this generation in the HPC accelerator market will now boil down to pricing and availability. If Intel can manage good volumes, leverage its oneAPI developer ecosystem, score design wins with major HPC projects and cloud-compute providers, and most importantly, beat "Hopper" in price-performance and energy efficiency, then it could remain relevant in this generation, and continue investments into the next.


Daven · Aug 24, 2022

CDNA2 is around 48/48 FP32/FP64. All three solutions are competitive under this performance metric, but AMD has been shipping its solution since last December, Nvidia only recently, and Intel’s solution, to no one’s surprise, is late and MIA.

edit: oh and TPU, the FP32 and FP64 numbers are flipped in the table for Hopper but correct in the text.

Tomorrow · Aug 24, 2022

Well, unlike Ponte Vecchio, Hopper may actually release in the near future. I'm sure Intel can quote whatever numbers they get in their labs, but without the volume it may as well not exist. I mean seriously. AMD went with two chiplets in CPUs and only now will introduce 6 memory controller chiplets. All made on the same or nearly the same node.

But Intel out of the blue decides: Hey! Let's make a 47 chiplet GPU on 3 different nodes and fabs. No wonder it still has not launched and is facing delays.

Crackong · Aug 24, 2022

Vapourware is still vapourware ?

ncrs · Aug 24, 2022

It all looks great on paper, but this assumes that software is "XMX-acceleration enabled", which is asking a lot.
The most important part of the NVIDIA CUDA ecosystem is their software support. Ask AMD how it's going with trying to break into the GPGPU market - if you follow ROCm developments, you know.

While Intel has released a CUDA translation layer for oneAPI it's going to be a bumpy road with NVIDIA having the advantage.

pavle · Aug 24, 2022

Ponte Vecchio is old even before it came out.

Jimmy_ · Aug 24, 2022

ok! Great numbers... but the only question is - WHEN?

Tropick · Aug 24, 2022

"Whether Intel has "missed the bus" for this generation in the HPC accelerator market will now boil down to pricing and availability."

~~Whether~~ Intel has "missed the bus" for this generation in the HPC accelerator market ~~will now boil down to pricing and availability~~.

I'd say releasing your hardware over a year late puts you pretty firmly in "missing the bus" territory, Intel. Put up or shut up.
