
Raja Koduri Previews "PetaFLOPs Scale" 4-Tile Intel Xe HP GPU

Joined
Nov 4, 2005
Messages
12,013 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400 MHz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores It's fast. Enough.
Having linear scalability is WILD, and 10.5 TFLOPS on a single tile is nothing to scoff at.

I'll rehash what I said when Xe was announced: if Intel doesn't provide a competitive product with their initial launch, they absolutely will with their third or fourth generation.


Also a broken clock is right twice a day.
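
As a quick sanity check, here's a back-of-the-envelope sketch of what that scaling looks like. The 10.5 TFLOPS single-tile figure is the one above, the 21,161 GFLOPS two-tile figure is quoted later in this thread, and the 4-tile value is just a straight extrapolation, not a published spec:

```python
# Back-of-the-envelope check of the "linear scaling" claim.
# Inputs: ~10.5 TFLOPS for one Xe HP tile (mentioned above) and the
# 21,161 GFLOPS two-tile figure quoted later in this thread.
single_tile_tflops = 10.5
two_tile_tflops = 21161 / 1000                  # GFLOPS -> TFLOPS

scaling = two_tile_tflops / single_tile_tflops
print(f"2-tile scaling factor: {scaling:.2f}x")                 # ~2.02x, near-linear

# Extrapolating to four tiles (an assumption, not a published spec):
print(f"4-tile estimate: {single_tile_tflops * 4:.1f} TFLOPS")  # ~42 TFLOPS
```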
 
Joined
Jun 3, 2010
Messages
2,540 (0.48/day)
Him leaving AMD was the best thing that has happened to them since Lisa Su and the Zen architecture.
The guy literally aimed at destroying Eric Demers' career. I happen to be a fan of his.
I hope the industry sees a comeback until the score is settled...
Intel's main advantage is EMIB vs. TSV (through-silicon via) packaging. Intel's is clearly better, though AMD has taken great strides and knows the ins and outs of the technology very well. AMD can rain on Intel's parade anytime an opportunity presents itself.
 
Joined
Sep 11, 2015
Messages
624 (0.18/day)
Him leaving AMD was the best thing that has happened to them since Lisa Su and the Zen architecture.
I never said anything about what him leaving meant for AMD.

I only said his expression in this picture says it all about how he probably feels now about this move. He left way before it was clear that Intel was going under and AMD was rising over them. Kinda strange that you have to point out something from my post that I never argued.
 
Joined
Apr 19, 2018
Messages
1,227 (0.50/day)
Processor AMD Ryzen 9 5950X
Motherboard Asus ROG Crosshair VIII Hero WiFi
Cooling Arctic Liquid Freezer II 420
Memory 32 GB G.Skill Trident Z Neo @ 3806 MHz C14
Video Card(s) MSI GeForce RTX2070
Storage Seagate FireCuda 530 1TB
Display(s) Samsung G9 49" Curved Ultrawide
Case Cooler Master Cosmos
Audio Device(s) O2 USB Headphone AMP
Power Supply Corsair HX850i
Mouse Logitech G502
Keyboard Cherry MX
Software Windows 11
I never said anything about what him leaving meant for AMD.

I only said his expression in this picture says it all about how he probably feels now about this move. He left way before it was clear that Intel was going under and AMD was rising over them. Kinda strange that you have to point out something from my post that I never argued.
I answered your opinion of him with my own opinion of him.
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
Having linear scalability is WILD, and 10.5 TFLOPS on a single tile is nothing to scoff at.

I'll rehash what I said when Xe was announced: if Intel doesn't provide a competitive product with their initial launch, they absolutely will with their third or fourth generation.

Their track record with Itanium and Xeon Phi would say otherwise.

I mean heck, Xe could arguably be the continuation of Larrabee / Xeon Phi, since it's simply Intel's next coprocessor. Granted, they're starting over from scratch on this one (or at least, starting from their Gen11 architecture), but this isn't the first time Intel has tried to enter the high-end coprocessor market.
 
Joined
Jul 24, 2009
Messages
1,002 (0.18/day)
What strikes me with Intel in all of their new developments is the lack of focus on scalability in terms of yields. Nowhere can we see a straight copy of the idea of chiplets that are as small as possible. They're still trying to make big, complicated stuff. Even these tiled GPUs are humongous. They're also differentiating everything all over the place with a myriad of product lines and tweaks... it's like they literally don't WANT to make an efficient, single product stack and derive new products from it - they just build a whole new one for every little segment. The wide variety of core configurations alone... wtf.

Looks like old ideas desperately trying to keep themselves relevant, despite ever-increasing foundry challenges. It's like they love to repeat 10 nm. Intel seems to be adamant that extreme specialization and tweaking is the way forward... but isn't that a dead end, ultimately, and probably pretty soon?

I think their decisions are dictated by their marketing department, not their development one.

It's basically looking like Kodak, which tried really hard to pi** against the wind, only to capitulate later; the train had already left the station, and Kodak left the building a bit later too...

Trying to force any market to do whatever you want is a really, really stupid idea. Much like mankind trying to do the same with nature. It never worked and never will. And it always comes back and bites the bottom of anyone who tries to do that.
 

IopaNalop

New Member
Joined
May 28, 2020
Messages
3 (0.00/day)
What is the logic of the title? PetaFLOPS = 1000 TeraFLOPS, but the 21,161 GFLOPS (21.161 TeraFLOPS) in the post is nowhere close to 1000 TeraFLOPS!!!
 
Joined
Apr 24, 2020
Messages
2,723 (1.60/day)
What is the logic of the title? PetaFLOPS = 1000 TeraFLOPS, but the 21,161 GFLOPS (21.161 TeraFLOPS) in the post is nowhere close to 1000 TeraFLOPS!!!

There are 2048 execution units, and each can perform 8xFP32 operations per clock cycle.

However, 4-bit neural networks do exist. If the GPU provides 8x 4-bit FMA instructions per FP32 unit, that's 128 4-bit tensor ops per EU per cycle. Leading to...

Mr. Koduri has mentioned that the 4-tile chip is capable of "PetaFLOPs performance", which means the GPU is going to be incredibly fast for tasks like machine learning and AI. Given that the GPU supports tensor operations, if we take 2048 compute units (EUs) each capable of performing 128 operations per cycle, and count each FMA (Fused Multiply-Add) as 2 ops, that comes to about 524,288 ops per clock of AI compute. This means the GPU needs a clock of at least 2 GHz to achieve the PetaFLOP performance target, or else more than 128 operations per EU per cycle.

Roughly 1 PetaFLOP. Except it's not a "FLOP", it's an "IOP" (integer op), and only 4-bit at that. (Unless there's some 4-bit floating-point unit that I haven't heard of before...) It's a stretch for sure, but neural nets are popular enough that it might be realistic for one or two customers out there...
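
Putting that napkin math in one place, a minimal sketch using the post's own assumptions (2048 EUs, 8 FP32 ops per EU per clock, 2 ops per FMA, 128 INT4 ops per EU per clock, a 2 GHz clock for the INT4 case); the ~1.3 GHz FP32 clock is just back-solved from the ~10.5 TFLOPS-per-tile figure and is not a confirmed spec:

```python
# Sketch of the peak-throughput arithmetic discussed above.
# All inputs are assumptions from this thread, not confirmed Xe HP specs.
eus = 2048              # execution units across the 4-tile part
fp32_per_eu = 8         # FP32 operations per EU per clock
ops_per_fma = 2         # a fused multiply-add counts as two ops

# FP32 peak: ops per clock, scaled by an assumed ~1.3 GHz clock.
fp32_per_clock = eus * fp32_per_eu * ops_per_fma             # 32,768
print(f"FP32 ops/clock: {fp32_per_clock:,}")
print(f"FP32 @ 1.3 GHz: {fp32_per_clock * 1.3e9 / 1e12:.1f} TFLOPS")  # ~42.6

# INT4 peak, per the post: 128 ops per EU per clock, doubled for FMA,
# with a 2 GHz clock to hit the "PetaFLOP" (really PetaOP) target.
int4_per_clock = eus * 128 * ops_per_fma                     # 524,288
print(f"INT4 ops/clock: {int4_per_clock:,}")
print(f"INT4 @ 2 GHz: {int4_per_clock * 2e9 / 1e15:.2f} PetaOPS")     # ~1.05
```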
 