Monday, February 3rd 2020

NVIDIA's Next-Generation "Ampere" GPUs Could Have 18 TeraFLOPs of Compute Performance

Feb 3rd, 2020 00:35 Discuss (172 Comments)

NVIDIA will soon launch its next-generation lineup of graphics cards based on the new and improved "Ampere" architecture. With the first Ampere-based Tesla server cards going into Indiana University's Big Red 200 supercomputer, we now have some potential specifications and information about its compute performance. Thanks to Twitter user dylan522p (@dylan522p), who did some math on the potential compute performance of the Ampere GPUs based on The Next Platform's report, we learned that Ampere could feature up to 18 TeraFLOPs of FP64 compute performance.

Big Red 200 is based on Cray's Shasta supercomputer building block and is being deployed in two phases. The first phase consists of 672 dual-socket nodes powered by AMD's EPYC 7742 "Rome" processors, which provide 3.15 PetaFLOPs of combined FP64 performance. With a total of 8 PetaFLOPs planned for Big Red 200, that leaves just under 5 PetaFLOPs to come from the GPU-accelerated nodes of the second phase. The calculation considers a node configuration of one next-generation AMD "Milan" 64-core CPU with four of NVIDIA's "Ampere" GPUs alongside it. If we assume that Milan boosts FP64 performance by 25% compared to Rome, then the math shows that each of the 256 GPUs delivered in the second phase of the Big Red 200 deployment will feature up to 18 TeraFLOPs of FP64 compute performance. Even if "Milan" doubles the FP64 compute power of "Rome", that still leaves around 17.6 TeraFLOPs of FP64 performance per GPU.
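
The arithmetic behind this estimate can be sketched in a few lines of Python. This is a rough reconstruction that uses only the figures quoted above; the function name `gpu_fp64_tflops` and the assumption that the 256 GPUs sit in 64 nodes with one Milan CPU each are ours, not taken from the original tweet. It lands close to 18 TeraFLOPs for the 25% case and slightly above the 17.6 TeraFLOPs quoted for the doubling case, which suggests the original calculation used slightly different inputs.

```python
# Back-of-the-envelope estimate of per-GPU FP64 throughput for Big Red 200.
# All inputs are figures quoted in the article; the node layout
# (4 GPUs + 1 CPU per node) is an assumption drawn from the same paragraph.

TOTAL_PFLOPS = 8.0        # planned total for Big Red 200
PHASE1_PFLOPS = 3.15      # combined FP64 of the phase-one CPU nodes
PHASE1_CPUS = 672 * 2     # 672 dual-socket EPYC 7742 nodes
PHASE2_GPUS = 256         # GPUs delivered in phase two
GPUS_PER_NODE = 4         # assumed: one Milan CPU per four GPUs


def gpu_fp64_tflops(milan_speedup: float) -> float:
    """Estimate FP64 TFLOPs per GPU for a given Milan-vs-Rome speedup."""
    rome_tflops = PHASE1_PFLOPS * 1000 / PHASE1_CPUS      # per Rome CPU
    milan_cpus = PHASE2_GPUS // GPUS_PER_NODE             # one CPU per node
    cpu_share = milan_cpus * rome_tflops * milan_speedup  # phase-two CPU TFLOPs
    phase2_total = (TOTAL_PFLOPS - PHASE1_PFLOPS) * 1000  # TFLOPs left over
    return (phase2_total - cpu_share) / PHASE2_GPUS


if __name__ == "__main__":
    for speedup in (1.25, 2.0):
        print(f"Milan at {speedup:.2f}x Rome: "
              f"{gpu_fp64_tflops(speedup):.1f} TFLOPs per GPU")
```

Changing `milan_speedup` shows how little the CPU assumption matters: the 64 CPUs contribute only a few hundred TeraFLOPs out of nearly 5 PetaFLOPs, so the per-GPU figure barely moves.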

Sources: @dylan522p (Twitter), The Next Platform


172 Comments on NVIDIA's Next-Generation "Ampere" GPUs Could Have 18 TeraFLOPs of Compute Performance

#26

Legacy-ZA

And it can be yours for a mere $10 000, pre-order now.

#27

gamefoo21

jabbadap: Nah big navi does not compete with this. It's Vega based Arcturus.

At least there's someone else who understands this.

Volta is really quite different than the Turing cores.

Vega V20 fights directly with Volta and provides a lot of competition.

Navi is the compute crippled GeForce/Quadro fighter.

This is the big reason that the APUs use Vega based GFX because for office stuff OpenCL performance is king. V20 also has the same degenerated rendering hardware as Navi. So it can do GFX like Navi but crushes it in compute tasks.

A simple comparison...

Xbox One/S and the PS4 variants use Polaris GPUs and they are in the Fury/Navi line.

The Xbox One X uses a GPU based off of the 290X/Vega branch.

Radeon VII isn't much faster in gaming than a 5700XT, but it's 5.5 times faster in compute. The 5700XT has basically the same compute performance as a 2080Ti.

Edit: I really hope Arcturus brings it, because the sooner CUDA dies out the better. Locking so much important research to a closed system isn't good. OpenCL ftw!

#28

cucker tarlson

gamefoo21: Radeon VII isn't much faster in gaming than a 5700XT, but it's 7 times faster in compute. The 5700XT has basically the same compute performance as a 2080Ti.

Radeon 7=7x 2080Ti
makes sense
at least there's someone else who understands this.

#29

gamefoo21

cucker tarlson: 7 times?

Sorry 6ish times...

Radeon 5700XT
FP64 (double) performance
609.6 GFLOPS (1:16)

Radeon VII
FP64 (double) performance
3.360 TFLOPS (1:4)

GeForce 2080Ti
FP64 (double) performance
420.2 GFLOPS (1:32)

Quadro RTX 5000 PC Jesus Edition
FP64 (double) performance
348.5 GFLOPS (1:32)

8x 2080ti
5.5x 5700XT
9.6x RTX 5000
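
For anyone who wants to check those multipliers, here is a minimal Python sketch that divides the Radeon VII figure by each of the others. The numbers are the ones listed in this post; the dictionary and function names are purely illustrative.

```python
# FP64 throughput in GFLOPS, as listed in the post above.
FP64_GFLOPS = {
    "Radeon 5700XT": 609.6,
    "Radeon VII": 3360.0,
    "GeForce 2080Ti": 420.2,
    "Quadro RTX 5000": 348.5,
}


def speedup_over(baseline: str, card: str = "Radeon VII") -> float:
    """Return how many times faster `card` is than `baseline` in FP64."""
    return FP64_GFLOPS[card] / FP64_GFLOPS[baseline]


if __name__ == "__main__":
    for name in FP64_GFLOPS:
        if name != "Radeon VII":
            print(f"Radeon VII vs {name}: {speedup_over(name):.1f}x")
```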

The big-boy cards that Ampere is actually fighting: Volta and V20

Instinct MI60
FP64 (double) performance
7.373 TFLOPS (1:2)

Quadro GV100
FP64 (double) performance
7.066 TFLOPS (1:2)

Quadro GV100S
FP64 (double) performance
8.177 TFLOPS (1:2)

Corrected

#30

MxPhenom 216

ASIC Engineer

dj-electric: This new architecture might be incredibly fast, but i can't even begin to imagine the pricing on products...

Rumors are pricing will stay the same as Turing. Could be less due to 7nm silicon cost savings.

#31

efikkan

My advice is to consume these rumors with generous amounts of NaCl.

ratirt: I'm sure about High-end NAVI (RDNA) not coming but RDNA2 will hit the market this year. Lisa Su has already confirmed that the RDNA2 high-end graphics release in 2020 and CES will have a keynote about it and this will also have Ray tracing support. If this RDNA2 will compete with NV graphics is unknown but also we don't know much about new NV release.

Lisa Su also said that the high-end was important to them, funny considering AMD haven't been participating there for years. It makes me wonder what she means by "big Navi" and "high-end", it could simply mean anything bigger and better than RX 5700 XT.

ratirt: I read that the high-end RDNA2 will flood markets with outstanding 4k performance. If that is going to happen only time will tell.

Outstanding 4K performance by most people's standard would mean performance way beyond RTX 2080 Ti, but in marketing terms it could easily mean something comparable to RTX 2080/2080 Super at a lower price than current pricing.

I haven't seen any AMD cards "flood the market" in the mid-range or high-end since the 200/300 series. Even "big hits" like RX 480/580 were outsold 8-10x by GTX 1060, etc.

#32

Steevo

ppn: EUV will be limited to 429 mm2. AMD card will draw twice the power. So that 429mm2 draws 429 watts, and in case of nvidia 215 watts for the same performance. So there you have it.

I'm not sure what I just read. But damn, is it wrong.

#33

Super XP

cucker tarlson: Radeon 7=7x 2080Ti
makes sense
at least there's someone else who understands this.

I can't see 7x over the RX 5700XT.
Also a side note, in "Double-Precision Workloads" the Radeon VII is the fastest GPU on the planet.

#34

gamefoo21

Super XP: I can't see 7x over the RX 5700XT.
Also a side note, in "Double-Precision Workloads" the Radeon VII is the fastest GPU on the planet.

I corrected it.

5.5x faster than the 5700XT
8x faster than the 2080ti

Turing isn't in the same game as Volta. Ampere will continue the tradition of being beastly compute oriented.

#35

Super XP

ppn: EUV will be limited to 429 mm2. AMD card will draw twice the power. So that 429mm2 draws 429 watts, and in case of nvidia 215 watts for the same performance. So there you have it.

That's nonsense, double the power draw? Lol it's interesting how people get RDNA 1 and RDNA 2 confused, then come to a conclusion based on no design enhancements and just doubling the power draw based on today's Navi.

#36

cucker tarlson

Super XP: Also a side note, in "Double-Precision Workloads" the Radeon VII is the fastest GPU on the planet.

not even close

#37

ratirt

efikkan: My advice is to consume these rumors with generous amounts of NaCl.

Lisa Su also said that the high-end was important to them, funny considering AMD haven't been participating there for years. It makes me wonder what she means by "big Navi" and "high-end", it could simply mean anything bigger and better than RX 5700 XT.

Outstanding 4K performance by most people's standard would mean performance way beyond RTX 2080 Ti, but in marketing terms it could easily mean something comparable to RTX 2080/2080 Super at a lower price than current pricing.

I haven't seen any AMD cards "flood the market" in the mid-range or high-end since the 200/300 series. Even "big hits" like RX 480/580 were outsold 8-10x by GTX 1060, etc.

It could mean that: bigger than the 5700XT. It could mean something else; time will tell.

Outstanding is what I get from the article and what's been said. If you think it will be better than 2080Ti it's fine with me. I'm not going to make this assumption.
I will leave it as outstanding. When somebody compares the Ryzen launch to what the new RDNA2 will bring, outstanding is what comes to my mind. If this is a marketing scheme we will have to wait and see.

#38

cucker tarlson

ratirt: It could mean that: bigger than the 5700XT. It could mean something else; time will tell.

Outstanding is what I get from the article and what's been said. If you think it will be better than 2080Ti it's fine with me. I'm not going to make this assumption.
I will leave it as outstanding. When somebody compares the Ryzen launch to what the new RDNA2 will bring, outstanding is what comes to my mind. If this is a marketing scheme we will have to wait and see.

if it isn't faster than 2080ti, then it isn't standing out.

#39

Super XP

efikkan: My advice is to consume these rumors with generous amounts of NaCl.

Lisa Su also said that the high-end was important to them, funny considering AMD haven't been participating there for years. It makes me wonder what she means by "big Navi" and "high-end", it could simply mean anything bigger and better than RX 5700 XT.

Outstanding 4K performance by most people's standard would mean performance way beyond RTX 2080 Ti, but in marketing terms it could easily mean something comparable to RTX 2080/2080 Super at a lower price than current pricing.

I haven't seen any AMD cards "flood the market" in the mid-range or high-end since the 200/300 series. Even "big hits" like RX 480/580 were outsold 8-10x by GTX 1060, etc.

AMD hasn't participated in the high end. Well of course not, how could they after the Bulldozer release? People shouldn't be asking why AMD hasn't participated in the high end for years as it's already Very Well Known why they couldn't back then.

Soon after AMD launched the ZEN CPUs they were developing RDNA1. Soon after strong Ryzen sales and profits they injected much needed R&D into the RTG hence RDNA2.
On top of next gen gaming consoles coming out Christmas 2020 powered by RDNA2. Etc

This is old news already well known. People will underestimate AMD's RDNA2 potential because of how their Radeons performed in comparison to Nvidia GPUs, without realizing AMD split the Server GPU and the Gamers GPU, no longer releasing ONE design to appease every market segment.

If the gaming oriented RX 5700XT is not a positive indication that AMD has caught up to Nvidia then we will have to wait and see how this plays out in 2020.

cucker tarlson: not even close

My post is based on FACTS. Is yours? Nope

#40

ratirt

cucker tarlson: if it isn't faster than 2080ti, then it isn't standing out.

And you know it isn't? Cause you were arguing that it will never be released a few posts up.
I said the odds are that it will be released cause it has been mentioned by Lisa Su, other executives from AMD and lead Radeon managers. Although you didn't say if you mean RDNA or RDNA2. If AMD says it is coming then it is. AMD's 2020 GPU roadmap says New Navi (RDNA2) in 2020, but if you say it's not then maybe tell AMD this, cause they don't know that.

cucker tarlson: There will be no big navi.
They cant even get the small navi to work properly.
Ampere will compete with turing and next gen consoles (and their desktop equivalents).

#41

gamefoo21

cucker tarlson: not even close

Single GPU was the V20 MI60. Then NV super binned and overclocked the V100 to make the V100S.

Dual GPU single cards. The V20 based pro duo is king.

Under $2500 USD... Radeon VII is the fastest.

#42

jabbadap

gamefoo21: I corrected it.

5.5x faster than the 5700XT
8x faster than the 2080ti

Turing isn't in the same game as Volta. Ampere will continue the tradition of being beastly compute oriented.

Well do correct it again, it's Tesla not Quadro as Quadro GV100 is actual product and it has 7.6TFlops of fp64 compute power. Actual server products are Tesla V100 SXM2 16GB/32GB, V100 PCIe 16GB/32GB, V100S PCIe 32GB...

So yeah this upcoming Tesla card will fight against upcoming AMD MI100 and Intel XE Ponte Vecchio server gpus.

#43

gamefoo21

jabbadap: Well do correct it again, it's Tesla not Quadro as Quadro GV100 is actual product and it has 7.6TFlops of fp64 compute power. Actual server products are Tesla V100 SXM2 16GB/32GB, V100 PCIe 16GB/32GB, V100S PCIe 32GB...

So yeah this upcoming Tesla card will fight against upcoming AMD MI100 and Intel XE Ponte Vecchio server gpus.

I based my numbers and models directly on the default design numbers as per the GPU database.

NV already had Tesla cores in the 8000/9000 series...

I was referring to the core architecture. Your Tesla V100 is running what architecture again...

Architecture: Volta

#44

T4C Fantasy

CPU & GPU DB Maintainer

The die shown in the pic is not Volta or Turing afaik which means this next gen Tesla has 84 next gen SMs

So 64 cores per SM would be 5376
128 cores per SM would be 10752 cores
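
Those core counts are just SM count times cores per SM. A quick Python sketch of the same multiplication, where the 84-SM figure is the estimate from this post and the function name is made up for illustration:

```python
def total_cores(sm_count: int, cores_per_sm: int) -> int:
    """Total shader cores for a GPU with `sm_count` SMs."""
    return sm_count * cores_per_sm


if __name__ == "__main__":
    # 84 SMs is the estimate from the die shot; 64 and 128 are the two
    # per-SM layouts considered in the post.
    for per_sm in (64, 128):
        print(f"84 SMs x {per_sm} cores/SM = {total_cores(84, per_sm)} cores")
```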

#45

Kapone33

gamefoo21: Single GPU was the V20 MI60. Then NV super binned and overclocked the V100 to make the V100S.

Dual GPU single cards. The V20 based pro duo is king.

Under $2500 USD... Radeon VII is the fastest.

In what?

#46

Super XP

cucker tarlson: There will be no big navi.
They cant even get the small navi to work properly.
Ampere will compete with turing and next gen consoles (and their desktop equivalents).

Quite the contrary lol
Dr. Lisa Su is a smart CEO, give credit where credit is due. Claiming she's lying about a high end graphics card, i.e. Big Navi, is basically insulting the woman. Lmao

kapone32: In what?

Double precision workloads.

R7 Double Precision Work Loads

#47

T4C Fantasy

CPU & GPU DB Maintainer

Also the next gen MI100 by AMD will be 8192 cores
128 CUs
lists.freedesktop.org/archives/amd-gfx/2019-July/036848.html

#48

gamefoo21

kapone32: In what?

In FP64 workloads. For video processing the Radeon VII is the people's champion.

Despite Steve being so far up NVs ass all he knows how to bench for workstation performance is CUDA based. LoL

#49

Vya Domus

gamefoo21: Volta is really quite different than the Turing cores.

It really isn't that different when it comes to compute, in fact Volta was the basis for Turing. SM configurations, caches, concurrency/scheduling, etc., they are practically the same with the exception of FP64 units and RT cores.

#50

gamefoo21

Vya Domus: It really isn't that different when it comes to compute, in fact Volta was the basis for Turing. SM configurations, caches, concurrency/scheduling, etc., they are practically the same with the exception of FP64 units and RT cores.

Different enough that the CUDA cores in Turing take multiple cycles to deal with big numbers still.

I think they are related like brothers. Same parents different outcomes using mostly the same building blocks.

Also it's easier to make them seem different because I see so many people posting this is a direct replacement for Turing, like this is what will be driving the 3080Ti... Parts of it, but this is the nerdy core, Turing and its successor are more the jock core. Pretty but kinda shit at math.

LoL
