Thursday, May 11th 2017

NVIDIA GV100 Silicon Detailed

At its GTC 2017 event, NVIDIA announced its next-generation "Volta" GPU architecture. As with the current "Pascal" architecture, "Volta" was unveiled in its biggest, most feature-rich implementation: the Tesla V100 HPC board, driven by the GV100 silicon. Given the HPC applications of NVIDIA's Tesla family of products, the GV100 has certain components that won't make it to the consumer GeForce family. Regardless, the GV100 is the pinnacle of NVIDIA's silicon engineering. According to the GPU block diagram released by the company, the GV100 has a component hierarchy similar to that of previous-generation NVIDIA chips, with some major changes to its basic number-crunching machinery, the streaming multiprocessor (SM).

The "Volta" streaming multiprocessor (SM) on the GV100 silicon features both FP32 and FP64 CUDA cores. Consumer graphics implementations of "Volta" that drive future GeForce products could lack the specialized FP64 cores. Each SM features 64 FP32 CUDA cores and 32 FP64 cores. The FP64 cores can handle 32-bit, 16-bit, and even primitive 8-bit operations. The GV100 features 80 SMs, so you're looking at 5,120 FP32 and 2,560 FP64 CUDA cores. In addition, "Volta" introduces a component called Tensor cores, specialized machinery designed to speed up deep-learning training and neural net building. Each SM has 8 of these, so the GV100 has 640. As with the FP64 cores, Tensor cores may not make it to consumer-graphics implementations. Given its SM count, the GV100 features 320 TMUs. NVIDIA clocked the GV100 to run at 1455 MHz boost.
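
The unit totals above are simply per-SM counts multiplied by the number of SMs. A minimal Python sketch of that arithmetic follows; the names `SM_COUNT`, `PER_SM`, and `chip_totals` are made up for illustration, and the figure of 4 TMUs per SM is not stated in the article but inferred from 320 TMUs across 80 SMs.

```python
# Per-SM unit counts as quoted in the article.
# Assumption: 4 TMUs per SM, inferred from 320 TMUs / 80 SMs.
SM_COUNT = 80
PER_SM = {"fp32_cores": 64, "fp64_cores": 32, "tensor_cores": 8, "tmus": 4}


def chip_totals(sm_count=SM_COUNT, per_sm=PER_SM):
    """Multiply each per-SM unit count by the number of SMs."""
    return {name: count * sm_count for name, count in per_sm.items()}


if __name__ == "__main__":
    for name, total in chip_totals().items():
        print(f"{name}: {total}")
```

Changing `sm_count` gives a quick way to estimate unit counts for a hypothetical cut-down chip with fewer SMs enabled.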
The Tesla V100 is advertised to offer 50% higher FP32 and FP64 peak performance than the "Pascal" based Tesla P100. Its peak FP32 throughput is rated at 15 TFLOP/s, with 7.5 TFLOP/s peak FP64 throughput. The Tensor cores "effectively" run at 120 TFLOP/s to perform their very specialized task of training deep-learning neural nets. These components feature matrix-matrix multiplication units, and matrix multiplication is a key math operation in neural net training. NVIDIA claims they accelerate neural net building/training by 12X.
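
The rated throughput figures can be roughly reproduced from the unit counts and the boost clock. The Python sketch below assumes each core retires one fused multiply-add (counted as 2 floating-point operations) per clock, and that each Tensor core performs 64 fused multiply-adds per clock; neither assumption is stated in the article, and the function name `peak_tflops` is hypothetical.

```python
BOOST_HZ = 1455e6            # 1455 MHz boost clock, from the article
FLOPS_PER_FMA = 2            # assumption: one fused multiply-add counts as 2 FLOPs
TENSOR_FMA_PER_CLOCK = 64    # assumption: FMAs per Tensor core per clock


def peak_tflops(units, fma_per_unit_per_clock=1, clock_hz=BOOST_HZ):
    """Theoretical peak throughput in TFLOP/s for `units` execution units."""
    return units * fma_per_unit_per_clock * FLOPS_PER_FMA * clock_hz / 1e12


if __name__ == "__main__":
    print(f"FP32:   {peak_tflops(5120):.1f} TFLOP/s")
    print(f"FP64:   {peak_tflops(2560):.2f} TFLOP/s")
    print(f"Tensor: {peak_tflops(640, TENSOR_FMA_PER_CLOCK):.1f} TFLOP/s")
```

Under these assumptions the results land just under the rounded 15, 7.5, and 120 TFLOP/s figures quoted above.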
Built on the new 12 nanometer process, the GV100 is a multi-chip module with a large, 815 mm² GPU die and a gargantuan transistor count of 21.1 billion, neighbored by four 32 Gbit HBM2 memory stacks, which make up 16 GB of memory. These stacks interface with the GV100 over a 4096-bit wide memory interface, through a silicon interposer. At 1 GHz, this memory setup could cushion the GV100 with a memory bandwidth of 1 TB/s. HBM2 could still be exclusive to the Tesla family in NVIDIA's product stack, as it continues to be expensive to implement in the consumer segment. Besides possibly dropping the FP64 and Tensor cores, consumer implementations of "Volta" could feature inexpensive yet suitably fast GDDR6 memory. One of the pioneering manufacturers of HBM, SK Hynix, even demonstrated GDDR6 at GTC, so unless NVIDIA is fighting for its life in performance against AMD, we expect it to stick to GDDR6 in the consumer segment.
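
The memory figures follow from simple unit conversions. The Python sketch below assumes the 1 GHz memory clock transfers data on both clock edges (two transfers per clock), which the article does not spell out; the helper names `capacity_gb` and `bandwidth_gbs` are hypothetical.

```python
def capacity_gb(stacks, gbit_per_stack):
    """Total capacity in gigabytes: 8 bits per byte."""
    return stacks * gbit_per_stack / 8


def bandwidth_gbs(bus_bits, clock_hz, transfers_per_clock=2):
    """Peak bandwidth in GB/s.

    Assumption: double data rate, i.e. two transfers per clock.
    """
    return bus_bits * clock_hz * transfers_per_clock / 8 / 1e9


if __name__ == "__main__":
    print(f"Capacity:  {capacity_gb(4, 32):.0f} GB")
    print(f"Bandwidth: {bandwidth_gbs(4096, 1e9):.0f} GB/s")
```

With four 32 Gbit stacks this gives 16 GB, and a 4096-bit bus at 1 GHz gives 1024 GB/s, which is the roughly 1 TB/s mentioned above.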
The Tesla V100 HPC card will be developed in two packages: an integrated board with an NVLink interface for high-density farm builds, and an add-on card with a PCI-Express interface for workstations. It will be sold through specialized retail channels.

23 Comments on NVIDIA GV100 Silicon Detailed

#1
Caring1
Good that they appear to be using the full speed HBM2 and not the slightly slower version.
Posted on Reply
#2
DeathtoGnomes
The thing I hate about announcing new architecture is that they always say "some of these features won't be available to consumers" Why the f*** say anything at all? idiots.
Posted on Reply
#3
ratirt
Wonder how would the Volta consumer cards look like. This Volta Tesla seems pretty monstrosity to me.
DeathtoGnomes: The thing I hate about announcing new architecture is that they always say "some of these features wont be available to consumers" Why the f*** say anything at all? idiots.
Cause its not needed or it's just way too expensive for NV and consumers would not afford it. Looking at current top notch cards from NV would you pay let say 3 grand for a video card?
Posted on Reply
#4
medi01
Transistor/mm2 figure didn't change much, hm.
DeathtoGnomes: The thing I hate about announcing new architecture is that they always say "some of these features won't be available to consumers" Why the f*** say anything at all? idiots.
It's a 15k$ card aimed at certain use (not gaming) and some of it is not for consumers, exactly what is your problem with stating it?
Posted on Reply
#5
ZoneDymo
medi01: Transistor/mm2 figure didn't change much, hm.

It's a 15k$ card aimed at certain use (not gaming) and some of it is not for consumers, exactly what is your problem with stating it?
Ermm everyone who buys these cards are consumers, they consume the products.
Its not like only gamers are consumers.

Their point makes sense, its a bit like those concept cars that get shown on car shows with all kinds of nifty gadgets that never make it into actual production cars, what is the point then?
If this card actually has features that we cant ever get, then again, what is the point of making them or talking about them at all?
Posted on Reply
#6
Vayra86
Inb4 Nvidia announces consumer GPUs

... with GDDR6 :)

You all know this is what its gonna be. Volta will be the usual 30-35% perf bump on each price point within the Geforce stack. From what I could read on GV100, all the new bits are for enterprise, not GFX.

With GDDR6 up to 16gb/s they have more than enough headroom to cover that perf bump, they could even stretch it out to the Volta Refresh seeing as 10gb/s > 16gb/s is +60%.
Posted on Reply
#7
bug
DeathtoGnomes: The thing I hate about announcing new architecture is that they always say "some of these features won't be available to consumers" Why the f*** say anything at all? idiots.
For the same reason you don't buy an army Hummer for your daily commute. But don't let that stand in the way of trolling.

@Vayra86 I'd love to see a 30-35% performance increase, but my gut feeling tells me Nvidia will try to milk it a little more. I hope I'm wrong.
Posted on Reply
#8
Vayra86
bug: For the same reason you don't buy an army Hummer for your daily commute. But don't let that stand in the way of trolling.

@Vayra86 I'd love to see a 30-35% performance increase, but my gut feeling tells me Nvidia will try to milk it a little more. I hope I'm wrong.
To be fair, Pascal did a little more than 30% in many cases on high end, and also had an increased price point to go with that. Nvidia's milking every % they give you, so you're not gonna be wrong.
Posted on Reply
#9
Caring1
bug: For the same reason you don't buy an army Hummer for your daily commute.
Because it is illegal to own one?
Posted on Reply
#10
bug
Caring1: Because it is illegal to own one?
Because it has hardware that does stuff you don't need ;)
Posted on Reply
#11
DeathtoGnomes
ratirt: Wonder how would the Volta consumer cards look like. This Volta Tesla seems pretty monstrosity to me.

Cause its not needed or it's just way too expensive for NV and consumers would not afford it. Looking at current top notch cards from NV would you pay let say 3 grand for a video card?
If it dances a jig and sings Hallelujah, and my wallet found some spare fold full of cash, hell ya. But as @bug says $NVDA is milking gamers for every penny of performance by teasing features that appear to be built into every card sold, but disabled cuz gamers can afford that extra feature that prolly wont be developed into anything useful anyways. So ya why both effin talking about something thats meant to just tease those with small, limited wallets.

:lovetpu:
Posted on Reply
#12
bug
DeathtoGnomes: If it dances a jig and sings Hallelujah, and my wallet found some spare fold full of cash, hell ya. But as @bug says $NVDA is milking gamers for every penny of performance by teasing features that appear to be built into every card sold, but disabled cuz gamers can afford that extra feature that prolly wont be developed into anything useful anyways. So ya why both effin talking about something thats meant to just tease those with small, limited wallets.

:lovetpu:
Please don't put words in my mouth, I never said that. I never even implied it.
Posted on Reply
#13
DeathtoGnomes
bug: For the same reason you don't buy an army Hummer for your daily commute. But don't let that stand in the way of trolling.

@Vayra86 I'd love to see a 30-35% performance increase, but my gut feeling tells me Nvidia will try to milk it a little more. I hope I'm wrong.
sorry if I misread your intent here.
Posted on Reply
#14
bug
DeathtoGnomes: sorry if I misread your intent here.
I meant, they may try to deliver the smallest possible performance increment with the smallest die they can use. Thus saving costs and keeping something in store for future iterations. Then again, I expected the same for Pascal and I was wrong.

Plus, consumer Pascal doesn't have disabled FP64 units. It's a different silicon, built without them. See: forums.anandtech.com/threads/gp100-and-gp104-are-different-architectures.2473319/ (resources were even added in the consumer chip, where it made sense)
Posted on Reply
#15
idx
Basically the new GTX cards going to be :

GTX**80 - 3584 SP
GTX**70 - 2688 SP
GTX**60 - 1792 SP

Expect more restrictions on clock speed.

EDIT:
GTX**80Ti will be whatever leftover of what nvidia can't sell as GV100.
GTX**50 will be something much smaller.. I guess.
Posted on Reply
#16
btarunr
Editor & Senior Moderator
idx: Basically the new GTX cards going to be :

GTX**80 - 3584 SP
GTX**70 - 2688 SP
GTX**60 - 1792 SP

Expect more restrictions on clock speed.

EDIT:
GTX**80Ti will be whatever leftover of what nvidia can't sell as GV100.
GTX**50 will be something much smaller.. I guess.
I'm predicting these CUDA core counts:
  • GTX 2080: 3,072
  • GTX 2070: 2,688
  • GTX 2060: 1,536
Posted on Reply
#17
medi01
ZoneDymo: Ermm everyone who buys these cards are consumers, they consume the products.
consumer - a person who purchases goods and services for personal use.

says google.
Posted on Reply
#18
idx
btarunr: I'm predicting these CUDA core counts:
  • GTX 2080: 3,072
  • GTX 2070: 2,688
  • GTX 2060: 1,536
I know what you thinking, if they do disable 4 SMs just like they did with the GV100 then indeed thats exactly what are we going to see.
Nvidia may do that for 2 reasons:
  1. Sell as much yields as possible.
  2. Locking the performance ( in case if these gpus can clock really high ? ).
EDIT: I don't think they are going to leave the GTX 2070 and GTX 2080 that close in configuration ( Nvidia was so bothered by those who did overclock their GTX970s ... remember).
If they are going to disable 4 SMs from the GTX 2080, then they are probably going to disable more than 25% of the GTX 2070.
Posted on Reply
#19
bug
btarunr: I'm predicting these CUDA core counts:
  • GTX 2080: 3,072
  • GTX 2070: 2,688
  • GTX 2060: 1,536
They can release the cards with a single shader. I only care about overall performance :D
Posted on Reply
#20
jabbadap
Hmm just trying to remember that summit information. Was it 40TFlops of fp64 per node? So there's probably 6*V100 teslas per node(GV100 has six nvlinks vs four of GP100). That would be at maximum steam 45Tflops though, but I presume there will be some sort of throttled base clock to keep temps and power sane. And yeah of course that power 9 offers some TFlops grunt too.
Posted on Reply
#21
TheoneandonlyMrK
looks like Q3 is the earliest we will see them ish ..

And now, Jensen announces NVIDIA DGX-1 with eight Tesla V100. It’s labeled on the slide as the “essential instrument of AI research.” What used to take a week now takes a shift. It replaces 400 servers. It offers 960 tensor TFLOPS. It will ship in Q3. It will cost $149,000. He notes that if you get one now powered by Pascal, you’ll get a free upgrade to Volta.

Turns out, there’s also a small version of DGX-1, DGX Station. Think of it as a personal sized one. It’s liquid cooled and whisper quiet. Every one of our deep learning engineers has one.

It has four Tesla V100s. It’s $69K. Order it now and we’ll deliver it in Q3. “So place your order now,” he avers. via NVIDIA
Posted on Reply
#22
DeathtoGnomes
I want AI software to play with, maybe I'll come up $69k after that. :cool::kookoo:
Posted on Reply
#23
Kanan
Tech Enthusiast & Gamer
Thanks for this thorough, non-childish actually adult, news article. This is why I'm here.

PS. On speculation for GTX 1200 series (probable name, not "2080", skipping 10 gens) I think new Titan Xv will have 4096-4584 or even up to 5120 cores (fully activated chip). GTX 1280 could have 3072 to 3584 shaders, 2560-3072 should be new GTX 1270 (partly deactivated chip) making these pretty powerful at the 400-600 dollar range, 4k gaming will be a easy thing by then, being a "normal" high end gamer. Enthusiasts will have over 100 fps for 4k without using SLI.
Posted on Reply