Wednesday, June 21st 2017
NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator
NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, built on its next-generation "Volta" GPU architecture. Based on the 12 nm "GV100" silicon, the GPU is a multi-chip module in which the GPU die and four HBM2 memory stacks sit on a silicon interposer. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized units that accelerate the matrix math used in neural-net training), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface with 900 GB/s of memory bandwidth. The 815 mm² GPU has a gargantuan transistor count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available later this year. HPE will develop three HPC rigs with the cards pre-installed.
13 Comments on NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator
Is it 5,120 shaders x 2 x 1.37 GHz clock = ~14 TFLOPs? Or am I missing something?
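The usual back-of-the-envelope formula for peak single-precision throughput is cores x 2 x clock, where the factor of 2 counts one fused multiply-add as two floating-point operations per core per clock. A minimal Python sketch follows, using the 5,120 cores and roughly 1370 MHz quoted in the article; the function name `peak_fp32_tflops` is made up for illustration, and the result is a theoretical peak, not a measured figure.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPs.

    Assumes every core retires one fused multiply-add (2 FLOPs) per clock,
    which is the convention behind marketed TFLOPs numbers.
    """
    gflops = cuda_cores * flops_per_core_per_clock * clock_ghz
    return gflops / 1000.0


# Article figures: 5,120 CUDA cores at roughly 1370 MHz.
at_article_clock = peak_fp32_tflops(5120, 1.37)
# Same core count with the clock rounded down to 1.3 GHz, for comparison.
at_rounded_clock = peak_fp32_tflops(5120, 1.3)

print(f"1.37 GHz: {at_article_clock:.2f} TFLOPs")
print(f"1.30 GHz: {at_rounded_clock:.2f} TFLOPs")
```

At the article's clock this lands at about 14 TFLOPs; rounding the clock down to 1.3 GHz gives closer to 13.3 TFLOPs.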
10-20% increased framerates then. Hmmm....
Apparently GeForce Volta is going to use GDDR5(X) and/or GDDR6 though, so maybe they'll arrive in March 2018. Still, according to my research looking at TFLOP-to-framerate increases, going from the 1080 Ti to a Volta xx80 is likely to give only 10-20% higher framerates, compared with the 20-30% gain from the 980 Ti to the 1080. Not too spectacular then, unless NVIDIA used a LOT of magic.
Don't use hardware counts to guess performance. On that metric, AMD should have been destroying NVIDIA for years.
NVIDIA has a very refined and streamlined architecture, and it reaps the rewards.
Well, if we assume that the V6000 is a full GV102 like the P6000 is a full GP102, the "marketed"* figure for the P6000 is 12 TFLOPs. Thus the clock speed is 12000/(60*64*2) = ~1.56 GHz. Then, if the V6000 is a full GV102, the "marketed"* figure at that clock speed would be 2*84*64*1.56 = ~16.8 TFLOPs. And consider this: GV100 is a huge die (815 mm²), yet it keeps almost the same clocks in the same power envelope as GP100 with its smaller 610 mm² die. We just don't have enough information about the rest of the Volta family to know how much higher clocks can go when you give them more power.
*NVIDIA's marketed TFLOPs are calculated from the stated boost clock, which is actually lower than what the card runs at in normal 3D usage such as gaming; i.e., 1.56 GHz is a very low frequency for the Pascal architecture.
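The estimate above can be reproduced in a few lines of Python. This is only a sketch of the commenter's arithmetic: the 60 x 64 and 84 x 64 core layouts and the 12 TFLOPs figure are taken from the comment as given, the "V6000" and "GV102" parts are speculation rather than announced products, and the function names are made up for illustration.

```python
def implied_clock_ghz(marketed_tflops: float, sm_count: int,
                      cores_per_sm: int, flops_per_core_per_clock: int = 2) -> float:
    """Back out the clock (GHz) that a marketed TFLOPs figure implies."""
    flops_per_clock = sm_count * cores_per_sm * flops_per_core_per_clock
    return marketed_tflops * 1000.0 / flops_per_clock


def marketed_tflops(sm_count: int, cores_per_sm: int, clock_ghz: float,
                    flops_per_core_per_clock: int = 2) -> float:
    """Peak TFLOPs for a given core layout and clock (2 FLOPs per core per clock)."""
    return sm_count * cores_per_sm * flops_per_core_per_clock * clock_ghz / 1000.0


# Step 1: P6000 figures as used in the comment (60 x 64 cores, 12 TFLOPs).
p6000_clock = implied_clock_ghz(12.0, sm_count=60, cores_per_sm=64)

# Step 2: hypothetical full "GV102" with 84 x 64 cores at the same clock.
v6000_estimate = marketed_tflops(sm_count=84, cores_per_sm=64, clock_ghz=p6000_clock)

print(f"Implied P6000 clock: {p6000_clock:.4f} GHz")
print(f"Hypothetical V6000:  {v6000_estimate:.1f} TFLOPs")
```

Both numbers scale linearly with clock, so any extra clock headroom in a smaller Volta die would raise the estimate proportionally.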