At its GTC 2017 event, NVIDIA announced its next-generation "Volta" GPU architecture. As with the current "Pascal" architecture, "Volta" was unveiled in its biggest, most feature-rich implementation: the Tesla V100 HPC accelerator, driven by the GV100 silicon. Given the HPC focus of NVIDIA's Tesla product family, the GV100 packs certain components that won't make it to the consumer GeForce family; even so, it is the pinnacle of NVIDIA's silicon engineering. According to the block diagram released by the company, the GV100 follows a component hierarchy similar to previous-generation NVIDIA chips, with some major changes to its basic number-crunching machinery, the streaming multiprocessor (SM).
The "Volta" streaming multiprocessor (SM) on the GV100 silicon features both FP32 and FP64 CUDA cores. Consumer graphics implementations of "Volta" that drive future GeForce products could lack the specialized FP64 cores. Each SM features 64 FP32 CUDA cores and 32 FP64 cores; lower-precision 16-bit and even 8-bit operations are also supported, at proportionally higher rates. The GV100 features 80 SMs, so you're looking at 5,120 FP32 and 2,560 FP64 CUDA cores. In addition, Volta introduces a new component called the Tensor core, specialized machinery designed to speed up deep-learning training and neural-net building. Each SM has 8 of these, for a total of 640 on the GV100. As with the FP64 cores, Tensor cores may not make it to consumer-graphics implementations. Given its SM count, the GV100 features 320 TMUs. NVIDIA clocked the GV100 at up to 1455 MHz (boost).
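The per-SM counts above can be tallied up to the chip-wide totals. A minimal sketch (the 4 TMUs/SM figure is inferred here from the stated 320-TMU total, not quoted directly by NVIDIA):

```python
# Tally GV100 execution resources from the per-SM figures quoted above.
SMS = 80
PER_SM = {"FP32 cores": 64, "FP64 cores": 32, "Tensor cores": 8, "TMUs": 4}

totals = {name: count * SMS for name, count in PER_SM.items()}
for name, total in totals.items():
    print(f"{name}: {total}")
# FP32 cores: 5120, FP64 cores: 2560, Tensor cores: 640, TMUs: 320
```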
The Tesla V100 is advertised to offer around 50% higher peak FP32 and FP64 performance than the "Pascal" based Tesla P100: peak FP32 throughput is rated at 15 TFLOP/s, and peak FP64 throughput at 7.5 TFLOP/s. The Tensor cores "effectively" run at 120 TFLOP/s at their very specialized task of training deep-learning neural nets. These units implement matrix-matrix multiplication, a key math operation in neural-net training, and NVIDIA claims they accelerate neural-net building/training by 12X.
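Those headline figures follow from cores × 2 ops/clock (one fused multiply-add) × boost clock. A back-of-the-envelope check, assuming NVIDIA's stated 64 FMAs per Tensor core per clock (one 4×4×4 matrix FMA, a figure not quoted in this article):

```python
# Sanity-check the quoted peak-throughput figures at the 1455 MHz boost clock.
BOOST_HZ = 1.455e9
OPS_PER_FMA = 2  # a fused multiply-add counts as two floating-point ops

fp32 = 5120 * OPS_PER_FMA * BOOST_HZ / 1e12          # ~14.9 TFLOP/s (quoted: 15)
fp64 = 2560 * OPS_PER_FMA * BOOST_HZ / 1e12          # ~7.45 TFLOP/s (quoted: 7.5)
tensor = 640 * 64 * OPS_PER_FMA * BOOST_HZ / 1e12    # ~119.2 TFLOP/s (quoted: 120)

print(f"FP32: {fp32:.1f} TFLOP/s, FP64: {fp64:.1f}, Tensor: {tensor:.1f}")
```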
Built on a new 12-nanometer process, the GV100 is a multi-chip module: a large 815 mm² GPU die with a gargantuan transistor count of 21.1 billion, flanked by four 32 Gbit (4 GB) HBM2 memory stacks that add up to 16 GB of memory. The stacks interface with the GV100 over a 4096-bit wide memory bus, through a silicon interposer. At 1 GHz, this memory setup could give the GV100 a memory bandwidth of about 1 TB/s. HBM2 could still remain exclusive to the Tesla family in NVIDIA's product stack, as it continues to be too expensive for NVIDIA to implement in the consumer segment. Besides lacking FP64 and Tensor cores, consumer implementations of "Volta" could therefore feature inexpensive yet suitably fast GDDR6 memory instead. SK Hynix, one of the pioneering manufacturers of HBM, even demonstrated GDDR6 at GTC; so unless NVIDIA finds itself fighting for its life against AMD on performance, we expect it to stick with GDDR6 in the consumer segment.
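The capacity and bandwidth figures check out against the stated stack sizes and bus width. A rough sketch, assuming double data rate (2 transfers per clock) on the HBM2 bus, which is consistent with the ~1 TB/s figure at 1 GHz:

```python
# Check the HBM2 capacity and bandwidth figures quoted above.
STACKS, GBIT_PER_STACK = 4, 32
BUS_BITS, CLOCK_HZ, TRANSFERS_PER_CLOCK = 4096, 1e9, 2

capacity_gb = STACKS * GBIT_PER_STACK / 8                              # 16 GB
bandwidth_gbs = BUS_BITS * TRANSFERS_PER_CLOCK * CLOCK_HZ / 8 / 1e9    # 1024 GB/s

print(f"Capacity: {capacity_gb:.0f} GB, Bandwidth: {bandwidth_gbs:.0f} GB/s")
```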
The Tesla V100 HPC accelerator will be offered in two form factors: integrated boards with an NVLink interface for high-density server farms, and add-in cards with a PCI-Express interface for workstations. It will be sold through specialized retail channels.
View at TechPowerUp Main Site