Raevenlord
News Editor
NVIDIA's DGX-2 is likely part of the reason why NVIDIA seems slightly less enamored with the consumer graphics card market as of late. Let's be honest: just look at that price tag, and imagine the rivers of money NVIDIA makes on each of these systems sold. The data center and deep learning markets have been pouring money into NVIDIA's coffers, so the company is focusing its efforts on this space. Case in point: the DGX-2, which sports 1,920 TFLOPS of Tensor processing performance; 480 TFLOPS of FP16; half that value, 240 TFLOPS, for FP32 workloads; and 120 TFLOPS of FP64.
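For readers who like to see how those headline figures break down per GPU, here is a minimal sketch, assuming nothing beyond the system totals quoted above and a straight division across the sixteen Tesla V100s:

```python
# Rough sketch: derive per-GPU throughput from the quoted DGX-2 system totals,
# assuming peak throughput simply scales linearly across the 16 Tesla V100 GPUs.
NUM_GPUS = 16

system_tflops = {
    "Tensor": 1920.0,  # deep-learning (mixed-precision) peak
    "FP16": 480.0,
    "FP32": 240.0,     # half the FP16 figure
    "FP64": 120.0,     # half the FP32 figure
}

for precision, total in system_tflops.items():
    per_gpu = total / NUM_GPUS
    print(f"{precision}: {total:.0f} TFLOPS system-wide, ~{per_gpu:.1f} TFLOPS per GPU")
```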
NVIDIA's DGX-2 builds upon the original DGX-1 in every way imaginable. NVIDIA pitches these as ready-to-deploy processing powerhouses: everything a prospective user with gargantuan compute requirements needs, delivered in a single system. And the DGX-2 runs laps around the DGX-1 (which originally sold for $150K) in every respect: it features 16x 32 GB Tesla V100 GPUs (the DGX-1 featured 8x 16 GB Tesla GPUs); 1.5 TB of system RAM (against the DGX-1's comparatively paltry 0.5 TB); 30 TB of NVMe system storage (the DGX-1 sported 8 TB); and a pair of Xeon Platinum CPUs (admittedly the smallest performance increase in the whole system).
The DGX-2 has been made possible by NVIDIA's new NVSwitch, which enables 300 GB/s chip-to-chip communication at 12 times the speed of PCIe. Paired with the company's NVLink2, this allows sixteen GPUs to be grouped together in a single system, with total bandwidth exceeding 14 TB/s. NVIDIA is touting this as a 2-petaflop-capable system, which isn't hard to imagine given the underlying hardware: it packs 81,920 CUDA cores and 10,240 Tensor cores (the latter being what NVIDIA uses to arrive at that 2-petaflop figure, if you were wondering). The DGX-2's power consumption is adequate to its innards - some 10 kW in operation - and the whole system weighs 350 pounds.
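As a quick sanity check on those core counts and the two-petaflop claim, here is a back-of-the-envelope sketch, assuming the commonly quoted per-GPU Tesla V100 figures (5,120 CUDA cores, 640 Tensor cores, roughly 125 tensor TFLOPS each) rather than anything stated in the article itself:

```python
# Rough sketch: check the DGX-2 aggregate figures against per-GPU Tesla V100 specs
# (assumed values - 5,120 CUDA cores, 640 Tensor cores, ~125 tensor TFLOPS per GPU).
NUM_GPUS = 16
CUDA_CORES_PER_V100 = 5_120
TENSOR_CORES_PER_V100 = 640
TENSOR_TFLOPS_PER_V100 = 125.0

total_cuda_cores = NUM_GPUS * CUDA_CORES_PER_V100        # 81,920
total_tensor_cores = NUM_GPUS * TENSOR_CORES_PER_V100    # 10,240
peak_pflops = NUM_GPUS * TENSOR_TFLOPS_PER_V100 / 1000   # ~2 PFLOPS

print(f"CUDA cores:   {total_cuda_cores:,}")
print(f"Tensor cores: {total_tensor_cores:,}")
print(f"Tensor peak:  ~{peak_pflops:.1f} PFLOPS")
```

The totals line up with the article's numbers, which is consistent with the 2-petaflop claim being a straight sum of Tensor core throughput across all sixteen GPUs.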
Some of NVIDIA's remarks about this system follow:
NVSwitch: A Revolutionary Interconnect Fabric
NVSwitch offers 5x higher bandwidth than the best PCIe switch, allowing developers to build systems with more GPUs hyperconnected to each other. It will help developers break through previous system limitations and run much larger datasets. It also opens the door to larger, more complex workloads, including modeling parallel training of neural networks.
NVSwitch extends the innovations made available through NVIDIA NVLink, the first high-speed interconnect technology developed by NVIDIA. NVSwitch allows system designers to build even more advanced systems that can flexibly connect any topology of NVLink-based GPUs.
NVIDIA DGX-2: World's First Two Petaflop System
NVIDIA's new DGX-2 system reached the two petaflop milestone by drawing from a wide range of industry-leading technology advances developed by NVIDIA at all levels of the computing stack.
DGX-2 is the first system to debut NVSwitch, which enables all 16 GPUs in the system to share a unified memory space. Developers now have the deep learning training power to tackle the largest datasets and most complex deep learning models.
Combined with a fully optimized, updated suite of NVIDIA deep learning software, DGX-2 is purpose-built for data scientists pushing the outer limits of deep learning research and computing. DGX-2 can train FAIRSeq, a state-of-the-art neural machine translation model, in less than two days - a 10x improvement in performance from the DGX-1 with Volta, introduced in September.
View at TechPowerUp Main Site