
NVIDIA Launches World's First High-Speed GPU Interconnect

Joined
Dec 6, 2011
Messages
4,784 (1.01/day)
Location
Still on the East Side
NVIDIA today announced that it plans to integrate a high-speed interconnect, called NVIDIA NVLink, into its future GPUs, enabling GPUs and CPUs to share data five to 12 times faster than they can today. This will eliminate a longstanding bottleneck and help pave the way for a new generation of exascale supercomputers that are 50-100 times faster than today's most powerful systems.

NVIDIA will add NVLink technology into its Pascal GPU architecture -- expected to be introduced in 2016 -- following this year's new NVIDIA Maxwell compute architecture. The new interconnect was co-developed with IBM, which is incorporating it in future versions of its POWER CPUs.





"NVLink technology unlocks the GPU's full potential by dramatically improving data movement between the CPU and GPU, minimizing the time that the GPU has to wait for data to be processed," said Brian Kelleher, senior vice president of GPU Engineering at NVIDIA.

"NVLink enables fast data exchange between CPU and GPU, thereby improving data throughput through the computing system and overcoming a key bottleneck for accelerated computing today," said Bradley McCredie, vice president and IBM Fellow at IBM. "NVLink makes it easier for developers to modify high-performance and data analytics applications to take advantage of accelerated CPU-GPU systems. We think this technology represents another significant contribution to our OpenPOWER ecosystem."

With NVLink technology tightly coupling IBM POWER CPUs with NVIDIA Tesla GPUs, the POWER data center ecosystem will be able to fully leverage GPU acceleration for a diverse set of applications, such as high performance computing, data analytics and machine learning.

Advantages Over PCI Express 3.0
Today's GPUs are connected to x86-based CPUs through the PCI Express (PCIe) interface, which limits the GPU's ability to access the CPU memory system and is four to five times slower than typical CPU memory systems. PCIe is an even greater bottleneck between the GPU and IBM POWER CPUs, which have more bandwidth than x86 CPUs. As the NVLink interface will match the bandwidth of typical CPU memory systems, it will enable GPUs to access CPU memory at its full bandwidth.
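
To put rough numbers on the comparison, here is a minimal sketch of the PCIe side of the arithmetic. It assumes the "today" baseline is a PCIe 3.0 x16 slot (8 GT/s per lane, 128b/130b encoding) and applies the announcement's "five to 12 times" range to it. The announcement does not state which baseline it uses, so the resulting range is an illustration, not a published NVLink figure. The function name is made up for this example.

```python
def pcie_bandwidth_gb_s(lanes, gt_per_s=8.0, encoding=128 / 130):
    """Theoretical one-direction PCIe bandwidth in GB/s.

    Defaults describe PCIe 3.0: 8 GT/s per lane with 128b/130b encoding.
    """
    return lanes * gt_per_s * encoding / 8  # divide by 8 bits per byte


x16 = pcie_bandwidth_gb_s(16)
x8 = pcie_bandwidth_gb_s(8)

# Assumption: the "five to 12 times" claim is measured against PCIe 3.0 x16.
implied_low, implied_high = 5 * x16, 12 * x16

print(f"PCIe 3.0 x16: {x16:.2f} GB/s, x8: {x8:.2f} GB/s")
print(f"5x to 12x of x16: {implied_low:.0f} to {implied_high:.0f} GB/s")
```

These are theoretical link rates; real transfers come in lower once protocol overhead is counted.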

This high-bandwidth interconnect will dramatically improve accelerated software application performance. Because of memory system differences -- GPUs have fast but small memories, and CPUs have large but slow memories -- accelerated computing applications typically move data from the network or disk storage to CPU memory, and then copy the data to GPU memory before it can be crunched by the GPU. With NVLink, the data moves between the CPU memory and GPU memory at much faster speeds, making GPU-accelerated applications run much faster.
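
How much a faster link helps depends on how much of a job is spent copying. The sketch below is a toy model, assuming the job time is simply copy time plus compute time with no overlap. The data size, link speeds and compute time are hypothetical values chosen for illustration, not measurements.

```python
def job_time_s(data_gb, link_gb_s, compute_s):
    """Toy model: time to copy data to the GPU plus time to process it."""
    return data_gb / link_gb_s + compute_s


# Hypothetical workload: 32 GB staged to the GPU, 1 second of GPU compute.
baseline = job_time_s(data_gb=32, link_gb_s=16, compute_s=1.0)  # PCIe-class link
faster = job_time_s(data_gb=32, link_gb_s=80, compute_s=1.0)    # link 5x faster

print(f"baseline: {baseline:.1f} s, faster link: {faster:.1f} s")
print(f"overall speedup: {baseline / faster:.2f}x")
```

In this model a 5x faster link gives roughly a 2x faster job, because the compute portion does not shrink. Copy-heavy workloads benefit the most.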

Unified Memory Feature
Faster data movement, coupled with another feature known as Unified Memory, will simplify GPU accelerator programming. Unified Memory allows the programmer to treat the CPU and GPU memories as one block of memory. The programmer can operate on the data without worrying about whether it resides in the CPU's or GPU's memory.

Although future NVIDIA GPUs will continue to support PCIe, NVLink technology will be used for connecting GPUs to NVLink-enabled CPUs as well as providing high-bandwidth connections directly between multiple GPUs. Also, despite its very high bandwidth, NVLink is substantially more energy efficient per bit transferred than PCIe.

NVIDIA has designed a module to house GPUs based on the Pascal architecture with NVLink. This new GPU module is one-third the size of the standard PCIe boards used for GPUs today. Connectors at the bottom of the Pascal module enable it to be plugged into the motherboard, improving system design and signal integrity.

NVLink high-speed interconnect will enable the tightly coupled systems that present a path to highly energy-efficient and scalable exascale supercomputers, running at 1,000 petaflops (1 x 10^18 floating point operations per second), or 50 to 100 times faster than today's fastest systems.
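
The exascale figures can be checked with a few lines of arithmetic. This sketch only restates the numbers in the paragraph above. The "implied baseline" is what the 50x to 100x claim works out to, not a quoted benchmark result.

```python
PETAFLOP = 10**15  # floating point operations per second
EXAFLOP = 10**18

# 1 exaflop expressed in petaflops
target_pflops = EXAFLOP // PETAFLOP

# If exascale is 50 to 100 times faster than "today's fastest systems",
# those systems would sit at roughly this many petaflops.
implied_baseline_pflops = [target_pflops / factor for factor in (50, 100)]

print(target_pflops)
print(implied_baseline_pflops)
```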

 
Joined
May 1, 2012
Messages
1,027 (0.22/day)
Location
New Jersey, USA
System Name Current Rig
Processor AMD 7800X3D
Motherboard MSI x670e Tomahawk wifi
Cooling Artic Freezer II 360
Memory G.Skill 32gb ddr5 6000mhz
Video Card(s) AMD 7900XTX 24 GB
Storage Samsung SSD 980 PRO 2TB
Display(s) Alienware 3420DW 120 Freesync
Case LianLi Lancool III white non-rgb
Audio Device(s) Onboard ALC
Power Supply Corsair Shift 1000W
Mouse G502 Hero
Keyboard Ducky Shine 5
Software Win 11 64bit
Benchmark Scores The second best!
Watching the keynote and Pascal looks like it will hammer away at Intel and their CPUs.
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,233 (7.55/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Hey there HyperTransport, long time!
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
> Hey there HyperTransport, long time!
I believe you mean "SidePort".

> NVLink technology will be used for connecting GPUs to NVLink-enabled CPUs as well as providing high-bandwidth connections directly between multiple GPUs.

= AMD SidePort/IOMMU.

;)

As is the norm, AMD creates the idea, and Nvidia brings a useable form to the masses. Tech partnerships at its best, really.
 
Joined
Nov 6, 2005
Messages
480 (0.07/day)
Location
Silver Spring, MD
Processor Core i7 4770K
Motherboard Asrock Z87E-ITX
Cooling Stock
Memory 16GB Gskill 2133MHz DDR3
Video Card(s) PNY GeForce GTX 670 2GB
Storage 256GB Corsair M4, 240GB Samsung 840
Display(s) 27" 1440p Achevia Shimian
Case Fractal Node 304
Audio Device(s) Audioquest Dragonfly USB DAC
Power Supply Corsair Builder 600W
Software Windows 7 Pro x64
I'm not against it... but since when was GPU scaling limited by PCIe throughput? Maybe it's a latency thing... but 16 PCIe 3.0 lanes is quite a bit of bandwidth, and I thought that we've seen time and time again that the performance impact of halving that (running 8x) is minimal even with the highest end cards.

They say it's because they need access to CPU memory and that GPU memory is "small", but I think again we've seen the opposite trend. Plenty of enthusiast computers have 16GB of main memory but have easily 3-4GB of VRAM per card. Is it just that a lot of main memory isn't used in gaming, and you can just hoard textures in there if you had more bandwidth?
 
Joined
Jun 24, 2011
Messages
571 (0.12/day)
Location
Islamabad
System Name Hhumas-PC
Processor Intel(R) Core(TM)2 Extreme CPU X9650 @ 3.00GHz (4 CPUs), ~3.0GHz
Motherboard Asus P5W DH Deluxe
Cooling ZALMAN CNPS 9700 NT 110mm 2 Ball Ultra Quiet
Memory OCZ Platinium 2x2GB
Video Card(s) ZOTAC GeForce GTX 680 2GB 256-bit GDDR5
Storage WD Green 500 GB , WD 500 GB
Display(s) Dell U2410F
Case Casecom
Audio Device(s) Creative Sound Blaster X-Fi Titanium Fatal1ty Pro
Power Supply Corsair HX1000
Software Windows 7 Ultimate 32 Bit
It's crazy...
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
> I'm not against it... but since when was GPU scaling limited by PCIe throughput? Maybe it's a latency thing... but 16 PCIe 3.0 lanes is quite a bit of bandwidth, and I thought that we've seen time and time again that the performance impact of halving that (running 8x) is minimal even with the highest end cards.
If you do GPGPU, PCIe has been a limitation ever since that became possible. This FACT (shown by the AOKI paper at the beginning of the stream, and something I have personally been talking about for years) has been true since PCIe came out, really.


A limitation in gaming? Yes AND no. AMD multi-GPU stutter problems are due to PCIe limitations.

If you watched Nvidia's promo... they clearly pointed out that a real jump in graphics requires thousands of bits of memory interconnect, compared to the 384 we have today. Feeding that memory, as well as other GPUs, is not possible over PCIe... hence NVLink.
 
Joined
May 1, 2012
Messages
1,027 (0.22/day)
Location
New Jersey, USA
> If you do GPGPU, PCIe has been a limitation ever since that became possible. This FACT (shown by the AOKI paper at the beginning of the stream, and something I have personally been talking about for years) has been true since PCIe came out, really.
>
> A limitation in gaming? Yes AND no. AMD multi-GPU stutter problems are due to PCIe limitations.
>
> If you watched Nvidia's promo... they clearly pointed out that a real jump in graphics requires thousands of bits of memory interconnect, compared to the 384 we have today. Feeding that memory, as well as other GPUs, is not possible over PCIe... hence NVLink.


So is NVLink a physical replacement for PCIe on the Nvidia mobos?

I will be glad to buy a mobo that has both NVLink and PCIe. I'm especially tired of seeing Intel's lack of PCIe 4, DDR4, new technology interfaces, etc. on their Z97 and X99 platforms.
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
> So is NVLink a physical replacement for PCIe on the Nvidia mobos?


Christian_25H covered that well already:

> Although future NVIDIA GPUs will continue to support PCIe, NVLink technology will be used for connecting GPUs to NVLink-enabled CPUs as well as providing high-bandwidth connections directly between multiple GPUs. Also, despite its very high bandwidth, NVLink is substantially more energy efficient per bit transferred than PCIe.
 
Joined
Apr 3, 2012
Messages
4,370 (0.95/day)
Location
St. Paul, MN
System Name Bay2- Lowerbay/ HP 3770/T3500-2+T3500-3+T3500-4/ Opti-Con/Orange/White/Grey
Processor i3 2120's/ i7 3770/ x5670's/ i5 2400/Ryzen 2700/Ryzen 2700/R7 3700x
Motherboard HP UltraSlim's/ HP mid size/ Dell T3500 workstation's/ Dell 390/B450 AorusM/B450 AorusM/B550 AorusM
Cooling All stock coolers/Grey has an H-60
Memory 2GB/ 4GB/ 12 GB 3 chan/ 4GB sammy/T-Force 16GB 3200/XPG 16GB 3000/Ballistic 3600 16GB
Video Card(s) HD2000's/ HD 2000/ 1 MSI GT710,2x MSI R7 240's/ HD4000/ Red Dragon 580/Sapphire 580/Sapphire 580
Storage ?HDD's/ 500 GB-er's/ 500 GB/2.5 Samsung 500GB HDD+WD Black 1TB/ WD Black 500GB M.2/Corsair MP600 M.2
Display(s) 1920x1080/ ViewSonic VX24568 between the rest/1080p TV-Grey
Case HP 8200 UltraSlim's/ HP 8200 mid tower/Dell T3500's/ Dell 390/SilverStone Kublai KL06/NZXT H510 W x2
Audio Device(s) Sonic Master/ onboard's/ Beeper's!
Power Supply 19.5 volt bricks/ Dell PSU/ 525W sumptin/ same/Seasonic 750 80+Gold/EVGA 500 80+/Antec 650 80+Gold
Mouse cheap GigaWire930, CMStorm Havoc + Logitech M510 wireless/iGear usb x2/MX 900 wireless kit 4 Grey
Keyboard Dynex, 2 no name, SYX and a Logitech. All full sized and USB. MX900 kit for Grey
Software Mint 18 Sylvia/ Opti-Con Mint KDE/ T3500's on Kubuntu/HP 3770 is Win 10/Win 10 Pro/Win 10 Pro/Win10
Benchmark Scores World Community Grid is my benchmark!!
Looks promising...How long will it take to get to gaming desktops, any guesses? 2016 seems like enough time to incorporate it to a gaming platform...hmm?

*EDIT, just noticed I had put in 2026, instead of 2016. :oops:
 
Joined
Jul 19, 2006
Messages
43,604 (6.51/day)
Processor AMD Ryzen 7 7800X3D
Motherboard ASUS TUF x670e
Cooling EK AIO 360. Phantek T30 fans.
Memory 32GB G.Skill 6000Mhz
Video Card(s) Asus RTX 4090
Storage WD m.2
Display(s) LG C2 Evo OLED 42"
Case Lian Li PC 011 Dynamic Evo
Audio Device(s) Topping E70 DAC, SMSL SP200 Headphone Amp.
Power Supply FSP Hydro Ti PRO 1000W
Mouse Razer Basilisk V3 Pro
Keyboard Tester84
Software Windows 11

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
Interesting, what are these going to be?


I guess we'll start to see them in 2016?

Although, the mention of IBM POWERPC chips...kinda...well...removes my excitement. :roll:


Looking at the physical sample NV showed today, it looks a lot like a module for the new Apple MAC PRO trashcan-PC.

If that's Nvidia's choice to stay relevant to the marketplace...to work with Apple...well...
 
Joined
Jul 19, 2006
Messages
43,604 (6.51/day)
Heh, I didn't even know IBM still made PowerPC chips! Looking forward to seeing how it pans out.
 

H2323

New Member
Joined
Mar 25, 2014
Messages
5 (0.00/day)
> I believe you mean "SidePort".
>
> = AMD SidePort/IOMMU.
>
> ;)
>
> As is the norm, AMD creates the idea, and Nvidia brings a useable form to the masses. Tech partnerships at its best, really.

K... seriously, how is this usable? The CPU has to have this built in as well, and it is clear that the only ones that will do this are PowerPC and any custom ARM SoC Nvidia wants to design. This is for enterprise and supercomputers; it will not be on your PC. AMD and Intel already have their own internal solutions.
 
Joined
Aug 10, 2007
Messages
4,267 (0.68/day)
Location
Sanford, FL, USA
Processor Intel i5-6600
Motherboard ASRock H170M-ITX
Cooling Cooler Master Geminii S524
Memory G.Skill DDR4-2133 16GB (8GB x 2)
Video Card(s) Gigabyte R9-380X 4GB
Storage Samsung 950 EVO 250GB (mSATA)
Display(s) LG 29UM69G-B 2560x1080 IPS
Case Lian Li PC-Q25
Audio Device(s) Realtek ALC892
Power Supply Seasonic SS-460FL2
Mouse Logitech G700s
Keyboard Logitech G110
Software Windows 10 Pro
In the short term we may not get CPUs with NV-Link in consumer form but perhaps it will allow for better performing and more efficient multi-GPU gaming cards?

The Titan-Z is already outdated, wait for the Titan-Z2 with 12GB of unified memory ;)
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
> In the short term we may not get CPUs with NV-Link in consumer form but perhaps it will allow for better performing and more efficient multi-GPU gaming cards?


Yep, push data to the primary card, and then have the secondary cards link together as slave devices, presenting themselves as one large compute interface that the OS sees as a single compute device.



Oh wait, that's exactly the scenario hinted at in the presentation...:p and shown in the slides.
 
Joined
Aug 10, 2007
Messages
4,267 (0.68/day)
Location
Sanford, FL, USA
I'll wait for a whitepaper on it to be posted. It's better for their bottom line if I don't watch these types of presentations :)
 
Joined
May 1, 2012
Messages
1,027 (0.22/day)
Location
New Jersey, USA
Uhhh, what happened to Volta? Wasn't that supposed to be Maxwell's follow up?

It may still slot in between Maxwell and Pascal in 2015/2016.
 
Joined
Nov 4, 2005
Messages
11,982 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
2016..just right....around.........the.....................corner..........................................................



Is it just me, or does it seem like someone threw the PR team there out a window and they are trying to make enough PR slides to land on?
 
Joined
Jan 10, 2014
Messages
161 (0.04/day)
System Name First Gaming PC
Processor AMD APU Kaveri A10-7850k
Motherboard MSI A88XM-E45
Cooling Stock Cooler
Memory Kingston HyperX 8 GB 1866MHz
Video Card(s) Intergrated with CPU
Storage Kingston Hyperx 3k 120 GB(OS) + 1 TB WD Blue
Display(s) LG 20EN33V 1920 x 1080
Case Infinity Rave
Audio Device(s) Intergrated Sound Card
Power Supply Enermax NAXN 500w
Software Windows 8.1 64-bit
So it's like AMD HSA, but you still need a CPU + dedicated GPU?
 
Joined
Oct 2, 2004
Messages
13,791 (1.87/day)
Unfortunately I fell asleep during the keynote. It was too much scientific computing and very little for gaming. And if I'm honest, both graphics demos were rather boring. That Unreal Engine fight scene was nothing to talk about, and that whale water simulation was OK and shows the muscle, but I wasn't actually impressed by it. There are so many better and more impressive ways to showcase fluid dynamics than with a transparent whale...
 