
NVIDIA GeForce GTX 1080 SLI

Joined
Jun 10, 2014
Messages
2,958 (0.79/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Right, all the things I'm "dreaming about" assume some hypothetical new architecture ... in this case specifically, a data sync for the next frame that doesn't disturb the rendering of the current frame - a sync that happens independently while the current frame is being rendered, to hide at least part of that latency ... possibly through NVLink
The thing that bothers me about multi-GPU is not only that we actually need it, but also that it's something that should be solvable. As mentioned, multi-GPU scaling is up to the games, but all the stuttering problems are up to the drivers/hardware. NVLink would be nice; a big NVLink SLI bridge would be awesome. Still, we would need more speed than both 1st-generation NVLink and PCIe 3.0 provide in order to push the transfer time of the final frame down to <0.1 ms, so that won't happen any time soon.
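To put rough numbers on that, here is a back-of-the-envelope sketch. The link speeds are nominal peak per-direction figures (assumptions, not measurements; the NVLink number is the 4-link P100 aggregate):

```python
# Back-of-the-envelope check on the <0.1 ms claim, using nominal peak
# per-direction bandwidths (assumed figures, not measurements).
def transfer_ms(frame_bytes, gb_per_s):
    """Time to move one frame over a link, in milliseconds."""
    return frame_bytes / (gb_per_s * 1e9) * 1e3

frame_1440p = 2560 * 1440 * 4   # 32-bit colour, ~14.7 MB
frame_2160p = 3840 * 2160 * 4   # 32-bit colour, ~33.2 MB

for name, bw in [("PCIe 3.0 x16", 15.75), ("NVLink 1.0, 4 links", 80.0)]:
    print(f"{name}: 1440p {transfer_ms(frame_1440p, bw):.2f} ms, "
          f"2160p {transfer_ms(frame_2160p, bw):.2f} ms")
# PCIe 3.0 x16: ~0.94 ms / ~2.11 ms -- an order of magnitude short of 0.1 ms,
# and even the aggregate NVLink figure only gets to ~0.18 ms at 1440p.
```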

I do have a "solution" that would work without any hardware changes, though: create a new "AFR mode" which uses the primary GPU only for display, with all the rendering done by "slaves". At least then the latency would be constant, although there would still be some latency.

Back to my previous point that we need multi-GPU. Over the last ~3 years the gaming market has shifted from gamers targeting 1080p/60 Hz to 1440p/120 Hz, and soon 2160p/144+ Hz. Even with the great improvements of Kepler, Maxwell and Pascal, gamers are not able to get the performance they want. We are not going to reach the point where a single GPU is "fast enough" within the next two GPU generations (Volta/post-Volta), so there is a place for multi-GPU. The problem with multi-GPU is essentially a chicken-and-egg problem, but I'm pretty sure that if multi-GPU were "stutter free" and scaled well in most top games, a lot of people would buy it. At this point it's more or less a "benchmarking feature"; people who want smooth gaming stay away from it. (Multi-GPU works fine for professional graphics, though, where microstutter doesn't matter...)

I don't know about that; going to 4K seems to need less AA and more temporal AA to reduce shimmering with camera movement
You are right that shimmering/flicker is a problem with camera movement, especially when there are a lot of objects moving only slightly (like grass waving in the wind). TXAA combines MSAA with temporal filtering over previous pixels, which essentially blurs the picture; that kind of defeats the purpose of higher resolution in the first place, since it removes the sharpness and details. The reason for adding this technique is that proper AA is too expensive at higher resolutions. The advantage of SSAA (supersampling) and MSAA is that they reduce aliasing while retaining sharpness. The problem with all types of post-process (or semi-post-process) AA techniques (incl. TXAA) is that they work on the rendered image, and can essentially just run different filters over existing data. Proper AA, on the other hand, samples the data at a higher resolution and then averages it out, essentially just rendering at a higher resolution. SSAA is the best and most demanding: even 2x SSAA (per axis) almost quadruples the load, essentially rendering a 1080p image in 4K and scaling down the result. You might understand why this gets expensive at 1440p, 2160p and so on...
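For illustration, a minimal numpy sketch (names are my own) of the downsample step; the point is that the supersampled buffer upstream holds factor² times the pixels of the output:

```python
import numpy as np

def ssaa_downsample(hi_res, factor=2):
    """Box-filter a supersampled image down by `factor` per axis.
    2x per axis means factor**2 = 4x the shading work upstream."""
    h, w, c = hi_res.shape
    return hi_res.reshape(h // factor, factor,
                          w // factor, factor, c).mean(axis=(1, 3))

# The render target for "2x SSAA" at 1080p is a full 4K-sized buffer:
supersampled = np.random.rand(2160, 3840, 3)  # stand-in for the rendered frame
final = ssaa_downsample(supersampled)         # -> shape (1080, 1920, 3)
print(final.shape, "from", supersampled.shape)
```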

Oh yes, to successfully cover all scenarios you'll need to split the frame buffer into tiles (similar to what DICE did in the Frostbite engine for the Cell implementation on PS3) and distribute tile sets of similar total complexity to different GPUs ... which would increase CPU overhead, but I think it's worth investigating for engine makers ...
I'm quite serious; I'm very familiar with landscape rendering. Most games render the terrain from a first-person perspective, which makes some regions quick to render and some slow. If you split vertically down the middle, you'll end up in a situation where both GPUs are never well saturated, and continuously adjusting the split is not a good solution either. This is the reason why split frame rendering has been abandoned for games. Why use it when AFR scales so much better?

maybe even use the integrated GPU (or one of the multiple GPUs, asymmetrically, using simultaneous multi-projection) in a preprocessing step only, to help speed up dividing jobs for multiple GPUs based on geometry and shading complexity (on a screen-tile level, not on the pixel level).
Agreed
I'm just going to address this quickly. Offloading some simple stuff to an integrated GPU is of course possible, but the latency would not make it worth the effort. The integrated GPU will only deliver something like 2-5% of the performance of a powerful one anyway, and you have to remember that every MB of transfer between them is expensive. If you want to offload some preprocessing, physics, etc., then the result needs to be as compact as possible. If you need to transfer something like 250 MB each frame, then you'll get something like 20 FPS and huge latency problems ;)
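As a back-of-the-envelope illustration (assuming PCIe 3.0 x16 at its nominal ~15.75 GB/s, and counting only the bus transfer):

```python
# Rough ceiling on frame rate when 250 MB has to cross the bus every frame.
# Assumes PCIe 3.0 x16 at a nominal ~15.75 GB/s and ignores every other cost.
payload_gb = 250 / 1024
bus_gb_per_s = 15.75
transfer_s = payload_gb / bus_gb_per_s  # ~15.5 ms per frame
print(f"transfer alone: {transfer_s * 1e3:.1f} ms "
      f"-> at most {1 / transfer_s:.0f} FPS")
# ~64 FPS is the theoretical best case; add latency, contention and the
# iGPU's own work and you land near the ~20 FPS figure quoted above.
```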
 
Joined
Nov 4, 2005
Messages
11,915 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Joined
Sep 9, 2013
Messages
530 (0.13/day)
System Name Can I run it
Processor delidded i9-10900KF @ 5.1Ghz SVID best case scenario +LLC5+Supercool direct die waterblock
Motherboard ASUS Maximus XII Apex 2801 BIOS
Cooling Main = GTS 360 GTX 240, EK PE 360,XSPC EX 360,2x EK-XRES 100 Revo D5 PWM, 12x T30, AC High Flow Next
Memory 2x16GB TridentZ 3600@4600 16-16-16-36@1.61V+EK Monarch, Separate loop with GTS 120&Freezemod DDC
Video Card(s) Gigabyte RTX 3080 Ti Gaming OC @ 0.762V 1785Mhz core 20.8Gbps mem + Barrow full cover waterblock
Storage Transcend PCIE 220S 1TB (main), WD Blue 3D NAND 250GB for OC testing, Seagate Barracuda 4TB
Display(s) Samsung Odyssey OLED G9 49" 5120x1440 240Hz calibrated by X-Rite i1 Display Pro Plus
Case Thermaltake View 71
Audio Device(s) Q Acoustics M20 HD speakers with Q Acoustics QB12 subwoofer
Power Supply Silverstone ST-1200 PTS 1200W 80+ Platinum
Mouse Logitech G Pro Wireless
Keyboard Logitech G913 (GL Linear)
Software Windows 11
That game is awesome to play.......


But I prefer games with more play time and interaction than sitting and stroking my penis and looking at numbers. :D

I just showed that in a synthetic benchmark it has some scaling. No need to troll.
 
Joined
Feb 8, 2012
Messages
3,013 (0.65/day)
Location
Zagreb, Croatia
System Name Windows 10 64-bit Core i7 6700
Processor Intel Core i7 6700
Motherboard Asus Z170M-PLUS
Cooling Corsair AIO
Memory 2 x 8 GB Kingston DDR4 2666
Video Card(s) Gigabyte NVIDIA GeForce GTX 1060 6GB
Storage Western Digital Caviar Blue 1 TB, Seagate Baracuda 1 TB
Display(s) Dell P2414H
Case Corsair Carbide Air 540
Audio Device(s) Realtek HD Audio
Power Supply Corsair TX v2 650W
Mouse Steelseries Sensei
Keyboard CM Storm Quickfire Pro, Cherry MX Reds
Software MS Windows 10 Pro 64-bit
I'm quite serious, I'm very familiar with landscape rendering.
So far there is nothing to prove otherwise, so no worries.
Consequently, you should quite seriously read how DICE cleverly divided the deferred rendering jobs across all the SPUs in the Cell processor in the old PS3 version of the Frostbite engine: http://www.slideshare.net/DICEStudio/spubased-deferred-shading-in-battlefield-3-for-playstation-3
They did it for different reasons ... to help the weak GPU with the lighting pass (multiple lights + global illumination light probes).
I'm suggesting research following the same principle, not using the exact same solution.
There is no need to get stuck on either a horizontal or a vertical split of the frame into two pieces when the engine workload for a single frame is already split into passes, and when a tile-based approach may offer better granularity for balanced job division. For example:
[Image: tiles.jpg - screen tiles for the lighting pass, rated by calculation complexity]

These are the tiles for the lighting pass, rated by calculation complexity. Every pass could work similarly: essentially doing split frame rendering by splitting the workload rather than the frame ;).
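As an illustrative sketch of what that balanced division could look like (all names hypothetical, not from any engine): a greedy longest-first partition that hands each complexity-rated tile to the least-loaded GPU:

```python
import heapq

def assign_tiles(tile_costs, num_gpus=2):
    """Greedy longest-processing-time partition: give each tile (rated by
    estimated complexity, as in the lighting-pass image above) to the GPU
    with the least accumulated work so far."""
    heap = [(0.0, gpu) for gpu in range(num_gpus)]  # (load, gpu id)
    heapq.heapify(heap)
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for tile, cost in sorted(tile_costs.items(), key=lambda kv: -kv[1]):
        load, gpu = heapq.heappop(heap)
        assignment[gpu].append(tile)
        heapq.heappush(heap, (load + cost, gpu))
    return assignment

# Hypothetical per-tile complexity ratings for one lighting pass:
costs = {(x, y): ((x * 7 + y * 13) % 10) + 1 for x in range(8) for y in range(4)}
print(assign_tiles(costs))
```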
 
Joined
May 11, 2016
Messages
55 (0.02/day)
System Name Custom
Processor AMD Ryzen 5800X 3D
Motherboard Asus Prime
Cooling Li Lian All in One RGB
Memory 64 GB DDR4
Video Card(s) Gigabyte RTX 4090
Storage Samsung 980 2tb
Display(s) Samsung 8K 85inch
Case Cooler Master Cosmos
Audio Device(s) Samsung Dolby Atmos 7.1 Surround Sound
Power Supply EVGA 120-G2-1000-XR 80 PLUS GOLD 1000 W Power Supply
If the new SLI bridges eliminated microstutter, then I wouldn't care if there was no improvement in frame rate. But reading other forums, SLI users are still dealing with the same level of microstutter as before.
 
Joined
Jun 4, 2004
Messages
480 (0.06/day)
System Name Blackbird
Processor AMD Threadripper 3960X 24-core
Motherboard Gigabyte TRX40 Aorus Master
Cooling Full custom-loop water cooling, mostly Aqua Computer and EKWB stuff!
Memory 4x 16GB G.Skill Trident-Z RGB @3733-CL14
Video Card(s) Nvidia RTX 3090 FE
Storage Samsung 950PRO 512GB, Crusial P5 2TB, Samsung 850PRO 1TB
Display(s) LG 38GN950-B 38" IPS TFT, Dell U3011 30" IPS TFT
Case CaseLabs TH10A
Audio Device(s) Edifier S1000DB
Power Supply ASUS ROG Thor 1200W (SeaSonic)
Mouse Logitech MX Master
Keyboard SteelSeries Apex M800
Software MS Windows 10 Pro for Workstation
Benchmark Scores A lot.
To end all the discussion about when those bridges are needed and when they are not, Nvidia should expose the connection info for those bridges through the driver, as they do with all the other stats that can be tracked in real time, like core/memory clock speed or GPU utilisation. I'm thinking of clock speed, bandwidth and utilisation of the SLI bridge, and maybe even a driver setting to change the frequency of the SLI connectors manually.
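As far as I know there is no public counter for the SLI bridge itself; the sketch below (assuming the pynvml package is installed) just polls the per-GPU stats NVML does expose, as a template for what a bridge counter could look like if Nvidia ever surfaced one:

```python
# Poll the per-GPU stats NVML already exposes (clocks, utilisation).
# A bridge clock/bandwidth counter, if it existed, could slot in here.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        h = pynvml.nvmlDeviceGetHandleByIndex(i)
        clock = pynvml.nvmlDeviceGetClockInfo(h, pynvml.NVML_CLOCK_GRAPHICS)
        util = pynvml.nvmlDeviceGetUtilizationRates(h)
        print(f"GPU {i}: core {clock} MHz, util {util.gpu}%")
finally:
    pynvml.nvmlShutdown()
```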
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,159 (2.84/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
So, even if you solve the bandwidth problem of sending frame data to the GPU actually displaying the frame, you'll still have this weird latency problem where the frame coming from the local GPU will always be ready before the one from the secondary GPU. The funny thing is (when it works), running a third GPU can sometimes reduce the amount of micro-stutter, probably because the two non-primary GPUs have the same amount of latency to communicate with the primary. So consider this: getting a frame from one GPU to another is measured on the scale of milliseconds, while the transfer of a frame on the local GPU is literally measured in nanoseconds. The primary will always get its frame to the active frame buffer for the display before a secondary or tertiary GPU will. This latency alone is the source of most of the woes with micro-stutter. The only way to solve that problem is to normalize the rate at which frames are produced and received.

Micro-stutter gets worse when you start pushing a GPU to its limits, because the GPU can't get the frame over to the primary as quickly as when it's running at 50% and delivering the frame earlier than needed. So when you're running at well over 60 FPS, using something like v-sync makes it possible to buffer frames and essentially smooth out the latency variation in the frame times.

With that said, a device that didn't do the rendering but handled only the display output and buffering for a monitor (or monitors) would be the solution. Equal latency between the devices that produce the frames and the thing that consumes them is what will give us the best experience; but as long as one GPU is responsible for outputting the frame, they're going to have real fun trying to get the two sources of frame data to maintain a consistent latency between rendered frames to be displayed.
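To illustrate the "normalize the rate" idea, a toy frame-pacing loop (purely a sketch; `present` is a stand-in for the real swap/flip call) that holds early frames to a fixed cadence so uneven arrival times don't become uneven on-screen frame times:

```python
import time

def present(frame):
    """Stand-in for the real swap/flip call."""
    print(f"present {frame} at {time.perf_counter():.4f}")

def paced_present(frames, target_interval=1 / 60):
    """Hold each finished frame until the next tick of a fixed cadence, so
    uneven arrival times (local vs. over-the-bridge frames) don't translate
    into uneven on-screen frame times."""
    deadline = time.perf_counter()
    for frame in frames:
        deadline += target_interval
        delay = deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)  # frame arrived early: hold it
        # a late frame is presented immediately -- that is the stutter case
        present(frame)

paced_present(range(5))
```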

tl;dr: The less you tax a multi-GPU setup, the less likely it is to cause micro-stutter issues, because the latency between frames becomes more consistent.
 
Joined
Nov 4, 2005
Messages
11,915 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
I just showed that in a synthetic benchmark it has some scaling. No need to troll.


It was more of a jab at the fact that the only reasons to have/use it are e-peen and benchmarking, so it's essentially useless for 99.999999999999999999999999999% of users.
 
Joined
May 11, 2016
Messages
55 (0.02/day)
System Name Custom
Processor AMD Ryzen 5800X 3D
Motherboard Asus Prime
Cooling Li Lian All in One RGB
Memory 64 GB DDR4
Video Card(s) Gigabyte RTX 4090
Storage Samsung 980 2tb
Display(s) Samsung 8K 85inch
Case Cooler Master Cosmos
Audio Device(s) Samsung Dolby Atmos 7.1 Surround Sound
Power Supply EVGA 120-G2-1000-XR 80 PLUS GOLD 1000 W Power Supply
Okay, I drove to Newegg yesterday and picked up my two GTX 1070s and an MSI HB SLI bridge, just in case there was some future benefit. Every game that I tried (Witcher 3, Project Cars, GTA 5, Mortal Kombat X, The Evil Within, Mad Max, Fallout 4) ran ABOVE 60 fps in 4K at max/ultra settings. The exceptions were Crysis 3, which hovered around 53 fps, and GTA 4, which averaged 53 fps in 4K.

The 1070s are Gigabyte G1 Gaming cards, and the overclocking software I used to push these in SLI is EVGA Precision X. To overclock, I simply increased the power target to the max of 111%, raised the core clock in increments of +25 to land at +75 (+100 caused the Heaven Valley benchmark to crash), set a max temp of 81 degrees, and I haven't touched the memory clock yet. This landed me at a stable clock speed of 2078 MHz with no fluctuation during gameplay, at a max temp of 72 degrees. I'm still testing since I've only had these cards for a couple of hours, but it's a huge improvement over my two 970s, and it so happens that the games I play have SLI profiles.

I didn't notice any microstutter, but because I was testing with v-sync off in all the games, there was slight screen tearing when panning the camera too quickly in a few of them. It happened far less than with 970 SLI and was less noticeable in comparison, due to the frame rate being above 60 compared to the previous 35 fps average of the 970 SLI. I've noticed across my different builds over the last 2 years that gaming in 4K introduces a higher probability of screen tearing, which is more pronounced the lower the frame rate and is solved by v-sync at the cost of performance. 60 fps drastically lowers the visibility and occurrence of screen tearing without the need for v-sync.

I'll keep testing and post any updates.
 
Joined
Jun 27, 2016
Messages
6 (0.00/day)
Long time reader, first time member here.

Wanted to share some very interesting findings in this vid by Hardware Unboxed:


Big gains can apparently be had from the HB bridge vs. the old flex bridge depending on the title (Fallout 4, Black Ops 3, Witcher 3). The thing is, the HB results could be replicated by putting 2 flex bridges together, so the gain is not from the HB bridge itself but from linking both SLI connectors. Can TPU confirm their findings by running those games under similar conditions? I noted the review had BO3, but only at 1600x900.
 
Joined
Oct 22, 2014
Messages
13,888 (3.82/day)
Location
Sunshine Coast
System Name Lenovo ThinkCentre
Processor AMD 5650GE
Motherboard Lenovo
Memory 32 GB DDR4
Display(s) AOC 24" Freesync 1m.s. 75Hz
Mouse Lenovo
Keyboard Lenovo
Software W11 Pro 64 bit
Big gains can apparently be had from the HB bridge vs. the old flex bridge depending on the title (Fallout 4, Black Ops 3, Witcher 3). The thing is, the HB results could be replicated by putting 2 flex bridges together, so the gain is not from the HB bridge itself but from linking both SLI connectors.
It's already been stated that flexible bridges do not have the same gains, even if two are used.
 
Joined
Oct 22, 2014
Messages
13,888 (3.82/day)
Location
Sunshine Coast
System Name Lenovo ThinkCentre
Processor AMD 5650GE
Motherboard Lenovo
Memory 32 GB DDR4
Display(s) AOC 24" Freesync 1m.s. 75Hz
Mouse Lenovo
Keyboard Lenovo
Software W11 Pro 64 bit
Stated where?

It seemed to be game-specific, so I wanted to get verification on this.
Pretty sure W1zz confirmed it in one of the threads discussing the 1080s.
 
Joined
Dec 18, 2005
Messages
8,253 (1.20/day)
System Name money pit..
Processor Intel 9900K 4.8 at 1.152 core voltage minus 0.120 offset
Motherboard Asus rog Strix Z370-F Gaming
Cooling Dark Rock TF air cooler.. Stock vga air coolers with case side fans to help cooling..
Memory 32 gb corsair vengeance 3200
Video Card(s) Palit Gaming Pro OC 2080TI
Storage 150 nvme boot drive partition.. 1T Sandisk sata.. 1T Transend sata.. 1T 970 evo nvme m 2..
Display(s) 27" Asus PG279Q ROG Swift 165Hrz Nvidia G-Sync, IPS.. 2560x1440..
Case Gigabyte mid-tower.. cheap and nothing special..
Audio Device(s) onboard sounds with stereo amp..
Power Supply EVGA 850 watt..
Mouse Logitech G700s
Keyboard Logitech K270
Software Win 10 pro..
Benchmark Scores Firestike 29500.. timepsy 14000..
It's already been stated that flexible bridges do not have the same gains, even if two are used.


i tried using two flexible bridges.. more for cosmetic reasons than anything else.. two look better than one.. he he

sadly it didn't work.. it produced very pronounced interference lines across the image.. i'm guessing some kind of cross-chatter between the two cables was taking place.. i had to revert back to just the one..

it's a shame about the lack of game support.. having said that, i have two of the games listed as not using the second card.. just cause and primal.. i didn't know they lacked sli support until i read about it.. they played just fine.. mind you i don't run 4K..

trog
 
Joined
Mar 18, 2017
Messages
1 (0.00/day)
I am wondering if I need to invest in a high-bandwidth SLI bridge. I ran the Unigine Heaven 4.0 benchmark on my 1080 Ti SLI setup, and the results seem to leave something to be desired, even though they are pretty awesome.
Can you tell me what you think? You can see the video here.

 
Joined
Oct 22, 2014
Messages
13,888 (3.82/day)
Location
Sunshine Coast
System Name Lenovo ThinkCentre
Processor AMD 5650GE
Motherboard Lenovo
Memory 32 GB DDR4
Display(s) AOC 24" Freesync 1m.s. 75Hz
Mouse Lenovo
Keyboard Lenovo
Software W11 Pro 64 bit
I am wondering if I need to invest in a high-bandwidth SLI bridge. I ran the Unigine Heaven 4.0 benchmark on my 1080 Ti SLI setup, and the results seem to leave something to be desired, even though they are pretty awesome.
Can you tell me what you think? You can see the video here.
You may get a slight increase, but you're better off waiting for drivers to be optimised first, as the 1080 Ti is still in its infancy as a card, especially in SLI, where you might only see a small increase in performance.
 
Joined
Apr 6, 2017
Messages
8 (0.00/day)
Location
Bakersfield, CA
System Name Protoss V2
Processor Intel i7-7700K @ 4.9 GHz
Motherboard Asus Maximus IX Hero Z270
Cooling NH-D15
Memory Corsair Vengeance LPX 16GB 3 GHz
Video Card(s) Nvidia GTX 1080FE SLI @ 2.1 GHz
Storage Samsung 840 Evo 500 GB; WD Black 1 TB x2 RAID 0
Display(s) Asus ROG Swift PG278Q
Case Corsair Air 540 White
Audio Device(s) N/A
Power Supply EVGA SuperNOVA 850G2
Mouse Corsair M65 RGB
Keyboard Corsair K95 RGB
Software Windows 10 Pro
I have no complaints with my GTX 1080 FE SLI setup. I've got a 1440p monitor, and playing games at a higher refresh rate with higher resolution? It just looks amazing. Also, I think the sweet spot for gaming right now is still 1440p rather than 4K.

Also, I can't wait for explicit mGPU support to start rolling out in a lot of DX12 games. I hear it'll utilize your GPUs better than a traditional SLI/CrossFire setup. It would make sense, with explicit mGPU not needing the drivers to utilize both GPUs.
 
Joined
Dec 18, 2005
Messages
8,253 (1.20/day)
System Name money pit..
Processor Intel 9900K 4.8 at 1.152 core voltage minus 0.120 offset
Motherboard Asus rog Strix Z370-F Gaming
Cooling Dark Rock TF air cooler.. Stock vga air coolers with case side fans to help cooling..
Memory 32 gb corsair vengeance 3200
Video Card(s) Palit Gaming Pro OC 2080TI
Storage 150 nvme boot drive partition.. 1T Sandisk sata.. 1T Transend sata.. 1T 970 evo nvme m 2..
Display(s) 27" Asus PG279Q ROG Swift 165Hrz Nvidia G-Sync, IPS.. 2560x1440..
Case Gigabyte mid-tower.. cheap and nothing special..
Audio Device(s) onboard sounds with stereo amp..
Power Supply EVGA 850 watt..
Mouse Logitech G700s
Keyboard Logitech K270
Software Win 10 pro..
Benchmark Scores Firestike 29500.. timepsy 14000..
I am wondering if I need to invest in a high-bandwidth SLI bridge. I ran the Unigine Heaven 4.0 benchmark on my 1080 Ti SLI setup, and the results seem to leave something to be desired, even though they are pretty awesome.
Can you tell me what you think? You can see the video here.


turn one of the cards off and see what happens.. quite what part of the video you think looks awesome has me a bit baffled.. it all looks like a jerky load of crap to me..

trog
 