> The thing that bothers me about multi-GPU is not only that we actually need it, but also that it's something that should be solvable. As mentioned, multi-GPU scaling is up to the games, but all the stuttering problems are up to the drivers/hardware. NVLink would be nice; a big NVLink SLI bridge would be awesome. Still, we would need more speed than both 1st-generation NVLink and PCIe 3.0 provide in order to push the transfer time of the final frame down to <0.1 ms, so that won't happen any time soon.

Right, all the things I'm "dreaming about" assume some hypothetical new architecture ... in this case specifically, a data sync for the next frame that doesn't disturb the rendering of the current frame - a sync that happens independently while the current frame is being rendered, to hide at least part of that latency ... possibly through NVLink.
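To put rough numbers on the transfer-time point above, here is a back-of-the-envelope sketch. The bandwidth figures are nominal per-direction peak rates (assumptions, not measurements), and real-world throughput is lower:

```python
# Back-of-the-envelope: how long does it take to move one finished frame
# between GPUs?  Bandwidths below are nominal per-direction peak figures.
FRAME_BYTES = 3840 * 2160 * 4            # one 4K RGBA8 frame ~ 31.6 MB

LINKS = {
    "PCIe 3.0 x16":        15.75e9,      # ~15.75 GB/s usable
    "NVLink 1.0 (1 link)": 20e9,         # ~20 GB/s
}

for name, bandwidth in LINKS.items():
    print(f"{name}: {FRAME_BYTES / bandwidth * 1e3:.2f} ms per frame copy")

# Bandwidth needed to copy one frame in under 0.1 ms:
print(f"<0.1 ms target needs ~{FRAME_BYTES / 0.1e-3 / 1e9:.0f} GB/s")
```

Both links come out at roughly 1.5-2 ms per copy, which is about an order of magnitude short of the 0.1 ms target mentioned above.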
> I do have a "solution" that would work without any hardware changes, though: create a new "AFR mode" which uses the primary GPU only for display, with all the rendering done by "slaves". At least then the latency will be constant, but there will still be some latency.
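A hypothetical timing comparison of that constant-latency idea, using made-up numbers (7 ms render, 2 ms inter-GPU copy, two GPUs producing frames in both configurations) purely to show why frame pacing evens out when every frame takes the same path:

```python
# Hypothetical numbers for illustration only: staggered AFR frame issue,
# 7 ms to render a frame, 2 ms to copy a finished frame to the display GPU.
RENDER_MS, COPY_MS, FRAMES = 7.0, 2.0, 8

def finish_times(copy_cost):
    """Frame completion times; copy_cost(i) is the transfer frame i pays before display."""
    return [i * RENDER_MS / 2 + RENDER_MS + copy_cost(i) for i in range(FRAMES)]

def deltas(times):
    return [round(b - a, 2) for a, b in zip(times, times[1:])]

# Classic AFR: the primary renders even frames (no copy), the slave's odd
# frames must be copied over -> alternating frame-to-frame gaps (microstutter).
classic = finish_times(lambda i: COPY_MS if i % 2 else 0.0)

# "Display-only primary": every frame is rendered on a slave and copied,
# so every frame pays the same cost -> constant gaps, but added latency.
display_only = finish_times(lambda i: COPY_MS)

print("classic AFR gaps    :", deltas(classic))       # alternating 5.5 / 1.5 ms
print("display-only primary:", deltas(display_only))  # constant 3.5 ms
```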
> Back to my previous point that we need multi-GPU. Over the last ~3 years or so, the gaming market has shifted from gamers targeting 1080p/60 Hz to 1440p/120 Hz, and soon 2160p/144+ Hz. Even with the great improvements of Kepler, Maxwell and Pascal, gamers are not able to get the performance they want. We are not going to reach the point where a single GPU is "fast enough" in the next two GPU generations (Volta/post-Volta), so there is a place for multi-GPU. The problem with multi-GPU is essentially a chicken-and-egg problem, but I'm pretty sure that if multi-GPU were "stutter free" and scaled well in most top games, a lot of people would buy it. At this point it's more or less a "benchmarking feature"; people who want smooth gaming stay away from it. (Multi-GPU works fine, though, for professional graphics where microstutter doesn't matter...)
> You are right that shimmering/flicker is a problem for camera movement, especially when there are a lot of objects moving only slightly (like grass waving in the wind). TXAA combines MSAA with temporal filtering of previous frames' pixels, which essentially blurs the picture - and that kind of defeats the purpose of higher resolution in the first place, since it removes sharpness and details. The reason for adding such techniques is that proper AA is too expensive at higher resolutions. The advantage of SSAA (supersampling) and MSAA is that they reduce aliasing while retaining sharpness. The problem with all types of post-process (or semi-post-process) AA techniques (incl. TXAA) is that they work on the rendered image and can essentially just run different filters on existing data. Proper AA, on the other hand, samples the data at a higher resolution and then averages it out, essentially rendering at a higher resolution. SSAA is the best and the most demanding: even 2x SSAA almost quadruples the load, essentially rendering a 1080p image in 4K and scaling down the result. You might understand why this gets expensive at 1440p, 2160p and so on...

I don't know about that; going to 4K seems to need less AA and more temporal AA to reduce shimmering with camera movement.
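A quick cost check on the supersampling point, reading "2x SSAA" as 2 samples per axis and assuming per-frame cost scales roughly with the number of shaded samples (a simplification that ignores geometry and fixed costs):

```python
# Rough sample-count comparison: SSAA with an NxN grid shades N*N samples
# per output pixel, so cost scales ~ with resolution * N^2.
RESOLUTIONS = {"1080p": (1920, 1080), "1440p": (2560, 1440), "2160p": (3840, 2160)}

def shaded_samples(width, height, ssaa_per_axis=1):
    return width * height * ssaa_per_axis ** 2

base = shaded_samples(1920, 1080)        # plain 1080p as the baseline
for name, (w, h) in RESOLUTIONS.items():
    for n in (1, 2):
        label = "no AA" if n == 1 else f"{n}x{n} SSAA"
        ratio = shaded_samples(w, h, n) / base
        print(f"{name} {label:9s}: {ratio:4.1f}x the 1080p load")
```

2x2 SSAA at 2160p lands at roughly 16x the plain-1080p sample count, which is the "gets expensive" point in a nutshell.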
> I'm quite serious, I'm very familiar with landscape rendering. Most games render the terrain from a first-person perspective, which makes some regions quick to render and some slow. If you split vertically down the middle, you end up in a situation where the two GPUs are never both well saturated at the same time, and continuously adjusting the split is not a good solution either. This is the reason why split frame rendering has been abandoned for games. Why use it when AFR scales so much better?

Oh yes, to cover all scenarios successfully you'd need to split the frame buffer into tiles (similar to what DICE did in the Frostbite engine for the Cell implementation on PS3) and distribute tile sets of similar total complexity to different GPUs ... which would increase CPU overhead, but I think it's worth investigating for engine makers ...
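A minimal sketch of that tile-balancing idea - a greedy partition of screen tiles by estimated cost. The tile size and the random per-tile cost model are placeholders; a real engine would estimate cost from geometry density, shader complexity or previous-frame timings:

```python
import heapq
import random

# Greedy balancing of screen tiles across GPUs by estimated cost.
TILE = 128
SCREEN_W, SCREEN_H = 2560, 1440
NUM_GPUS = 2

tiles = [(x, y, random.uniform(0.1, 1.0))            # (tile_x, tile_y, estimated cost)
         for x in range(0, SCREEN_W, TILE)
         for y in range(0, SCREEN_H, TILE)]

# Hand out tiles in descending cost order, always to the least-loaded GPU
# (the classic greedy makespan heuristic).
heap = [(0.0, gpu, []) for gpu in range(NUM_GPUS)]    # (total cost, gpu id, tile list)
heapq.heapify(heap)
for x, y, cost in sorted(tiles, key=lambda t: t[2], reverse=True):
    load, gpu, assigned = heapq.heappop(heap)
    assigned.append((x, y))
    heapq.heappush(heap, (load + cost, gpu, assigned))

for load, gpu, assigned in sorted(heap, key=lambda e: e[1]):
    print(f"GPU {gpu}: {len(assigned)} tiles, estimated cost {load:.1f}")
```

The remaining hard part is the cost estimate itself, which is exactly the CPU overhead the reply above mentions.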
> I'm just going to address this quickly. Offloading some simple stuff to an integrated GPU is of course possible, but the latency would not make it worth the effort. The integrated GPU will only perform at something like 2-5% of a powerful discrete one anyway, and you have to remember that every MB of transfer between them is expensive. If you want to offload some preprocessing, physics, etc., then the result needs to be as compact as possible. If you need to transfer something like 250 MB each frame, then you'll get something like 20 FPS and huge latency problems.

Maybe even use the integrated GPU (or one of the multiple GPUs asymmetrically, using simultaneous multi-projection) in a preprocessing step only, to help speed up dividing jobs between the GPUs based on geometry and shading complexity (on a screen-tile level, not on the pixel level).
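To make the per-frame transfer budget concrete, a rough estimate assuming an effective 8 GB/s path to the helper GPU (an assumption; actual buses and shared-memory iGPUs vary a lot) and no overlap between transfer and rendering:

```python
# How much of a frame budget does moving data to/from a helper GPU eat?
EFFECTIVE_BPS = 8e9                       # assumed effective bandwidth, not measured
FRAME_BUDGET_MS = {60: 16.7, 120: 8.3}

for payload_mb in (10, 50, 250):
    transfer_ms = payload_mb * 1e6 / EFFECTIVE_BPS * 1e3
    line = f"{payload_mb:3d} MB/frame -> {transfer_ms:5.1f} ms just for the copy"
    for fps, budget in FRAME_BUDGET_MS.items():
        line += f", {transfer_ms / budget:4.0%} of a {fps} FPS budget"
    print(line)
```

At 250 MB per frame the copy alone already blows past a 60 FPS budget, which is why the preprocessing result (e.g. per-tile cost estimates) has to be tiny compared to the frame data itself.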
Agreed