System Name | Bro2 |
---|---|
Processor | Ryzen 5800X |
Motherboard | Gigabyte X570 Aorus Elite |
Cooling | Corsair h115i pro rgb |
Memory | 32GB G.Skill Flare X 3200 CL14 @ 3800MHz CL16 |
Video Card(s) | PowerColor 6900 XT Red Devil 1.1V @ 2400MHz |
Storage | M.2 Samsung 970 Evo Plus 500GB / Samsung 860 Evo 1TB |
Display(s) | LG 27UD69 UHD / LG 27GN950 |
Case | Fractal Design G |
Audio Device(s) | Realtek 5.1 |
Power Supply | Seasonic 750W GOLD |
Mouse | Logitech G402 |
Keyboard | Logitech slim |
Software | Windows 10 64 bit |
I'm not talking about the degree to which this is a problem, but rather that it's already here, and it will obviously become more noticeable in the future. These cards are RT capable, and if you pay cash for something, you don't want to be constrained. Saying a card is a 1440p card doesn't mean you must play at that resolution, yet here you can't go up in resolution and turn down the details to play comfortably. Some people might see that as a huge disadvantage. Infinite VRAM? I'm talking about the bare minimum to play a game, which these GPUs (3070 and 3070 Ti) apparently would have been capable of in the games I mentioned. I really don't understand what you're trying to prove: that the RT implementation sucks because RT is more demanding in some games? So is rasterization in those games, obviously, and since RT hits performance when enabled, that's the natural consequence. Listen, I asked about the RAM because I was curious. Nonetheless, these cards could have played those games at 4K with RT, but already can't. Not tomorrow, but now. So let's just leave it at that. If you're OK with it, that's perfectly fine.

> As for this, we'll have to disagree on that. While this "problem" will no doubt become more noticeable in the future, at the same time these cards' absolute compute performance (whether rasterization or RT) will simultaneously decrease relative to the demands put on them by games, meaning that by the point where this is a dominating issue (rather than an extreme niche case, like today), those GPUs likely wouldn't produce playable framerates even if they had infinite VRAM. Remember, Doom Eternal is just about the easiest-to-run AAA shooter out there in terms of its compute requirements (and it can likely run more than fine at 2160p RT on a 3070 if you lower the texture quality or some other memory-heavy setting to the second highest option). And it's not like these two games are even remotely representative of RT loads today - heck, nothing is, given that performance for the 3090 Ti at 2160p varies from ~137 fps to ~24 fps. The span is too wide. So using these two edge cases as a predictor for the future is nit-picking and statistically insignificant. So again, calling the cards "handicapped" here is ... well, you're picking out an extreme edge case and using it in a way that I think is overblown. You can't expect universal 2160p60 RT from any GPU today, so why would you do so with an upper mid-range/lower high-end GPU? That just doesn't make sense. Every GPU has its limitations, and these ones clearly have theirs most specifically in memory-intensive RT at 2160p - the most extreme use case possible. That is a really small limitation. Calling that a "handicap" is making a mountain out of a molehill.
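(As an aside on the VRAM point above: a quick way to check whether a game is actually filling an 8 GB card is to log memory use while playing. This is only a minimal sketch, assuming an Nvidia GPU with `nvidia-smi` on the PATH; allocated VRAM isn't the same as required VRAM, so treat the numbers as a hint rather than proof.)

```python
# Rough sketch: log VRAM use once per second while a game runs, to see
# whether an 8 GB card (e.g. a 3070/3070 Ti) is actually near its limit.
# Assumes an Nvidia GPU and nvidia-smi on PATH; only the first GPU is read.
import subprocess
import time

QUERY = ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]

def log_vram(interval_s: float = 1.0) -> None:
    while True:
        out = subprocess.check_output(QUERY, text=True)
        first_gpu = out.strip().splitlines()[0]
        used_mib, total_mib = (int(x) for x in first_gpu.split(","))
        print(f"VRAM: {used_mib} / {total_mib} MiB "
              f"({100 * used_mib / total_mib:.0f}%)")
        time.sleep(interval_s)

if __name__ == "__main__":
    log_vram()
```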
Ok, "simplistic": you ask a question and then answer it yourself. These are two different architectures; if you want to compare them just by the results, you can. I didn't miss anything. I don't see how you can compare two different architectures and say one uses more or less power than the other at the same or similar performance - different nodes, different architectures. Obviously that's the case, so the explanation for the difference lies there: different node and different architecture. There's no point in dwelling on it. Obviously they are different, and the difference will show up in the results. I focused on the results themselves and on performance/consumption. That is your answer: node difference and architecture difference, since these are completely different products that just share the same goal.

> That is a way too simplistic solution to this conundrum. As a 6900 XT owner using it on a 1440p60 display, I know just how low that GPU will clock and how efficiently it will run when it doesn't need the power (the 75 W figure I gave for Elden Ring isn't too exceptional). I've also run an undervolted, underclocked profile at ~2100 MHz which never exceeded 190 W no matter what I threw at it. The point being: RDNA2 has no problem clocking down and reducing power when needed. And, to remind you, in the game used for power testing here, the 6900 XT matches the performance of the 3080 Ti and 3090 at 1440p while consuming less power, despite its higher clocks, even at peak. And, of course, all of these GPUs will reduce their clocks roughly equally given an equal reduction in the workload. Yet what we're seemingly seeing here is a dramatic difference in said reductions, to the tune of a massive reversal of power efficiency.
>
> So, while you're right that power consumption and performance scaling are not linear, and that a wide-and-slow GPU will generally be more efficient than a fast-and-narrow one, your application of these principles here ignores a massive variable: architectural and node differences. We know that RDNA2 on TSMC 7nm is more efficient than Ampere on Samsung 8nm, even at ~500 MHz higher clocks. This is true pretty much across the AMD and Nvidia product stacks, though with some fluctuations. And it's not like the 3090 Ti is meaningfully wider than a 3090 (the increase in compute resources is tiny), and by extension not a 6900 XT either. You could argue that the 3080 Ti and 3090 are wider than the 6900 XT, and they certainly clock lower - but that runs counter to your argument, as they then ought to be more efficient at peak performance, not less. This tells us that AMD simply has the architecture and node advantage to clock higher yet still win out in terms of efficiency. Thus, there doesn't seem to be any reason why these GPUs wouldn't also clock down and reduce their power to similar degrees, despite their differing starting points. Now, performance scaling per frequency for any single GPU or architecture isn't entirely linear either, but it is close to linear within the reasonable operating frequency ranges of most GPUs. Meaning that if two GPUs produce ~X performance, one at 2 GHz and one at 2.5 GHz, the drop in clock speeds needed to reach X/2 performance should be similar - not in MHz, but relative to their starting frequencies. Not the same, but sufficiently similar for the difference not to matter much. And as power and clock speeds follow each other, even if non-linearly, the power drop across the two GPUs should also be similar. Yet here we're seeing one GPU drop drastically more than the other - comparing the 3090 to the 6900 XT, we're talking a 66% drop vs. a 46% drop. That's a rather dramatic difference considering that they started out at the same level of absolute performance.
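(For anyone who wants to sanity-check the scaling argument in the quote above: under a simple DVFS model - dynamic power roughly proportional to frequency times voltage squared, with voltage assumed to scale roughly linearly with frequency inside the normal operating window - equal percentage clock cuts give equal percentage power cuts regardless of where each GPU starts. The sketch below is purely illustrative; the clocks and the cubic model are assumptions, not measurements.)

```python
# Toy DVFS model: dynamic power ~ C * f * V^2, with V assumed to scale
# roughly linearly with f inside the normal operating window, so P ~ f^3.
# Purely illustrative -- the constants are made up, not measured, and
# static power is ignored.

def relative_power(clock_ratio: float) -> float:
    """Fraction of peak power remaining after scaling clocks by clock_ratio."""
    return clock_ratio ** 3  # f * V^2 with V proportional to f

for name, peak_mhz in [("GPU A (wide, slow)", 2000), ("GPU B (narrow, fast)", 2500)]:
    for cut in (0.9, 0.8, 0.7):          # 10/20/30 % clock reductions
        drop = 1 - relative_power(cut)
        print(f"{name}: {peak_mhz * cut:.0f} MHz -> power down ~{drop:.0%}")

# Under this model both GPUs shed the same *fraction* of power for the same
# *relative* clock cut, which is why the measured 66% vs. 46% drops quoted
# above look odd without some other explanation.
```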
I remember HWUB talking about the Nvidia driver overhead, or something along those lines: the CPU gets utilized more at lower resolutions than with AMD's counterpart, and thus the lower results, since CPU resources are being taken. I don't know whether Nvidia has fixed that issue or not; I just know there was an instance of it brought up by HWUB. Bump the resolution up and you have no, or far less, driver overhead. Also, if they are limited at low resolution (they obviously are), just as the AMD counterparts are, maybe it's down to the game itself. There was also a mention of the Ampere architecture - mainly its FP32 configuration, which some have pointed out affects low-resolution, high-FPS performance. If I remember correctly.

> One possible explanation: the Ampere cards are actually really CPU limited at 1080p in CP2077, and would dramatically outperform the 6900 XT there if not held back. This would require the same to not be true at 1440p, as the Ampere GPUs run at peak power there, indicating no significant bottleneck elsewhere. Checking this would require power measurements of the Ampere cards at 1080p without Vsync. Another possible explanation is that Nvidia is drastically pushing these cards beyond their efficiency sweet spot in a way AMD isn't - but given the massive clock speeds of RDNA2, that is also unlikely; both architectures seem to be pushed roughly equally (outside of the 3090 Ti, which is ridiculous in this regard). It could also just be some weird architectural quirk, where Ampere is suddenly drastically more efficient below a certain, quite low clock threshold (significantly lower than any of its GPUs clock in regular use). That would require power testing at ever-decreasing clocks to check.
>
> Either way, these measurements are sufficiently weird to have me curious.
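(On the "power testing at ever-decreasing clocks" idea in the quote: here is a minimal sketch of how such a sweep could be scripted. The clock steps and sample counts are arbitrary placeholders; it assumes an Nvidia card recent enough to support nvidia-smi's --lock-gpu-clocks, admin/root rights, and a constant workload already running in the background.)

```python
# Sketch of a "power at ever-decreasing clocks" sweep: lock the GPU to a
# series of clock caps, let a constant workload run at each step, and sample
# board power. Assumes an Nvidia card that supports --lock-gpu-clocks,
# admin/root rights, and nvidia-smi on PATH. Start your fixed workload
# (benchmark loop, etc.) separately before running this.
import subprocess
import time

CLOCK_STEPS_MHZ = [1900, 1700, 1500, 1300, 1100]   # arbitrary example steps
SAMPLES_PER_STEP = 30                              # ~30 s of 1 Hz samples

def sample_power_w() -> float:
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        text=True)
    return float(out.strip().splitlines()[0])

try:
    for mhz in CLOCK_STEPS_MHZ:
        subprocess.run(["nvidia-smi", "-lgc", f"{mhz},{mhz}"], check=True)
        time.sleep(5)                               # let clocks/power settle
        readings = [sample_power_w() for _ in range(SAMPLES_PER_STEP)
                    if time.sleep(1) is None]
        print(f"{mhz} MHz cap: avg {sum(readings) / len(readings):.1f} W")
finally:
    subprocess.run(["nvidia-smi", "-rgc"])          # always restore defaults
```

The AMD side would need an equivalent sweep with its own tooling for a like-for-like comparison.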
I was curious about the RT performance and the RAM insufficiency, hence my question to Wiz. Any conclusions I leave to you, but I did share mine, which you don't need to agree with.