Friday, April 22nd 2016

NVIDIA GP104 "Pascal" ASIC Pictured

Here are two of the first pictures of NVIDIA's upcoming "GP104" graphics processor. The chip will drive at least three new GeForce SKUs bound for a June 2016 launch, and succeeds the GM204 silicon that drives the current-generation GTX 980 and GTX 970. Based on the "Pascal" architecture, the GPU will be built on TSMC's latest 16 nm FinFET+ node. The chip appears to feature a 256-bit-wide GDDR5 memory interface, and is rumored to run its memory at 8 Gbps, yielding a bandwidth of 256 GB/s.
Sources: ChipHell, AnandTech Forums

56 Comments on NVIDIA GP104 "Pascal" ASIC Pictured

#26
FordGT90Concept
"I go fast!1!11!1!"
AMD and NVIDIA are both releasing mid-to-upper-tier cards on 14/16 nm. They're saving the top tier for HBM2 next year. :cry: Prices aren't going to change much. You'll get better performance per watt, but that's pretty much it.
Posted on Reply
#27
vega22
Same performance plateau since 2013...
Posted on Reply
#28
FordGT90Concept
"I go fast!1!11!1!"
But in this case, the plateau is intentional and internal (AMD/NVIDIA), not unintentional and external (TSMC). No one cares about HBM that much; they should be offering a GDDR5X monster.
Posted on Reply
#29
efikkan
Personally, I don't care about HBM2 one way or the other. I just want more GPU performance, and I trust Nvidia will find a way to supply enough memory bandwidth.

GP104 will be Nvidia's new mid-range GPU, and will perform roughly on the level of today's high end. So for anyone currently owning a GTX 980 Ti, this wouldn't be much of an upgrade.

I'm looking forward to the new high-end model (probably a new Titan first), which should arrive in late Q3 or Q4. There is a GP102 chip in the works, so this could be it. Even if it turns out to use GDDR5X, I will be satisfied as long as they fill it with plenty of CUDA cores.
Posted on Reply
#30
Ruru
S.T.A.R.S.
efikkanPersonally, I don't care about HBM2 one way or the other. I just want more GPU performance, and I trust Nvidia will find a way to supply enough memory bandwidth.
512-bit GDDR5X would be cool. Feels like 512-bit is enough, so I don't even OC my R9 290's memory at all.
Posted on Reply
#31
efikkan
9700 Pro512-bit GDDR5X would be cool. Feels like 512-bit is enough, so I don't even OC my R9 290's memory at all.
Why would you need 512-bit? With GDDR5X, a 256-bit bus gives up to 384 GB/s and a 384-bit bus up to 576 GB/s, which means a 384-bit bus with GDDR5X should really be more than enough for both Pascal and Vega.
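For reference, the arithmetic behind those figures; a minimal sketch, assuming a 12 Gbps per-pin GDDR5X data rate (the rate implied by the numbers above, though actual SKUs vary):

```python
# Peak memory bandwidth = (bus width in bits / 8 bits per byte) * per-pin
# data rate in Gbps, giving GB/s. 12 Gbps is an assumption matching the
# 384/576 GB/s figures quoted above.

def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak theoretical memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps

for bus in (256, 384, 512):
    print(f"{bus}-bit @ 12 Gbps: {peak_bandwidth_gbs(bus, 12):.0f} GB/s")
# 256-bit -> 384 GB/s, 384-bit -> 576 GB/s, 512-bit -> 768 GB/s
```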
Posted on Reply
#32
BlueFalcon
efikkanWhy would you need 512-bit? With GDDR5X, a 256-bit bus gives up to 384 GB/s and a 384-bit bus up to 576 GB/s, which means a 384-bit bus with GDDR5X should really be more than enough for both Pascal and Vega.
HBM provides 3 other major benefits. The first is reduced GPU die size: the memory controller of Fury X is smaller than 512-bit Hawaii's, and Hawaii's was in turn smaller than Tahiti's. This can either help reduce manufacturing costs, or allow AMD/NV to spend the extra transistor/die space on a more powerful flagship card. The second is reduced power usage: even if HBM only saves 15-30 W over faster GDDR5X, that's 15-30 W that can be used to improve perf/watt standing or to raise GPU clock speeds for more performance. The third relates to PCB length: the compact nature of HBM allows even smaller flagship cards, or alternatively gives AMD/NV greater headroom to release flagship Vega Radeon Duo and Titan Z successors if they wanted to.

There is too much hate regarding HBM because people simply compare Fury X to 980Ti and discredit all the major benefits of HBM. Fury X's front-end bottlenecks, lack of sufficient ROPs, weak geometry engines and lower clock speeds exacerbated by lack of decent overclocking headroom, pump whine, etc. all contributed to its lackluster performance and standing against after-market 980Ti cards. If we were to isolate HBM and compare it head-to-head against GDDR5X, it's a far superior alternative, if costs permit its use.

Considering how well NV performed this generation with 256-bit/384-bit cards without relying on HBM at all, and now that we know GP100 uses HBM2, I am going to agree with AMD and NV engineers that if costs are not a factor, HBM2-equipped big Pascal and Vega are the far superior options. Since NV already has GP100 with HBM2 and AMD has Vega with HBM2 on the roadmap, it's pointless to debate a hypothetical 384-bit big Pascal or Vega.
Posted on Reply
#33
rruff
the54thvoidFor now, AMD is still clawing its way back up and Intel and Nvidia are pissing about on other work.
Even if Polaris and Zen are good, AMD lacks the funds to keep refining and supporting their products like Nvidia and Intel. They've dug themselves a big hole and it's going to take a lot to get out of it.
Posted on Reply
#34
PP Mguire
BlueFalconHBM provides 3 other major benefits...
Nobody is debating what the big chips will have; some of us are saying that more memory bandwidth simply isn't necessary on the mid-tier cards right now. There is too much hate on GP104 shipping GDDR5 instead of GDDR5X or HBM, but the fact is the higher-bandwidth memory isn't necessary, and it would raise the cost of the midrange segment.
Posted on Reply
#35
rtwjunkie
PC Gaming Enthusiast
PP MguireNobody is debating what the big chips will have; some of us are saying that more memory bandwidth simply isn't necessary on the mid-tier cards right now. There is too much hate on GP104 shipping GDDR5 instead of GDDR5X or HBM, but the fact is the higher-bandwidth memory isn't necessary, and it would raise the cost of the midrange segment.
:respect: Preach it, Brother! :clap:
Posted on Reply
#36
efikkan
BlueFalconHBM provides 3 other major benefits...
Most of that is true, provided that you have a GPU which needs the bandwidth of a 512-bit memory bus or more.
BlueFalconThere is too much hate regarding HBM because people simply compare Fury X to 980Ti and discredit all the major benefits of HBM. Fury X's front-end bottlenecks, lack of sufficient ROPs, weak geometry engines and lower clock speeds exacerbated by lack of decent overclocking headroom, pump whine, etc. all contributed to its lackluster performance and standing against after-market 980Ti cards. If we were to isolate HBM and compare it head-to-head against GDDR5X, it's a far superior alternative, if costs permit its use.
No one is complaining about the benefits of HBM(1/2), but the point is that Fiji doesn't need HBM at all. AMD wasted a lot of resources on something they wouldn't need for a couple of generations. GTX 980 Ti (5632 GFLOP/s, 336 GB/s) is able to outperform Fury X (8602 GFLOP/s (+53%), 512 GB/s (+52%)), yet in theory Fury X "should" have been ~50% faster. There is no way it needs all that bandwidth when GTX 980 Ti can do without it.
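A quick sketch checking those spec-sheet figures; interestingly, both cards carry nearly identical compute-to-bandwidth ratios on paper, which underlines the point that Fiji's bandwidth advantage goes unused when its realized performance falls short of its theoretical throughput:

```python
# Spec-sheet numbers quoted above: (peak GFLOP/s, peak GB/s).
cards = {
    "GTX 980 Ti": (5632, 336),
    "Fury X":     (8602, 512),
}

gtx_flops, gtx_bw = cards["GTX 980 Ti"]
fury_flops, fury_bw = cards["Fury X"]

print(f"Compute advantage:   {fury_flops / gtx_flops - 1:+.0%}")  # +53%
print(f"Bandwidth advantage: {fury_bw / gtx_bw - 1:+.0%}")        # +52%

# Peak FLOPs available per byte of bandwidth -- essentially equal for both.
for name, (gflops, gbs) in cards.items():
    print(f"{name}: {gflops / gbs:.1f} FLOPs per byte")  # ~16.8 for each
```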

And now that we have GDDR5X, which is currently much cheaper than HBM, and GDDR5X on a 384-bit bus delivers 576 GB/s, it will still be a while before gaming GPUs really need HBM.

BTW, Fiji is not struggling with ROP performance; that kind of problem would scale with resolution or AA. It does, however, have enormous inefficiencies in its scheduling, in the 30-50% range compared to Maxwell.
BlueFalconConsidering how well NV performed this generation with 256-bit/384-bit cards without relying on HBM at all, and now that we know GP100 uses HBM2, I am going to agree with AMD and NV engineers that if costs are not a factor, HBM2-equipped big Pascal and Vega are the far superior options. Since NV already has GP100 with HBM2 and AMD has Vega with HBM2 on the roadmap, it's pointless to debate a hypothetical 384-bit big Pascal or Vega.
HBM will replace GDDR5(X) over time. GP100 uses HBM2 because it needs the bandwidth for compute. HBM2 will still be limited in supply throughout 2016. We'll see in a few months what GP102 has in store for us; it wouldn't surprise me if it uses GDDR5X, which would be fast enough until HBM becomes cheaper.
Posted on Reply
#37
Nihilus
rtwjunkieSo.....the 1080 (or whatever it shall be) with 256-bit bus has 50% less bandwidth than the 256-bit bus of the 980? :confused: Am I understanding your complaint right?

It fills the same slot the 980 does now (upper mid-level), so I'm not sure why you would compare it to 980Ti. Is it very likely to equal or come very close to the 980Ti in performance? Yes, which is a win all around for consumers, as it will be cheaper than the current 980Ti flagship.
Usually a next-generation card lands closer to the tier above it from the previous generation, e.g. the AMD 480 is looking to match the 390. Look at the GTX 970 - it was closer to a 780 Ti! We will see if the 1080 is priced as cheap as you can find a 980, or even a lightly used 980 Ti.
Posted on Reply
#38
Nihilus
Masoud1980Thanks for the answer
Google Translate Translate does not forgive good
I feel certain familiar forgive or Iranians or Persians you're right?
No problem. Struggled with your last sentence, but we know you are trying. Have fun on the Forums!
Posted on Reply
#39
bug
rtwjunkieSo.....the 1080 (or whatever it shall be) with 256-bit bus has 50% less bandwidth than the 256-bit bus of the 980? :confused: Am I understanding your complaint right?

It fills the same slot the 980 does now (upper mid-level), so I'm not sure why you would compare it to 980Ti. Is it very likely to equal or come very close to the 980Ti in performance? Yes, which is a win all around for consumers, as it will be cheaper than the current 980Ti flagship.
How about we ignore specs and wait for the reviews, which should be available within a month or so?
My hunch is that, given both AMD and Nvidia now have access to 14/16 nm processes (a great leap from 28 nm), their new architectures will play a much greater role than a simple comparison of shader counts and bandwidth would suggest.
Posted on Reply
#40
Ferrum Master
All those saying that 256-bit is enough are quite mistaken...

The Titan X actually cannot access all of its 12 GB in one cycle, which creates hurdles in particular usage scenarios like rendering and heavy data processing with a lot of calculated data. For high resolutions like 4-5K it will be crucial to have a really wide bus; the more power a card has, the more space and the lower latency it needs.

HBM was actually developed for server and compute needs; the gaming market comes second.
Posted on Reply
#41
efikkan
Ferrum MasterAll those saying that 256-bit is enough are quite mistaken...

The Titan X actually cannot access all of its 12 GB in one cycle
If you knew how rendering works, you'd know it will never need to access all of it in a single render. The largest part of the allocated memory is object and landscape data: meshes, textures, normal maps, displacement maps, UV maps and so on. All of this is pretty much static, and most of it is stored in multiple detail levels (typically 4-6 levels). This means that even if you were, for some strange reason, rendering every object in the game at the highest detail level, you would never need more than 50% of these resources. All modern games apply LoD and culling algorithms, so they usually use less than 15% of these resources in a single render. There is no game using over 25% of all allocated memory in a single render. Even with resource streaming, no game will work the way you describe; it would result either in resource "popping" (ref. Rage) or 0.4 FPS performance.
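That ~50% figure follows from the geometric series of a detail chain; a minimal sketch, assuming each level is half the size of the one above it (an illustrative ratio only; texture mips, for instance, shrink by 1/4 per level):

```python
# If each detail level is half the size of the previous one, the whole chain
# sums toward 2x the base level, so the highest-detail copies approach 50%
# of the total allocation as levels are added. The 1/2 ratio is an assumption
# for illustration.

def top_level_fraction(levels: int, ratio: float = 0.5) -> float:
    """Fraction of a LoD chain's total size taken by the top (base) level."""
    total = sum(ratio ** i for i in range(levels))
    return 1 / total

for levels in (4, 5, 6):
    print(f"{levels} levels: top level is {top_level_fraction(levels):.0%} of the chain")
# 4 levels -> 53%, 5 -> 52%, 6 -> 51%: rendering everything at max detail
# still touches only about half of the allocated static resources.
```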
Ferrum Masterwhich creates hurdles in particular usage scenarios like rendering and heavy data processing with a lot of calculated data.
The usage of "calculated data" (like Perlin noise mixed with low-resolution data) has nothing at all to do with the bandwidth between the GPU and its memory. In fact, there are two ways to solve it and render giant unique landscapes: having a giant GPU memory (say, 200 GB in size) or using resource streaming. The speed between the GPU and its memory is in no way the bottleneck here.
Ferrum MasterFor high resolutions like 4-5K it will be crucial to have a really wide bus; the more power a card has, the more space and the lower latency it needs.
Why? Do you actually know how much memory the temporary frame for a 4K render needs? 64 MB without AA, 256 MB with 4x MSAA (before compression). You obviously don't need hundreds of GB/s to store this data.
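Those figures are straightforward to verify; a quick sketch, assuming 4 bytes of color plus 4 bytes of depth/stencil per pixel (an assumed but typical layout; exact formats vary):

```python
# 4K render-target size, assuming RGBA8 color (4 bytes) plus a 4-byte
# depth-stencil buffer per pixel, before any framebuffer compression.
WIDTH, HEIGHT = 3840, 2160
BYTES_PER_PIXEL = 4 + 4

def frame_mib(msaa_samples: int = 1) -> float:
    """Size of the temporary frame data in MiB for a given MSAA factor."""
    return WIDTH * HEIGHT * BYTES_PER_PIXEL * msaa_samples / 2**20

print(f"No AA:   {frame_mib(1):.0f} MiB")   # ~63 MiB, i.e. the ~64 MB above
print(f"4x MSAA: {frame_mib(4):.0f} MiB")   # ~253 MiB, i.e. the ~256 MB above
```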
Posted on Reply
#42
Ferrum Master
efikkanIf you knew how rendering works
Obviously you didn't understand the issue. Time budget and latency are the problem. Try enabling ray tracing in a CGI scene and let the numbers roll.
Posted on Reply
#43
efikkan
Ferrum MasterObviously you didn't understand the issue. Time budget and latency are the problem. Try enabling ray tracing in a CGI scene and let the numbers roll.
Time budget and latency have nothing to do with it. Everyone knows computation is the bottleneck of ray tracing.
Posted on Reply
#44
the54thvoid
Super Intoxicated Moderator
I'm backing @efikkan on a description that, in my layman's terms, I don't understand. They seem to know exactly what they are talking about.
Posted on Reply
#45
deu
rruffEven if Polaris and Zen are good, AMD lacks the funds to keep refining and supporting their products like Nvidia and Intel. They've dug themselves a big hole and it's going to take a lot to get out of it.
Someone hasn't checked the internet since Friday!

wccftech.com/amd-stock-52-highest-percentage-gain-listing/
Posted on Reply
#46
PP Mguire
Ferrum MasterAll those saying that 256-bit is enough are quite mistaken...

The Titan X actually cannot access all of its 12 GB in one cycle, which creates hurdles in particular usage scenarios like rendering and heavy data processing with a lot of calculated data. For high resolutions like 4-5K it will be crucial to have a really wide bus; the more power a card has, the more space and the lower latency it needs.

HBM was actually developed for server and compute needs; the gaming market comes second.
That's taking a stance as if the debate were about compute; these are gaming cards (especially GP104, though we might see Quadro variants with HBM). We don't need memory bandwidth like that for gaming. As a Titan X owner at 4K, increasing the VRAM clock speed doesn't increase FPS at all, negating the need for a wider bus or higher memory bandwidth. We lack the raw processing power. With GP100, HBM2 will just be icing on the proverbial cake, but for GP104 it's an unnecessary increase in cost that reaps no benefits.
Posted on Reply
#47
Ferrum Master
PP MguireThat's taking a stance as if the debate were about compute; these are gaming cards (especially GP104, though we might see Quadro variants with HBM). We don't need memory bandwidth like that for gaming. As a Titan X owner at 4K, increasing the VRAM clock speed doesn't increase FPS at all, negating the need for a wider bus or higher memory bandwidth. We lack the raw processing power. With GP100, HBM2 will just be icing on the proverbial cake, but for GP104 it's an unnecessary increase in cost that reaps no benefits.
That's what I am saying, albeit we are on the edge... 384 bits is the bare minimum. I do see FPS gains from upping my VRAM. Actually, even simple rendering benchmarks like GPU-Z's render test react really well to the memory speed increase. And that is actually bad... Okay, our beloved Valley... my card does react to a vRAM OC... see for yourself.

Posted on Reply
#48
the54thvoid
Super Intoxicated Moderator
deuSomeone hasn't checked the internet since Friday!

wccftech.com/amd-stock-52-highest-percentage-gain-listing/
Jesus - reading all the comments following was like stepping in afterbirth. Some quite rabid fanboys on that site. Even our most avid AMD chaps here aren't that bad.

As for the stock jump - no biggie. AMD announces X, Y and Z, and investors buy, because people buy cheap stocks to gamble they will make money. It's speculative investment - most will sell just before the launch of Polaris and Zen. It also helps to announce a prospective mega-bucks deal with China - makes it all look better.

Not saying these things are not good or happening, but as a bit of an anti-capitalist (specifically regarding the proliferation of huge private wealth, gained at the expense of others, with zero social distribution) I fucking hate the stock markets - they're bogus.
Posted on Reply
#49
PP Mguire
Ferrum MasterThat's what I am saying, albeit we are on the edge... 384 bits is the bare minimum. I do see FPS gains from upping my VRAM. Actually, even simple rendering benchmarks like GPU-Z's render test react really well to the memory speed increase. And that is actually bad... Okay, our beloved Valley... my card does react to a vRAM OC... see for yourself.

That's a synthetic benchmark, which we all know doesn't really equate to real-world performance. In VRAM-heavy games, upping the VRAM on my old 980s didn't help much at 1440p, and neither does upping it on my Titans at 4K. Even in the synthetic bench you're showing a measly 4 FPS gain in the average, and in real games at 4K I see less than that with a heavy VRAM overclock. Yet upping my core boost from 1300 to 1500 shows a good 10+ FPS in games. Like I've said, I think three times now: memory bandwidth isn't the issue. We need raw chip power.
Posted on Reply
#50
rruff
deuSomeone hasn't checked the internet since Friday!
Yes, I'm well aware that AMD's stock went up. Check back in 5 years.
Posted on Reply