Wednesday, November 18th 2015
NVIDIA Details "Pascal" Some More at GTC Japan
NVIDIA revealed more details of its upcoming "Pascal" GPU architecture at the Japanese edition of the GPU Technology Conference. The architecture is designed to nearly double performance per watt over the current "Maxwell" architecture by adopting several new technologies. The first is stacked HBM2 (high-bandwidth memory 2): the top "Pascal" based product will feature four 4-gigabyte HBM2 stacks, totaling 16 GB of memory. The combined external memory bandwidth for the chip will be 1 TB/s, while internal bandwidths can reach as high as 2 TB/s. The chip itself will support up to 32 GB of memory, so enterprise variants (Quadro, Tesla) could max out the capacity; the consumer GeForce variant is expected to serve up 16 GB.
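As a quick sanity check on the figures quoted above, the capacity and bandwidth numbers line up with four stacks. A minimal sketch (the 256 GB/s per-stack bandwidth is my assumption based on the HBM2 specification, not a figure from the article):

```python
# Back-of-envelope check of the quoted HBM2 figures.
# Assumption: each HBM2 stack delivers 256 GB/s
# (1024-bit bus at 2 Gb/s per pin, per the HBM2 spec).
STACKS = 4
GB_PER_STACK = 4          # capacity per stack, from the article
GBPS_PER_STACK = 256      # bandwidth per stack, assumed

capacity_gb = STACKS * GB_PER_STACK       # total memory capacity
bandwidth_gbps = STACKS * GBPS_PER_STACK  # aggregate bandwidth

print(f"{capacity_gb} GB, {bandwidth_gbps} GB/s")  # 16 GB, ~1 TB/s
```

Four stacks at 256 GB/s each gives 1024 GB/s, which matches the 1 TB/s headline figure.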
It's also becoming clear that NVIDIA will build its "Pascal" chips on the 16-nanometer FinFET process (AMD will build its next-gen chips on a more advanced 14 nm process). NVIDIA is also introducing a new interconnect called NVLink, which will change the way the company builds dual-GPU graphics cards. Currently, dual-GPU cards are essentially two graphics cards on a common PCB, with PCIe bandwidth from the slot shared by a bridge chip and an internal SLI bridge connecting the two GPUs. With NVLink, the two GPUs will be connected by an 80 GB/s bi-directional data path, letting each GPU directly address memory controlled by the other. This should greatly improve memory management in games that take advantage of newer APIs such as DirectX 12 and Vulkan, and prime the graphics card for higher display resolutions. NVIDIA is expected to launch its first "Pascal" based products in the first half of 2016.
Source:
VR World
67 Comments on NVIDIA Details "Pascal" Some More at GTC Japan
In reality, I feel we will have rough parity in performance, which will be a plus for the beleaguered AMD.
If AMD is having samjunk build their chips, then I'm DEFINITELY going with Nvidia again.
I'm looking at a total of three markets here. There's an emerging market, a market that Intel has a death grip on, and a market where there's some competition. Intel isn't stupid, so they'll focus development on the emerging market, and the technology built there will filter down into other markets. As the emerging market is HPC, that technology will be driving the bus over the next few years. As adoption costs money, we'll see the interconnect go from HPC to servers to consumer goods incrementally.
As such, let's figure this out. PCI-e 4.0 may well be featured heavily in both the consumer (Intel has mild competition) and server (Intel has a death grip) markets. These particular products are continually improved, but they're iterative improvements. It isn't a stretch to think that they'll have PCI-e 4.0 in the next generation, given that it's a minor improvement. While the consumer and server markets continue to improve, the vast majority of research and development is done on the HPC market. A market where money is less of an object, and where a unique new connection type isn't a liability, if better connection speeds can be delivered.
Intel develops a new interconnect for the HPC crowd, that offers substantially improved transfer rates. They allow AMD to license the interconnect so that they can demonstrate that the standard isn't anti-competitive. AMD has the standard, but they don't have the resources to compete in the HPC world. They're stuck just trying to right the ship with consumer hardware and server chips (Zen being their first chance next year). Intel has effectively produced a new interconnect standard in the market where their dominance is most challenged, demonstrated that they aren't utilizing anti-competitive practices, but have never actually opened themselves up for competition. AMD is currently a lame duck because the HPC market is just out of its reach.
By the time the new technologies filter down to consumer and server level hardware PCI-e 4.0 will have been around for a couple of years. Intel will have already utilized PCI-e as they pushed for, while already being out from the FTC's restrictions on including PCI-e. They'll be able to offer token PCI-e support, and actually focus on their own interconnect. It'll have taken at least a few years to filter to consumers, but the money Intel invested into research isn't going to be forgotten.
You seem to be looking at the next two years. I'll admit that the next couple of generations aren't likely to jettison PCI-e, and Intel will in fact embrace 4.0. What I'm worried about is 4-6 years down the line, once Intel has become heavily invested in the HPC market and needs to compete with Nvidia to capture more of it. They aren't stupid, so they'll do whatever it takes to destroy the competition, especially when they're the only game in town capable of offering a decent CPU. Once they've got a death grip on that market, the technology will just flow downhill from there. This isn't the paranoid delusion that this little development will change everything tomorrow, but that it is setting the ship upon a course that will hit a rock in the next few years. It would be screwed up to say this will influence things soon, but it isn't unreasonable to say that Intel has a history of doing whatever it takes to secure market dominance. FTC and fair trade practices be damned.
Intel could turn the whole system into an SoC or MCM (processor + graphics/co-processor + shared eDRAM + interconnect) and probably will, because sure as hell IBM/Mellanox/Nvidia will be looking at the same scenario. If you're talking about PCIe being removed from consumer motherboards, then yes, eventually that will be the case. Whether Nvidia (or any other add-in card vendor) survives will rely on strength of product. Most chip makers are moving towards embedded solutions, and Nvidia also has a mezzanine module solution with Pascal, so that evolution is already in progress. All I can say is good luck with that. ARM has an inherent advantage that Intel cannot match so far: x86 simply does not scale down far enough to match ARM in high-volume consumer electronics, and Intel is too monolithic a company to react and counter an agile and pervasive licensed ecosystem. They are in exactly the same position IBM was in when licensing meant x86 became competitive enough to undermine their domination of the nascent PC market. Talking of IBM and your "4-6 years down the line", POWER9 (2017) won't even begin deployment until 3 years hence, with POWER10 slated for 2020-21 entry. Given that in non-GPU-accelerated systems Intel's Xeon still lags behind IBM's BGQ and SPARC64 in computational effectiveness, Intel has some major competition.
On a purely co-processor point, Tesla continues to be easier to deploy and has greater performance than Xeon Phi, which Intel counters by basically giving away Xeon Phi to capture market share (Intel reportedly gifted Xeon Phis for China's Tianhe-2), although its performance per watt and workload challenges mean that vendors still look to Tesla (as the latest Green500 list attests). Note that the top system is using a PEZY-SC GPGPU that does not contain a graphics pipeline (as I suspect future Teslas will evolve).
Your argument revolves around Intel being able to change the environment by force of will. That will not happen unless Intel choose to walk a path separate from their competitors and ignore the requirements of vendors. Intel do not sell HPC systems; Intel provide hardware in the form of interconnects and form-factored components. A vendor that actually constructs, deploys, and maintains the system, such as Bull (Atos) for the sake of an example, still has to sell the right product for the job, which is why they sell Xeon-powered S6000s to some customers, and IBM-powered Escalas to others. How does Intel force both vendors and customers to turn away from competitors of equal (or far greater, in some cases) financial muscle when their products are demonstrably inferior for certain workloads? Short memory. Remember the last time Intel tried to bend the industry to its will? How did Itanium work out?
Intel's dominance has been achieved through three avenues.
1. Forge a standard and allow that standard to become open (SSE, AVX, PCI, PCI-E etc) but ensure that their products are first to utilize the feature and become synonymous with its usage.
2. Use their base of IP and litigation to wage economic war on their competitors.
3. Limit competitors' market opportunities by outspending them (rebates, bribery).
None of those three apply to their competitors in enterprise computing.
1. You're talking about a proprietary standard (unless Intel hand it over to a special interest group). Intel's record is spotty to say the least. How many proprietary standards have forced the hand of an entire industry? Is Thunderbolt a roaring success?
2. Too many alliances, too many big fish. Qualcomm isn't Cyrix, ARM isn't Seeq, IBM isn't AMD or Chips & Technologies. Intel's record of trying to enforce its will against large competitors? You remember Intel's complete back-down to Microsoft over incorporating NSP in its processors? Intel's record against industry heavyweights isn't that which pervades the small pond of "x86 makers who aren't Intel".
3. Intel's $4.2 billion in losses in 2014 (add to that the forecast of $3.4 billion in losses this year) through literally trying to buy x86 mobile market share indicate that their effectiveness outside of their core businesses founded 40 years ago isn't that stellar. Like any business faced with overwhelming competition willing to cut profit to the bone (or even sustain losses for the sake of revenue), they bend to the greater force. Intel are just hoping that they are better equipped than the last time they got swamped (Japanese DRAM manufacturing forcing Intel from the market).
You talk as if Intel is some all-consuming juggernaut. The reality is that Intel's position isn't as rock solid as you may think. It does rule the x86 market, but their slice of the consumer and enterprise revenue pie is far from assured. Intel can swagger all it likes in the PC market, but their acquisition of Altera and pursuit of Cray's interconnect business are indicators that they know they have a fight on their hands. I'm not prone to voicing absolutes unless they are already proven, but I would be near certain that Intel would not introduce a proprietary standard - licensed or not, if it decreased marketing opportunity - and Intel's co-processor market doesn't even begin to offset the marketing advantages of the third-party add-in board market.
***********************************************************************************************
You also might want to see the AMD license theory from a different perspective:
Say Intel develop a proprietary non-PCI-E standard and decide to license it to AMD to legitimize it as a default standard. What incentive is there for AMD to use it? Intel use the proprietary standard and cut out the entire add-in board market (including AMD's own graphics). If AMD have a credible x86 platform, why wouldn't they retain PCI-E and have the entire add-in board market to themselves (including both major players in graphics and their HSA partners' products), rather than fight Intel head-to-head in the marketplace with a new interface?
Which option do you think would benefit AMD more? Which option would boost AMD's market share to the greater degree?
First off, my memory is long enough. Their very first standard in modern computing was the x86 architecture, the foundation which their entire business is built upon today, correct? Yes, AMD pioneered x86-64, but Intel has been riding against RISC and its ilk for how many decades? Itanium, RDRAM, and their failures in the business field are functionally footnotes in a much larger campaign. They've managed to functionally annihilate AMD, despite AMD having had market dominance for a period of time. They've managed several fiascos (Itanium, NetBurst, the FTC, etc.), yet came away less crippled than Microsoft. I view them as very good at what they do, hulking to the point where any competition is unacceptable, and capable of undoing any errors by throwing enough cash and resources at them to completely remove their issues.
To your final proposition, please reread my original statement. I propose that developing a proprietary standard allows a lame-duck competitor a leg up, prevents competition in an emerging market, and still meets FTC requirements. AMD benefits from making cards to the new interconnect standard because they can suddenly offer their products to an entirely new market. Intel isn't helping their CPU business here; they're allowing AMD an avenue by which to make their HPC-capable GPUs immediately compatible with Intel's offerings. Intel effectively has AMD battle Nvidia for the HPC market, and while those two grind each other down, Intel is able to mature its FPGA projects up to the point where HPC can be done on SoC options. They have their own interconnect, a company that's willing to fight their battle for them, and time. AMD is willing to get in on the fight because it's money. Simply redesigning for the interconnect will allow them to reach a new market, bolstered by Intel's tacit support.
What I'm less than happy about is when all of this leaves the HPC market and filters down to consumer hardware. ARM isn't a factor in the consumer space. It isn't a factor because none of our software is designed to take advantage of a monstrous number of cores. I don't see that changing in the next decade, because it would effectively mean billions, if not trillions, spent to completely rewrite code. As such, consumers will get some of the technologies of HPC, but only those which can be translated to x86-64. NVLink and the like won't translate anywhere except GPUs. A new interconnect, on the other hand, would translate fine. If Intel developed it in parallel to PCI-e 4.0 they would have a practical parachute should they run into issues. Can you not see how this both embraces PCI-e, while preparing to eject it once their alternatives come to fruition?
After saying all this, I can assume part of your response. The HPC market is emerging, and Intel's architecture is holding it back. I get it, HPC is a bottomless pit for money where insane investments pay off. My problem is that Nvidia doesn't have the money Intel does. Intel is a lumbering giant, that never competes in a market, they seek dominance and control. I don't understand what makes the HPC market any different. This is why I think they'll pull something insane, and try to edge Nvidia out of the market. They've got a track record of developing new standards, and throwing money at something until it works. A new interconnect standard fits that bill exactly. While I see why a human would have misgivings about going down the same path, Intel isn't human.
If you'd like a more recent history lesson on Intel introducing their own standards, let's review QPI. If that doesn't float your boat, Intel is a part of the OIC which is standardizing interconnection for IoT devices. I'd also like to point out that Light Peak became Thunderbolt, and they moved Light Peak to MXC (which to my knowledge is in use for high cost systems: www.rosenberger.com/mxc/). Yes, Thunderbolt and Itanium were failures, but I'll only admit error if you can show me a company that's existed as long as Intel, yet never had a project failure.
None of what was exists now. Makes little difference. Intel (like most tech companies that become established) innovated in their early stages (even if their IP largely accrued from Fairchild Semi and cross-licences with Texas Instruments, National Semi, and IBM). Mature companies rely more on purchasing IP, which is exactly the model Intel have followed. AMD are, and always have been, a bit-part player. They literally owe their existence to Intel (if it were not for Robert Noyce investing in AMD in 1969 they wouldn't have got anywhere close to their $1.55m incorporation target), and have been under Intel's boot since they signed their first contract to license Intel's 1702A EPROM in 1970. AMD have been indebted to Intel's IP their entire existence, excluding their first few months manufacturing licenced copies of Fairchild's TTL chips.

Except that Itanium was never accepted by anyone except HP, who were bound by contract to accept it.
Except StrongARM and XScale (ARMv4/v5) never became any sort of success.
Except Intel's microcontrollers have been consistently dominated by Motorola and ARM.
Basically, Intel has been fine so long as it stayed with x86. Deviation from the core product has met with failure. The fact that Intel is precisely nowhere in the mobile market should be a strong indicator that that trend is continuing. Intel will continue to purchase IP to gain relevancy and will in all probability continue to lose money outside of its core businesses.

...a market where Intel would still be the dominant player, a market where Intel has 98.3-98.5% market share... and unless AMD plans on isolating itself from its HSA partners, licenses also need to be granted to them. And why the hell would they do that? How can AMD compete with Intel giving away Xeon Phi co-processors? Nvidia survives because of the CUDA ecosystem. With AMD offering CUDA-porting tools and FirePro offering superior FP64 to Tesla (with both being easier to code for and offering better performance per watt than Phi), all Intel would be doing is substituting one competitor for another, with the first competitor remaining viable anyway thanks to IBM and ARM.

That's a gross oversimplification. Intel and Nvidia compete in ONE aspect of HPC: GPU-accelerated clusters. Nvidia has no competition from Intel in some other areas (notably cloud services, where Nvidia have locked up both Microsoft and Amazon, the latter already having its 3rd-generation Maxwell cards installed), while Intel's main revenue earner, data centers, doesn't use GPUs, and 396 of the top 500 supercomputers don't use GPUs either.

Not really. It's more R&D and more time and effort spent qualifying hardware for a market that Intel will dominate from well before any contract is signed. Tell me this: when has Intel EVER allowed licensed use of their IP before Intel itself had assumed a dominant position with the same IP? (The answer is never.)
Your scenario postulates that AMD will move from one architecture where they are dominated by Intel to another architecture where they are dominated by Intel AND have to factor in Intel owning the specification. You have just made an argument that Intel will do anything to win, and yet you expect AMD to bite on Intel-owned IP where revisions to the specification could be changed unilaterally by Intel. You do remember that AMD signed a long-term deal for Intel processors with the 8085 and Intel promptly stiffed AMD on the 8086, forcing AMD to sign up with Zilog? Or Intel granting AMD an x86 license, then stiffing them on the 486? Intel owns the PC market. It doesn't own the enterprise sector.
In the PC space, Nvidia is dependent upon Wintel. In the enterprise sector it can pick and choose an architecture. Nvidia hardware sits equally well with IBM, ARM, or x86, and unlike consumer computing, IBM in particular is a strong competitor and owns a solid market share. You don't understand what makes the HPC market any different? Well, the short answer is that IBM isn't AMD, and enterprise customers are somewhat more discerning in their purchases than the average consumer... and as you've already pointed out, ARM isn't an issue for Intel (or RISC in general, for that matter) in PCs. The same obviously isn't true in enterprise.

Proprietary tech. Used by Intel (basically a copy of DEC's EV6 and later AMD's HyperTransport). Not used by AMD. Its introduction led to Intel giving Nvidia $1.5 billion. It affected nothing other than removing Nvidia MCP chipsets thanks to the FSB stipulation in licensing, of little actual consequence since Nvidia was devoting fewer resources to them after the 680i. That about covers it, I think.
You seem to be forecasting a doomsday scenario that is 1. probably a decade away, and 2. being prepared for now.
By the time PCI-E phases out, the industry will have moved on to embedded solutions and it will be a moot point.
Anyhow, I think I'm done here. My background is in big iron (my first job after leaving school was coding for Honeywell and Burroughs mainframes), and I keep current even though I left the industry back in '92 (excepting the occasional article), so I'm reasonably confident in my view. I guess we'll find out in due course how wrong, or right, we were.
To be honest, I see the theory of Intel tossing PCI-E 4.0 as somewhat inflammatory, alarmist, and so remote a possibility as to be impossible, based on what needs to happen for that scenario to play out.
That technology can be harnessed for anyone's hardware as long as they write the software/drivers for it. That is NOT a "proprietary use of the protocol"; you're simply required to use AMD's FreeSync software/drivers with their hardware. I see no problem.
Heck, leaving Fury alone, even Tonga isn't far:
And 380x is significantly faster than 960.
If you are comparing performance per watt, I'd suggest viewing this and then factoring in the wattage difference in the power consumption charts. I don't think there are any published figures for HBM vs. GDDR5 memory-controller power draw, but AMD published some broad numbers just for the memory chips.
Using AMD's published efficiency figures of roughly 35 GB/s per watt for HBM and 10.66 GB/s per watt for GDDR5:
The Fury X has 512 GB/s of bandwidth, so 512 / 35 ≈ 14.6 W
The GTX 980 Ti has 336 GB/s of bandwidth, so 336 / 10.66 ≈ 31.5 W (16.9 W more, which should be subtracted from the 980 Ti's power consumption for a direct comparison)
I'd note that the 16.9 W gap is probably closer to 30-40 W overall once differences in memory-controller power and real-world usage are included, but accurate figures seem to be hard to come by. Hope this adds some clarification for you.
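The arithmetic above can be sketched as a tiny helper; the 35 and 10.66 GB/s-per-watt efficiency figures come from the post (AMD's broad numbers), while the function name is my own:

```python
# Estimate memory power draw from total bandwidth and an
# efficiency figure (bandwidth delivered per watt).
def memory_power_w(bandwidth_gbps: float, gbps_per_watt: float) -> float:
    """Estimated DRAM power (W) = total bandwidth / bandwidth-per-watt."""
    return bandwidth_gbps / gbps_per_watt

# Figures from the post: HBM ~35 GB/s per W, GDDR5 ~10.66 GB/s per W.
fury_x = memory_power_w(512, 35)        # Fury X (HBM), ~14.6 W
gtx_980ti = memory_power_w(336, 10.66)  # GTX 980 Ti (GDDR5), ~31.5 W

print(f"Fury X: {fury_x:.1f} W, 980 Ti: {gtx_980ti:.1f} W, "
      f"delta: {gtx_980ti - fury_x:.1f} W")
```

This reproduces the ~16.9 W difference quoted above, before accounting for memory-controller and real-world usage differences.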
Power consumption: 3.36 W
link
An idling Fury X consumes 15 W more than an idling Fury Nano.
HardOCP
But the graphics card is the most important part.