Monday, July 25th 2016
NVIDIA Launches Maxed-out GP102 Based Quadro P6000
Late last week, NVIDIA announced the TITAN X Pascal, its fastest consumer graphics offering targeted at gamers and PC enthusiasts. The TITAN X Pascal's reign as the fastest single-GPU graphics card could be short-lived, however, as NVIDIA has announced a Quadro product based on the same "GP102" silicon that maxes out its on-die resources. The new Quadro P6000, announced at SIGGRAPH alongside the GP104-based Quadro P5000, features all 3,840 CUDA cores physically present on the chip.
Besides its 3,840 CUDA cores, the P6000 offers FP32 (single-precision floating point) performance of up to 12 TFLOP/s. The card also features 24 GB of GDDR5X memory across the chip's 384-bit wide memory interface. The Quadro P5000, on the other hand, features 2,560 CUDA cores, up to 8.9 TFLOP/s of FP32 performance, and 16 GB of GDDR5X memory across a 256-bit wide memory interface. It's interesting to note that neither card features full FP64 (double-precision) machinery; that is cleverly relegated to NVIDIA's HPC product line, the Tesla P-series.
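The 12 TFLOP/s figure follows from the usual peak-throughput rule of two FP32 operations (one fused multiply-add) per CUDA core per clock. A quick back-of-the-envelope check in Python, assuming a boost clock of roughly 1.56 GHz (the exact clock here is an assumption, not an NVIDIA-quoted figure):

# Peak FP32 throughput: 2 FLOPs (one FMA) per CUDA core per clock
cuda_cores = 3840
boost_clock_ghz = 1.56  # assumed boost clock for this estimate

peak_tflops = 2 * cuda_cores * boost_clock_ghz / 1000
print(f"~{peak_tflops:.1f} TFLOP/s FP32")  # ~12.0 TFLOP/s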
22 Comments on NVIDIA Launches Maxed-out GP102 Based Quadro P6000
Does anyone here have any experience with the recent Quadro cards from NVIDIA? Do they run with (modded) GeForce drivers, or is this not possible anymore? If so, is overclocking without hard mods an option?
Sure, you can play games on the TITAN X, but it is not marketed as a GeForce GTX gaming card anymore, and there has to be a reason nVidia chose to go this way. Not very many gamers have a TITAN-series card, and a lot of gamers have been pissed that nVidia set the TITAN series at such an inaccessible price that 99.9% of all gamers cannot afford it or just won't buy it.
This could mean that the 1080 Ti could have all 3,840 CUDA cores enabled for pure gaming, with INT8 nerfed so it will not compete with the TITAN X as a deep learning calculation card.
The sweet spot for an absolute top-performing gaming card to sell in volume seems to be a maximum of around $800-$900.
If nVidia were to give the 1080 Ti the full 3,840 CUDA cores in the $800-$900 price range, it would make gamers delirious with excitement and make it the best-selling Ti gaming card of all time, earning them a lot more money in volume sales than the TITAN X ever would or could as a gaming card; gaming is all about volume these days to make the big bucks.
This way gamers would feel that they got the absolute top gaming card, and it would be a lot easier for more people to fork out around $800-$900 knowing this. It is much more fun to buy the best gaming card than the second best; it's all psychology to get us to buy more.
Sweclockers seems to think this could actually happen:
www.sweclockers.com/nyhet/22431
But this time they justify the price tag with deep learning instead of DP performance, as with the first Titan.
They might go a similar route as with Kepler: Titan X > 1080 Ti with just higher clocks > Titan X Black fully enabled > 1080 Ti something fully enabled with higher clocks
www.rsc.org/chemistryworld/2016/07/quantum-computer-simulates-hydrogen-molecule-complex-calculations
Hmm, I don't remember the route being that with Kepler. It was the Titan as a castrated GK110, the GTX 780 as an even more castrated GK110/GK110B, the GTX 780 Ti as a full GK110B, the Titan Black as a full GK110B, and then the Titan Z as 2x castrated GK110B. If I had to guess, nVidia will release a GTX 780-like castrated GP102 card only if Vega 10 is not powerful enough.
Next to nothing; from a performance aspect, HBM just takes one bottleneck out of the equation, but GDDR5X does much the same. Of course, if you use some high-end CF/SLI rig and multi-monitor Surround/Eyefinity with 4K displays, then you might need that bandwidth. But other than that, single-GPU configs are still more shader-power limited than memory-bandwidth limited.
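For a rough sense of that bandwidth argument: peak memory bandwidth is just the effective per-pin data rate times the bus width. A small sketch, assuming typical figures of 10 Gbps for GDDR5X on GP102's 384-bit bus and 1 Gbps for first-generation HBM on a 4096-bit bus (both data rates are illustrative assumptions):

def peak_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    # bandwidth = per-pin data rate * bus width, converted from bits to bytes
    return data_rate_gbps * bus_width_bits / 8

print(peak_bandwidth_gb_s(10, 384))   # GDDR5X, 384-bit: 480.0 GB/s
print(peak_bandwidth_gb_s(1, 4096))   # HBM1, 4096-bit:  512.0 GB/s

On those assumed rates, a 384-bit GDDR5X setup lands in the same ballpark as first-generation HBM, which is the commenter's point about a single GPU rarely needing the wider interface.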
... and everything is a tad warmer than absolute zero
I'd like to see them implement a quicksort algorithm on their quantum computer rather than calculate the wave function of the molecule :laugh:
Quantum computing has already proved to be faster than classical at cracking huge datasets/tables/sorting, but it's a waste of machine time when there is more important work that requires computational power beyond the limits of classical computing.
www.gizmag.com/d-wave-quantum-computer-supercomputer-ranking/27476/
A simple sort-and-test comparison shows: "How did the D-Wave computer do on the tests? On the largest problem sizes tested, the V5 chip found optimal solutions in less than half a second, while the best classical software solver required 30 minutes to find those same solutions. This makes the D-Wave computer over 3,600 times faster than the classical computer in these tests."
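That 3,600x figure is just the ratio of the two runtimes quoted in the article: 30 minutes is 1,800 seconds, and 1,800 s divided by 0.5 s gives 3,600.

classical_runtime_s = 30 * 60   # 30 minutes, in seconds
dwave_runtime_s = 0.5           # reported D-Wave solve time
print(classical_runtime_s / dwave_runtime_s)   # 3600.0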
I'm sure I read somewhere that D-Wave is much more a proof of concept, and that's probably why Google bought into it. By all means though, it's a start into a very weird school of 'computing'.
www.cnet.com/uk/news/d-wave-quantum-computer-sluggishness-finally-confirmed/
gfxbench.com/device.jsp?benchmark=gfx40&os=Windows&api=gl&D=NVIDIA+null+Quadro+P6000&testgroup=overall
It leads in Tessellation offscreen; otherwise, mixed results.
Either GFXBench or NVIDIA's R372 driver branch needs to be optimised.