Tuesday, August 14th 2018

NVIDIA Announces Turing-based Quadro RTX 8000, Quadro RTX 6000 and Quadro RTX 5000

NVIDIA today reinvented computer graphics with the launch of the NVIDIA Turing GPU architecture. The greatest leap since the invention of the CUDA GPU in 2006, Turing features new RT Cores to accelerate ray tracing and new Tensor Cores for AI inferencing which, together for the first time, make real-time ray tracing possible.

These two engines - along with more powerful compute for simulation and enhanced rasterization - usher in a new generation of hybrid rendering to address the $250 billion visual effects industry. Hybrid rendering enables cinematic-quality interactive experiences, amazing new effects powered by neural networks and fluid interactivity on highly complex models.
The company also unveiled its initial Turing-based products - the NVIDIA Quadro RTX 8000, Quadro RTX 6000 and Quadro RTX 5000 GPUs - which will revolutionize the work of some 50 million designers and artists across multiple industries.

"Turing is NVIDIA's most important innovation in computer graphics in more than a decade," said Jensen Huang, founder and CEO of NVIDIA, speaking at the start of the annual SIGGRAPH conference. "Hybrid rendering will change the industry, opening up amazing possibilities that enhance our lives with more beautiful designs, richer entertainment and more interactive experiences. The arrival of real-time ray tracing is the Holy Grail of our industry."

NVIDIA's eighth-generation GPU architecture, Turing enables the world's first ray-tracing GPU and is the result of more than 10,000 engineering-years of effort. By using Turing's hybrid rendering capabilities, applications can simulate the physical world at 6x the speed of the previous Pascal generation.

To help developers take full advantage of these capabilities, NVIDIA has enhanced its RTX development platform with new AI, ray-tracing and simulation SDKs. It also announced that key graphics applications addressing millions of designers, artists and scientists are planning to take advantage of Turing features through the RTX development platform.

"This is a significant moment in the history of computer graphics," said Jon Peddie, CEO of analyst firm JPR. "NVIDIA is delivering real-time ray tracing five years before we had thought possible."

Real-Time Ray Tracing Accelerated by RT Cores
The Turing architecture is armed with dedicated ray-tracing processors called RT Cores, which accelerate the computation of how light and sound travel in 3D environments at up to 10 GigaRays a second. Turing accelerates real-time ray tracing operations by up to 25x that of the previous Pascal generation, and GPU nodes can be used for final-frame rendering for film effects at more than 30x the speed of CPU nodes.

"Cinesite is proud to partner with Autodesk and NVIDIA to bring Arnold to the GPU, but we never expected to see results this dramatic," said Michele Sciolette, CTO of Cinesite. "This means we can iterate faster, more frequently and with higher quality settings. This will completely change how our artists work."

AI Accelerated by Powerful Tensor Cores
The Turing architecture also features Tensor Cores, processors that accelerate deep learning training and inferencing, providing up to 500 trillion tensor operations a second.

This level of performance powers AI-enhanced features for creating applications with powerful new capabilities. These include DLAA - deep learning anti-aliasing, which is a breakthrough in high-quality motion image generation - denoising, resolution scaling and video re-timing.

These features are part of the NVIDIA NGX software development kit, a new deep learning-powered technology stack that enables developers to easily integrate accelerated, enhanced graphics, photo imaging and video processing into applications with pre-trained networks.

Faster Simulation and Rasterization with New Turing Streaming Multiprocessor
Turing-based GPUs feature a new streaming multiprocessor (SM) architecture that adds an integer execution unit executing in parallel with the floating point datapath, and a new unified cache architecture with double the bandwidth of the previous generation.

Combined with new graphics technologies such as variable rate shading, the Turing SM achieves unprecedented levels of performance per core. With up to 4,608 CUDA cores, Turing supports up to 16 trillion floating point operations in parallel with 16 trillion integer operations per second.
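
As a rough sanity check on the figures above, the clock speed implied by 4,608 CUDA cores and 16 trillion floating point operations per second can be worked out directly. This is a minimal sketch: the assumption that each core performs one fused multiply-add (counted as 2 operations) per clock is a common convention and is not stated in the announcement, and the function name is purely illustrative.

```python
# Figures quoted in the announcement.
CUDA_CORES = 4608
PEAK_FLOPS = 16e12  # "up to 16 trillion floating point operations" per second

# Assumption (not from the announcement): one fused multiply-add per core
# per clock, conventionally counted as 2 floating point operations.
FLOPS_PER_CORE_PER_CLOCK = 2


def implied_clock_ghz(cores: int, peak_flops: float, flops_per_clock: int) -> float:
    """Clock speed in GHz needed to reach peak_flops with the given core count."""
    return peak_flops / (cores * flops_per_clock) / 1e9


clock = implied_clock_ghz(CUDA_CORES, PEAK_FLOPS, FLOPS_PER_CORE_PER_CLOCK)
print(f"Implied clock: {clock:.2f} GHz")
```

Under that assumption the quoted peak corresponds to a clock of roughly 1.7 GHz; a different operations-per-clock convention would change the result proportionally.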

Developers can take advantage of NVIDIA's CUDA 10, FleX and PhysX SDKs to create complex simulations, such as particles or fluid dynamics for scientific visualization, virtual environments and special effects.

Availability
Quadro GPUs based on Turing will be initially available in the fourth quarter.

88 Comments on NVIDIA Announces Turing-based Quadro RTX 8000, Quadro RTX 6000 and Quadro RTX 5000

#51
ppn
cucker tarlson: Is there one post on TPU where you don't mention how expensive your gpu was, rejzor :laugh: The last one has it mentioned twice, oh God, he's mad....

If 2080 is really 500mm2 due to tensor and RT cores, we are not getting GT102 on 2080Ti, Titan RTX only. How did you get that 500mm2 number anyway, ppn?
GTX 1080 being 314mm2 and 1080ti - 471mm2. exactly +50%.

so the 256 bit RTX card is very likely 503mm2. Unbelievable, but RTX 2080 will shrink to 256 mm2 on 7nm, as 2085 perhaps, and remain 256 bit. nvidia will make this transition seamlessly.
RH92: For your information RTX 5000 has 3702 cuda cores, which is 118 more cuda cores than 1080Ti!
Yes I saw this typo on their website too.
Posted on Reply
#52
cucker tarlson
What ? 503mm2 ? How did you calculate that ? Do we know the size of tensor cores and RT cores ?
Posted on Reply
#53
T4C Fantasy
CPU & GPU DB Maintainer
RH92: We are primarily talking about RTX 2080 here (or whatever they name it) and I believe it's safe to assume that GT/RT 104 is going to be nowhere near 754mm2, hence why those clocks are achievable. This being said, yes, obviously Turing Titan and 2080Ti will clock lower if that's what you mean. Just for a reminder, we don't know yet if that 754mm2 die is a GT/RT 102, more likely than not it's a GT/RT 100.
754 is 102 since the RTX9000 will be 100
Plus they compared 102 to gp102 and not gp100 or gv100

Also they slipped and told the real transistor count of gp102, originally 12b now it's 11.8 xD
Posted on Reply
#54
ppn
remember how TPU thought big RTX was 676mm2. well it is 754, so not 26x26mm but 27.5x27.5, 1.5mm bigger than the PCB footprint below the substrate.

same thing applies for the small RTX that we also have seen the bare PCB of, not 20x20 but 22x22.
Posted on Reply
#55
cucker tarlson
T4C Fantasy: 754 is 102 since the RTX9000 will be 100
Plus they compared 102 to gp102 and not gp100 or gv100

Also they slipped and told the real transistor count of gp102, originally 12b now it's 11.8 xD
you sure it was tu102 not tu100? are they really going to produce a 754mm chip and a bigger one ?
Posted on Reply
#56
RH92
cucker tarlson: No, it's 3072.
ppn: Yes I saw this typo on their website too.
What typo are you talking about? Nvidia website has RTX 5000 with 3702 cuda cores. I mean unless you guys know better than NVIDIA what's inside their GPUs ......
T4C Fantasy: 754 is 102 since the RTX9000 will be 100
Dude 754mm2 is as big as it gets for Turing , it will only go down from there. Do you think they are going to release another bigger chip just for one card ? Yeah makes 0 sense!
Posted on Reply
#57
nemesis.ie
RejZoR: Problem with cores is, majority of users only have quad cores at best with 8 total threads thanks to SMT. Which places them to a "not applicable" list. We might see that in 5 years time when around 10-12 cores becomes a standard, but for now, just not enough players have them.
Although that's no different to a few years ago with people only having one or two cores. Or indeed new features coming to graphics cards. There is always a period of time needed to build momentum and user base. The good thing is that we are now seeing more than 4 cores and extra threads appearing at the lower (higher volume) price points.
Posted on Reply
#59
efikkan
RejZoR: All this tech fluff is all nice and fancy, but what new releases of cards really turn me on are the new features available NOW and in ANY game.
As Bug said, the chicken and the egg…
I thought you were paying enough attention to know this problem applies to most new achievements, including new API versions, more CPU cores, etc. The sooner it gets out there, the sooner software will utilize it, but that doesn't mean you have to rush and buy it. I believe that the "launch hardware" of all of the last three Direct3D versions have been "outdated" before we've seen decent games using them, especially the last version which we are still waiting for good games.

Raytracing, at least in some form, is the future of computer graphics. And the sooner it gets out there, the sooner game developers and artists start using it, and the sooner AMD will also add support. I honestly think it will take at least two more generations before it becomes powerful enough to be useful in a good selection of games.
RejZoR: Yeah, gotta love idiots who salivate over features on new cards that they won't be able to use anyway until they'll buy a new high end card in 2 years time with same feature set that will actually be used. But what do I know after observing the same thing for basically 2 decades year after year...
Like all of those investing in those "future-proof" GCN-based cards, I bet that investment is going to pay off any day now!
Posted on Reply
#60
Midland Dog
RH92: For your information RTX 5000 has 3702 cuda cores, which is 118 more cuda cores than 1080Ti!
*3072, and as for how you get 500mm squared, it's a rough approximation based on the gigarays and tensor cores as well as the SMs
Posted on Reply
#61
RH92
Midland Dog: *3072, and as for how you get 500mm squared, it's a rough approximation based on the gigarays and tensor cores as well as the SMs
First of all that message was addressed to "ppn" and I never asked how did you get 500mm2, but I guess you got confused juggling between your two accounts. On a side note, considering Midland Dog and ppn are both your accounts, it's hilarious to see you asking questions to yourself. WTF is wrong with people nowadays :kookoo:

Anyway dare to explain where you are pulling 3072 cuda cores from ?

www.nvidia.com/en-us/design-visualization/quadro-desktop-gpus/
QUADRO RTX 5000 QUICK SPECS
CUDA Parallel-Processing Cores: 3,702
NVIDIA Tensor Cores: 384
GPU Memory: 16 GB GDDR6
RT Cores: Yes
Graphics Bus: PCI Express 3.0 x 16
NVLink: Yes
Display Connectors: DP 1.4 (4), VirtualLink (1)
Form Factor: 4.4" (H) x 10.5" (L) Dual Slot
Posted on Reply
#62
jabbadap
RH92: First of all that message was addressed to "ppn" and I never asked how did you get 500mm2, but I guess you got confused juggling between your two accounts.

Anyway dare to explain where you are pulling 3072 cuda cores from ?
It can't be 3702, so that is obviously a typo on the nvidia site. An Nvidia SM has 64 "cores": 4608 can be divided by it (72), 3072 can be divided by it (48), but 3702 can't be divided by it (57.84375).
Posted on Reply
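
The divisibility argument in the comment above can be checked in a few lines. This is a minimal sketch that takes the 64 cores per SM figure from the comment as given; the function name is hypothetical and used only for illustration.

```python
CORES_PER_SM = 64  # per-SM core count assumed in the comment above


def sm_count(cuda_cores, cores_per_sm=CORES_PER_SM):
    """Return the whole number of SMs, or None if the cores do not divide evenly."""
    quotient, remainder = divmod(cuda_cores, cores_per_sm)
    return quotient if remainder == 0 else None


for cores in (4608, 3072, 3702):
    count = sm_count(cores)
    if count is None:
        print(f"{cores}: not a whole number of SMs ({cores / CORES_PER_SM})")
    else:
        print(f"{cores}: {count} SMs")
```

Only 3702 fails the check, which is consistent with the listed figure being a transposition of 3072, provided the 64 cores per SM assumption holds.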
#63
cucker tarlson
RH92: First of all that message was addressed to "ppn" and I never asked how did you get 500mm2, but I guess you got confused juggling between your two accounts.

Anyway dare to explain where you are pulling 3072 cuda cores from ?
Mathematics. It says 3702 which is obviously a typo.
Posted on Reply
#65
RH92
jabbadap: It can't be 3702, so that is obviously a typo on the nvidia site. An Nvidia SM has 64 "cores": 4608 can be divided by it (72), 3072 can be divided by it (48), but 3702 can't be divided by it (57.84375).
cucker tarlson: Mathematics. It says 3702 which is obviously a typo.
How do we know they didn't change the amount of cores per SM?
jabbadap: Yup, their release news have this table:
Ok now it makes more sense.
Posted on Reply
#66
T4C Fantasy
CPU & GPU DB Maintainer
RH92: What typo are you talking about? Nvidia website has RTX 5000 with 3702 cuda cores. I mean unless you guys know better than NVIDIA what's inside their GPUs ......



Dude 754mm2 is as big as it gets for Turing , it will only go down from there. Do you think they are going to release another bigger chip just for one card ? Yeah makes 0 sense!
trust me its 3072, and yes they will make a bigger core, it will be the same core count just with HBM2/3
max core counts are
RT104 - 3072
RT102 - 4608
RT100 - 4608
Posted on Reply
#67
cucker tarlson
RH92: How do we know they didn't change the amount of cores per SM?



Ok now it makes more sense.
Cause 4608 suggests 64/128, and why the hell would they do that.

If 3072 is the full tu104, then 2944 on 2080 (supposedly) is the biggest one we get on a tu104 geforce card, and 2080Ti will be tu102. People said that if tu104 on 2080 is cut down, then 2080Ti will be tu104 full.
Posted on Reply
#68
jabbadap
Yeah it's a typo and all is clear, but let's do the mathematics anyway. The cores-per-SM count should be the same for all the chip configs. 3702 is not a prime, but it can't be divided that much: 3702/2 = 1851 (still not a prime) -> 1851/3 = 617 (prime number) -> so a possible "new" SM is 2*3 = 6 or 617 cores. 4608/6 = 768, possible; 4608/617 = 7.468..., impossible. But then the SM should have Tensor core/s too: 576/768 = 0.75 tensor cores per SM -> not possible.
Posted on Reply
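
The factorization argument in the comment above can be reproduced with a short script. This is a sketch under the comment's own premise that a single SM size must divide both core counts evenly; the helper name is hypothetical, and the 576 tensor core figure is the one used in the comment.

```python
import math


def prime_factors(n):
    """Return the prime factors of n in ascending order, by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


print(prime_factors(3702))  # factors of the disputed core count
print(prime_factors(4608))  # factors of the full-chip core count

# Any SM size shared by both chips has to divide their greatest common divisor.
largest_shared_sm = math.gcd(3702, 4608)
sms_on_full_chip = 4608 // largest_shared_sm
tensor_cores_per_sm = 576 / sms_on_full_chip  # 576 tensor cores, per the comment

print(largest_shared_sm, sms_on_full_chip, tensor_cores_per_sm)
```

Because the largest shared SM size would be 6 cores, and that leaves a fractional number of tensor cores per SM, the script reaches the same conclusion as the comment: 3702 does not fit any plausible SM layout.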
#69
cucker tarlson
jabbadap: Yeah it's a typo and all is clear, but let's do the mathematics anyway. The cores-per-SM count should be the same for all the chip configs. 3702 is not a prime, but it can't be divided that much: 3702/2 = 1851 (still not a prime) -> 1851/3 = 617 (prime number) -> so a possible "new" SM is 2*3 = 6 or 617 cores. 4608/6 = 768, possible; 4608/617 = 7.468..., impossible. But then the SM should have Tensor core/s too: 576/768 = 0.75 tensor cores per SM -> not possible.
no way, 617 cuda per sm :laugh:

each sm has warp schedulers and registers, and you wanna keep it small in order for it to be efficient. you wanna simplify scheduling for better perf and efficiency, not complicate it. and they are arranged in pairs, a 128 cuda unit on pascal cards consists of two 64 cuda units.
Posted on Reply
#70
jabbadap
cucker tarlson: Are you crazy, 617 cuda per sm? that'd ruin performance
Well look again. That was not even theoretically possible, 6 cuda cores per SM was without tensors.
Posted on Reply
#71
cucker tarlson
It's 3072, end of story. And how would they segment their lower tier cards with such huge SMs?
Posted on Reply
#72
diatribe
ppn: GTX 1080 being 314mm2 and 1080ti - 471mm2. exactly +50%.
It's actually a 125% increase moving from 314mm2 to 471mm2. Remember, area is calculated by multiplying both sides of the chip.

314 squared is 98,596 and 471 squared is 221,841; a 125% increase.
Posted on Reply
#73
cucker tarlson
diatribe: It's actually a 125% increase moving from 314mm2 to 471mm2. Remember, area is calculated by multiplying both sides of the chip.

314 squared is 98,596 and 471 squared is 221,841; a 125% increase.
why would you square 314 ? it's 314 "mm square" already.
Posted on Reply
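
The disagreement in the two comments above comes down to whether 314 and 471 are areas or side lengths. A small sketch, treating the quoted figures as die areas in mm2 (which is how they are written in the thread), shows where each percentage comes from; the variable and function names are illustrative only.

```python
import math

GTX_1080_AREA_MM2 = 314     # die area quoted in the thread
GTX_1080_TI_AREA_MM2 = 471  # die area quoted in the thread


def percent_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


# The figures are already areas, so the area increase is computed directly.
area_increase = percent_increase(GTX_1080_AREA_MM2, GTX_1080_TI_AREA_MM2)

# Squaring the areas again, as in the 125% claim, treats them as side lengths.
squared_increase = percent_increase(GTX_1080_AREA_MM2 ** 2, GTX_1080_TI_AREA_MM2 ** 2)

# For a square die, the side length grows only with the square root of the area.
side_increase = (math.sqrt(GTX_1080_TI_AREA_MM2 / GTX_1080_AREA_MM2) - 1) * 100

print(f"area: +{area_increase:.0f}%")
print(f"areas squared again: +{squared_increase:.0f}%")
print(f"side length of a square die: +{side_increase:.1f}%")
```

The 50% figure is the area increase; 125% only appears if the areas are squared a second time.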
#74
jabbadap
cucker tarlson: It's 3072, end of story. And how would they segment their lower tier cards with such huge SMs?
Again it was a very silly debate, and you still get them the wrong way around. Without tensors it could have been 617 x 6 cuda core SMs on the 3702cc card and 768 x 6 cuda core SMs on the 4608cc card (it was not possible the other way around). So it's not huge, it's very tiny SMs.
Posted on Reply
#75
cucker tarlson
jabbadap: Again it was a very silly debate, and you still get them the wrong way around. Without tensors it could have been 617 x 6 cuda core SMs on the 3702cc card and 768 x 6 cuda core SMs on the 4608cc card (it was not possible the other way around). So it's not huge, it's very tiny SMs.
I get it now, but it's very stupid.
Posted on Reply