Thursday, November 14th 2013
NVIDIA Dramatically Simplifies Parallel Programming With CUDA 6
NVIDIA today announced NVIDIA CUDA 6, the latest version of the world's most pervasive parallel computing platform and programming model.
The CUDA 6 platform makes parallel programming easier than ever, enabling software developers to dramatically decrease the time and effort required to accelerate their scientific, engineering, enterprise and other applications with GPUs. It offers new performance enhancements that enable developers to instantly accelerate applications up to 8X by simply replacing existing CPU-based libraries. Key features of CUDA 6 include:
"Our technologies have helped major studios, game developers and animators create visually stunning 3D animations and effects," said Paul Doyle, CEO at Fabric Engine, Inc. "They have been urging us to add support for acceleration on NVIDIA GPUs, but memory management proved too difficult a challenge when dealing with the complex use cases in production. With Unified Memory, this is handled automatically, allowing the Fabric compiler to target NVIDIA GPUs and enabling our customers to run their applications up to 10X faster."
In addition to the new features, the CUDA 6 platform offers a full suite of programming tools, GPU-accelerated math libraries, documentation and programming guides.
Version 6 of the CUDA Toolkit is expected to be available in early 2014. Members of the CUDA-GPU Computing Registered Developer Program will be notified when it is available for download. To join the program, register here.
For more information about the CUDA 6 platform, visit NVIDIA booth 613 at SC13, Nov. 18-21 in Denver, and the NVIDIA CUDA website.
The CUDA 6 platform makes parallel programming easier than ever, enabling software developers to dramatically decrease the time and effort required to accelerate their scientific, engineering, enterprise and other applications with GPUs.It offers new performance enhancements that enable developers to instantly accelerate applications up to 8X by simply replacing existing CPU-based libraries. Key features of CUDA 6 include:
- Unified Memory -- Simplifies programming by enabling applications to access CPU and GPU memory without the need to manually copy data from one to the other, and makes it easier to add support for GPU acceleration in a wide range of programming languages (a brief code sketch follows this list).
- Drop-in Libraries -- Automatically accelerates applications' BLAS and FFTW calculations by up to 8X by simply replacing the existing CPU libraries with the GPU-accelerated equivalents.
- Multi-GPU Scaling -- Re-designed BLAS and FFT GPU libraries automatically scale performance across up to eight GPUs in a single node, delivering over nine teraflops of double precision performance per node, and supporting larger workloads than ever before (up to 512 GB). Multi-GPU scaling can also be used with the new BLAS drop-in library.
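To make the Unified Memory model concrete, here is a minimal sketch in CUDA C; the scale kernel, array size, and launch configuration are illustrative choices of ours, not from the release. A single managed allocation is visible to both the host and the GPU, with no explicit cudaMemcpy in either direction:

    // Unified Memory sketch: one allocation, no manual host<->device copies.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= factor;
    }

    int main()
    {
        const int n = 1024;
        float *data;

        // One managed allocation replaces separate host and device buffers.
        cudaMallocManaged(&data, n * sizeof(float));

        for (int i = 0; i < n; ++i)   // the host writes directly
            data[i] = 1.0f;

        scale<<<(n + 255) / 256, 256>>>(data, 2.0f, n);
        cudaDeviceSynchronize();      // wait before the host touches the data again

        printf("data[0] = %f\n", data[0]);  // read the result back without a copy
        cudaFree(data);
        return 0;
    }

The drop-in libraries take a different route: as we understand it, the NVBLAS library shipped with the toolkit can be linked or preloaded in place of a host BLAS, so existing binaries pick up GPU acceleration without recompilation.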
"Our technologies have helped major studios, game developers and animators create visually stunning 3D animations and effects," said Paul Doyle, CEO at Fabric Engine, Inc. "They have been urging us to add support for acceleration on NVIDIA GPUs, but memory management proved too difficult a challenge when dealing with the complex use cases in production. With Unified Memory, this is handled automatically, allowing the Fabric compiler to target NVIDIA GPUs and enabling our customers to run their applications up to 10X faster."
In addition to the new features, the CUDA 6 platform offers a full suite of programming tools, GPU-accelerated math libraries, documentation and programming guides.
Version 6 of the CUDA Toolkit is expected to be available in early 2014. Members of the CUDA-GPU Computing Registered Developer Program will be notified when it is available for download. To join the program, register here.
For more information about the CUDA 6 platform, visit NVIDIA booth 613 at SC13, Nov. 18-21 in Denver, and the NVIDIA CUDA website.
48 Comments on NVIDIA Dramatically Simplifies Parallel Programming With CUDA 6
And while we're on the subject of stalking, Jorge my boy... the new AMD driver you praise so much in every other post made the cards scream even louder than before. It brought the cards closer in performance, but there's still a 10-15% variance, achieved simply by making the fan spin faster...
That troll has grown to such heights because we keep feeding it... Let's just ignore it and soon it will die... JORGE R.I.P....
BTW, CUDA has supported a unified pool of memory since release 4.0 if I remember correctly (and yes, that was released in 2011, well before hUMA was even mentioned by AMD). All version 6 does is remove the burden on programmers of accessing that unified pool when writing CUDA code; it also eliminates some of the overhead caused by writing from system memory to GPU memory, but not all of it.
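For context, here is a sketch of the pre-CUDA-6 "unified pool" the comment refers to: unified virtual addressing with zero-copy mapped pinned memory, available since CUDA 4.0. The kernel and sizes are illustrative. The same buffer is addressable from both sides, but every device access crosses PCIe, which is the residual overhead mentioned above:

    // CUDA 4.0-era zero-copy: pinned host memory mapped into the GPU's address space.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= factor;   // each access travels over PCIe
    }

    int main()
    {
        const int n = 1024;
        float *h_data, *d_data;

        // Pinned host allocation, mapped into the device address space.
        cudaHostAlloc(&h_data, n * sizeof(float), cudaHostAllocMapped);
        for (int i = 0; i < n; ++i)
            h_data[i] = 1.0f;

        // Obtain the device-side alias of the same buffer (with UVA on
        // 64-bit platforms the two pointers are typically identical).
        cudaHostGetDevicePointer(&d_data, h_data, 0);

        scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
        cudaDeviceSynchronize();

        printf("%f\n", h_data[0]);
        cudaFreeHost(h_data);
        return 0;
    }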
As for CUDA going the way of the dodo, I think Amazon, the NCSA and Oak Ridge National Laboratory, among many others, would beg to differ...
nVidia can only dream of having someone with your expertise onboard. :respect:
1. Tesla = CUDA
2. Fermi = FP64
3. Kepler = Dynamic Parallelism
4. Maxwell = Unified Virtual Memory
5. Volta = Stacked DRAM
This Unified Virtual Memory thing does look like hUMA, or does someone actually understand how Unified Virtual Memory works? :wtf:
And on another topic: Fermi went through that FP64 phase, but in Kepler it was all removed, so was all that work a failure or what? :D
I have to agree that it seems like a copy of hUMA, but hUMA is for APUs; an example would be something like Intel's Haswell and Haswell-E SoCs. To say NVidia is copying it, NVidia would need to produce APUs like AMD does for that statement to be more valid. Intel hasn't copied hUMA, and they produce APUs of their own, they just don't call them APUs like AMD does; for the most part, they are the same thing... NVidia GPUs utilizing system memory alongside dedicated GPU RAM doesn't seem innovative, and innovation is something NVidia has been synonymous with for a while. Now, if the rumors are true and NVidia eventually ventures into the server market besides continuing its push into the tablet/cellphone market, they will eventually have a copy-cat of hUMA, and NVidia will be in competition with Intel again, besides AMD, in that market. This is another desperate NVidia move to produce more revenue...
Right now, G-Sync modules have Tegra 4 chips on them, mainly to help NVidia liquidate leftover inventory, since SHIELD and the tablets containing those chips aren't selling like hot-cakes. I suspect their Q4 revenue reports will start to show signs of decline... I strongly feel the GTX 780 Ti (or "GTX 780 Titan") isn't selling well either: 7% more CUDA cores, marginally improved core frequencies, and full D3D11.2/DirectCompute support for another whopping $699.99 per unit. Especially when the GTX Titan is still at $1,000.00, and third-party reviews show the GTX 780 Ti in SLI has a tendency to drop frames in certain titles. Brain-dead NVidia fanboys won't admit it, but the kick to their privates after purchasing a GTX Titan or 780... I bet it hurts. Pride + stupidity = epic fail...
AMD right now reigns supreme in multi-GPU solutions and cost efficiency; CrossFireX over the PCIe bus seems to have fixed AMD's issues with multi-GPU computing. Since AMD won the console wars against NVidia (it sucks that NVidia doesn't produce APUs of their own), AMD in a way gets to call the shots on upcoming console games for the next 10 years. Star Citizen, a highly anticipated MMO space-shooter, will be optimized for AMD GPUs, with AMD Mantle supporting it... I suspect EQN will be optimized for AMD as well, besides the idea that they will be using Havok. Elder Scrolls Online might be another title optimized for AMD GPUs, if the rumors I've heard about it are true...
The only problem is that AMD cards work better with float4, which is not the best fit for many GPGPU applications.
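A rough illustration of what "works better with float4" means in practice (the kernel name and shapes are ours): a vectorized kernel moves 16 bytes per thread per memory transaction, which suited the wide vector ALUs of older AMD architectures, but many GPGPU workloads don't decompose into such packed four-wide operations:

    // Vectorized float4 processing: one 16-byte load and one 16-byte store per thread.
    __global__ void scale_vec4(float4 *data, float factor, int n4)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n4) {
            float4 v = data[i];        // single 16-byte load
            v.x *= factor; v.y *= factor;
            v.z *= factor; v.w *= factor;
            data[i] = v;               // single 16-byte store
        }
    }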
For them, all hail :respect: NVIDIA
:roll:
CUDA can actually be used and demonstrated; AMD's implementations are just on paper, so I don't get how people can draw conclusions lol.
Just a quick reply regarding games: you might want to explain this.
AMD fanboy burns an AMD GPU and pretends it's a 780 Ti