Monday, February 28th 2011
![NVIDIA](https://tpucdn.com/images/news/nvidia-v1739475473466.png)
New CUDA 4.0 Release Makes Parallel Programming Easier
NVIDIA today announced the latest version of the NVIDIA CUDA Toolkit for developing parallel applications using NVIDIA GPUs. The NVIDIA CUDA 4.0 Toolkit was designed to make parallel programming easier, and enable more developers to port their applications to GPUs. This has resulted in three main features:
"Having access to GPU computing through the standard template interface greatly increases productivity for a wide range of tasks, from simple cashflow generation to complex computations with Libor market models, variable annuities or CVA adjustments," said Peter Decrem, director of Rates Products at Quantifi. "The Thrust C++ library has lowered the barrier of entry significantly by taking care of low-level functionality like memory access and allocation, allowing the financial engineer to focus on algorithm development in a GPU-enhanced environment."
The CUDA 4.0 architecture release includes a number of other key features and capabilities, including:
For more information on the features and capabilities of the CUDA Toolkit and on GPGPU applications, please visit: http://www.nvidia.com/cuda
- NVIDIA GPUDirect 2.0 Technology -- Offers support for peer-to-peer communication among GPUs within a single server or workstation. This enables easier and faster multi-GPU programming and application performance.
- Unified Virtual Addressing (UVA) -- Provides a single merged-memory address space for the main system memory and the GPU memories, enabling quicker and easier parallel programming.
- Thrust C++ Template Performance Primitives Libraries -- Provides a collection of powerful open source C++ parallel algorithms and data structures that ease programming for C++ developers. With Thrust, routines such as parallel sorting are 5X to 100X faster than with Standard Template Library (STL) and Threading Building Blocks (TBB).
"Having access to GPU computing through the standard template interface greatly increases productivity for a wide range of tasks, from simple cashflow generation to complex computations with Libor market models, variable annuities or CVA adjustments," said Peter Decrem, director of Rates Products at Quantifi. "The Thrust C++ library has lowered the barrier of entry significantly by taking care of low-level functionality like memory access and allocation, allowing the financial engineer to focus on algorithm development in a GPU-enhanced environment."
The CUDA 4.0 release includes a number of other key features and capabilities:
- MPI Integration with CUDA Applications -- Modified MPI implementations automatically move data from and to the GPU memory over Infiniband when an application does an MPI send or receive call.
- Multi-thread Sharing of GPUs -- Multiple CPU host threads can share contexts on a single GPU, making it easier to share a single GPU by multi-threaded applications.
- Multi-GPU Sharing by Single CPU Thread -- A single CPU host thread can access all GPUs in a system. Developers can easily coordinate work across multiple GPUs for tasks such as "halo" exchange in applications.
- New NPP Image and Computer Vision Library -- A rich set of image transformation operations that enable rapid development of imaging and computer vision applications.
- New and Improved Capabilities:
o Auto performance analysis in the Visual Profiler
o New features in cuda-gdb and added support for Mac OS X
o Added support for C++ features like new/delete and virtual functions
o New GPU binary disassembler
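To make the GPUDirect 2.0, unified virtual addressing, and single-thread multi-GPU items above concrete, here is a minimal sketch (not from NVIDIA's release notes) of one host thread driving two GPUs with the CUDA runtime API. It assumes two peer-capable devices and omits error checking:

```cpp
// Minimal sketch: one host thread targets two GPUs, enables GPUDirect 2.0
// peer-to-peer access, and copies a buffer directly between the two GPUs'
// memories. Error checking is omitted for brevity.
#include <cuda_runtime.h>
#include <cstdio>
#include <cstddef>

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    if (deviceCount < 2) {
        std::printf("This sketch assumes at least two CUDA GPUs.\n");
        return 0;
    }

    const std::size_t bytes = 1 << 20;
    float *buf0 = NULL, *buf1 = NULL;

    // A single host thread can now target several devices by switching
    // the current device with cudaSetDevice.
    cudaSetDevice(0);
    cudaMalloc((void**)&buf0, bytes);
    cudaSetDevice(1);
    cudaMalloc((void**)&buf1, bytes);

    // Enable peer-to-peer access from device 0 to device 1 if supported.
    int canAccessPeer = 0;
    cudaDeviceCanAccessPeer(&canAccessPeer, 0, 1);
    if (canAccessPeer) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);
    }

    // With unified virtual addressing the runtime works out where each
    // pointer lives, so a single call moves data from GPU 1 to GPU 0
    // (cudaMemcpyPeer is the explicit alternative).
    cudaSetDevice(0);
    cudaMemcpy(buf0, buf1, bytes, cudaMemcpyDefault);

    cudaFree(buf0);
    cudaSetDevice(1);
    cudaFree(buf1);
    return 0;
}
```

Before CUDA 4.0, each GPU generally needed its own host thread and device-to-device transfers had to be staged through system memory; the point of these features is that the same work can now be expressed directly from a single thread.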
For more information on the features and capabilities of the CUDA Toolkit and on GPGPU applications, please visit: http://www.nvidia.com/cuda
77 Comments on New CUDA 4.0 Release Makes Parallel Programming Easier
I meant assisting that same CUDA community in transferring their knowledge to OpenCL instead, and gradually terminating the CUDA brand.
This new CUDA 4.0 announcement shows they're doing the exact opposite. It shows they're still trying to force a vendor-specific API, while there are already capable and "open" alternatives.
They're actively trying to cock-block every other GPU maker from doing GPGPU, and their support for OpenCL is just the Plan B.
That's why I call it "evil".
Edit: I do not believe that Nvidia is forcing the adoption of CUDA and cockblocking other GPU makers, because while they have not opened CUDA to the public, Intel has not made x86 "freeware" either, despite the overwhelming penetration of x86-based general-purpose computers. Only AMD avoids paying Intel a license for x86, because of some fancy agreement they made way back, and even VIA pays Intel licensing fees on a product which arguably should have become "public domain" by now.
This kind of tit-for-tat is exactly why I like TPU.
One never stops learning about the pros and cons of <insert argument here>.
A question, if I may (as I'm not brand-loyal to any camp - hell, give me Matrox purity anytime):
why is there such a disparity between the different camps?
Is it truly because one is classed as a more established platform (free or not), or because one platform has a more established user-base over the other (CUDA vs OpenCL)?
/me is curious
It should be pretty obvious that a bunch of bickering vendors will naturally produce something inferior. Quit hoping for the superior solution to die and actually improve your favorite one.
Well, Nvidia offered CUDA first, and advertised it heavily (and also provided quite a bit of support). OpenCL came later (ever so slightly later), and since OpenCL is "open", as in everybody can use it royalty-free as long as the hardware supports it, people viewed it as "the right path", and they are largely right. But as you correctly pointed out, by the time OpenCL became mainstream, the CUDA base had grown to quite a large proportion of GPGPU users, and OpenCL was almost to the point of being ignored. So the community decided to become free advertising for the greater good by supporting OpenCL, and this is where it got ugly. CUDA users still want their support, but OpenCL should be the future. Kudos to Nvidia for supporting both, but people thought Nvidia was still stifling OpenCL. That might be the case, I do not know, but for now I am content with the fact that Nvidia supports OpenCL, regardless of the amount of flame Nvidia is supposedly throwing at OpenCL (I have yet to see any).
We need fewer of these posts: if you know something, or want to voice your opinion in a sensible way (even if it is wrong), by all means do it. But coming in and shouting "blow up Khronos Group" (exaggerated for effect) and things like that is better kept in GN; I do not wish for this thread to descend into amateurish egg-pelting.
It's not Nvidia's (nor AMD's, nor Intel's, nor Apple's) responsibility to make the shift to OpenCL; it's the developers' responsibility. They don't even have the right to force developers to spend more money and time on something they don't really need at this point (by dropping CUDA support). All the people who invested in CUDA (and right now that's a lot of people in the scientific and accounting businesses, to name a few) invested in Nvidia cards too, for obvious reasons*, so there's absolutely no need for them to move to an alternative that would cost them more (because of the change) and would have zero benefits, or would even hurt their performance.
Developers will move to OpenCL when and if they want to, which is going to be when that change offers them an advantage.
* In case it's not so obvious, it was the only alternative back then.
I don't know about there, but here the Voodoo3 sold much, much more than any other card, including the Riva TNT and TNT2. The only thing the TNT2 did better was 32-bit color support, and that's all.
At 16-bit (90%+ of games) the Voodoo3 was a lot faster, and back at the time that made it more successful, again, at least here. The Glide mode that was present in every game I owned back then was far superior to the OpenGL or DirectX counterparts. Granted, you may call those games old by the time the Voodoo3 launched, since I'm referring to UT and Half-Life...
www.guru3d.com/review/3dfx/voodoo3_3000/index3.html
The thing is that back at the time I bought a TNT2 because the seller advised me to, but I had to return it soon after because the drivers sucked (artifacts) and there was some kind of incompatibility or something with my Athlon PC. I returned the card to the store twice because we couldn't find the problem, and even brought my PC there to see if they could fix it**. Nothing worked, so they gave me the option to get a Voodoo3 and I never looked back. It was significantly faster in the games I had (I played mainly UT, Half-Life and DID's flight simulators EF2000 and F-22 ADF) and had superb antialiasing, which I don't remember the TNT having.
**It was not a normal store; they were geeks that helped you, an amazing concept for consumers, which apparently failed because they found you the best deals you could get and not the best deals for them.
Your own biased perception of how things are (i.e. OpenCL is plan B, etc.) does not make it true. It is not true at all, and if you have the smallest proof of that, please feel free to post it. In the meantime the facts point out that you are wrong. Nvidia is the first one releasing OpenCL drivers for every OpenCL version, and that lets everybody develop for OpenCL months in advance of what they could do if they had to wait for others' drivers. How releasing OpenCL drivers 3 months earlier than the competition hurts OpenCL to the benefit of CUDA just escapes my comprehension. You would think that if they wanted to slow down OpenCL they would release them after the competition, or maybe 1 week before the competition in order to brag about it, but 3 months? No, no. If they were doing that, they wouldn't be the first ones giving out drivers to everybody... even one of AMD's most-mentioned OpenCL applications started out on Nvidia's OpenCL drivers before there were AMD drivers: Bullet Physics.
This means only some 46% of discrete desktop cards support it, and an even smaller share of IGPs (and that share is obviously going to drop drastically, since nVidia has quit the IGP business).
And if we talk about mobile SoCs, Tegra 3 is probably the only next-gen mobile GPU that's not OpenCL-capable (they're not even going with unified shaders for T3).
And OpenCL should become BIG in handhelds, in years to come.
1. As you said, to promote their (NVIDIA) own cards.
2. Further GPGPU development.
What people (average users, and even above-average users) don't know is that NVIDIA is backporting features from CUDA into OpenCL development. This is evident in the OpenCL 1.1 man pages, with regard to address spaces and the built-in functions.
And why are you guys arguing about Glide and non-related topics in a CUDA thread? :P
Tegra 3 might not support OpenCL, but that's Nvidia's own fault: it's almost like AMD not supporting x86, and that's just plain stupid rather than being an ass.
It just so happens that CUDA is the preferred choice among "the masses", so of course they're going to roll with it since it's popular.
Again, if you need reasons why it's popular among "the masses":
1. Open source, as in software, not in hardware.
2. Dedicated development and support.
3. Ease of implementation to existing GPGPU applications.
4. Easy portability between itself and OpenCL.
To summarize, it's because it works as it should in an efficient manner, not because NVIDIA is helping humanity or some other bullshit. History might repeat itself with OpenCL here, where CUDA = DirectX and OpenCL = OpenGL. :wtf:
All AMD has to do is allow full low-level access to the memory buffer on their cards (not possible as of the HD 6xxx series) and support bitwise and integer functions (why they don't, I have no idea; it probably has to do with their stream processor implementation). Intel got it right with Sandy Bridge.
- Nvidia supports CUDA for the obvious reasons you mentioned.
- Developers who use CUDA use it for their own reasons, which you already explained.
- Users. I'm neither Nvidia nor a GPGPU developer, so I just explained why I, as a potential user, support it: I find that the apps being made with CUDA (and which I'm sure will be ported to OpenCL as soon as it becomes an equal ecosystem) are very beneficial to humanity, and that's something worth supporting.
I thought I had to make this point clear.
I think we can all agree that PhysX is the evil one here, since it is proprietary and wholly restricted to NVIDIA cards. :mad:
The open-source alternative to Photoshop is GIMP; why not support it? The open-source alternative to DirectX is OpenGL; why not support it? Why not use OpenOffice instead of Office? Why bash CUDA just because you can't use one program on your video card?
Have you considered that you are not the target market for this technology? The target audience has the correct tools to run these programs, and more importantly they don't care whether the program is based on open-source technology. They want it to be fast and to have the features they need; they are paying money for it and expect to receive what they paid for.
By the way, the irony is that NVIDIA's implementation of OpenCL is better than the other solutions. Your argument does not make sense to me. CUDA must vanish only when it no longer has the ability to compete with OpenCL, but right now the situation is completely different.
I use Paint.net rather than Gimp.
I use Abiword rather than Office...
I dislike proprietary marketing, and everybody should care whether or not something is open source, unless they have a product to sell or feel that a particular company's merchandise deserves blind adulation. I don't mind being dependent on a specific technology, but I do not want to be dependent on a specific corporation, where I can help it.
CUDA might work and might work well, but the primacy of an open alternative would be better for all of us, from a consumer's point of view. I hope that clarifies my position.
Even if CUDA is faster at some things, if it only works on one specific type of hardware, it's never going to be as widely used as something cross-platform would be.