Monday, February 28th 2011
New CUDA 4.0 Release Makes Parallel Programming Easier
NVIDIA today announced the latest version of the NVIDIA CUDA Toolkit for developing parallel applications using NVIDIA GPUs. The NVIDIA CUDA 4.0 Toolkit was designed to make parallel programming easier and to enable more developers to port their applications to GPUs. The release is built around three main features:
"Having access to GPU computing through the standard template interface greatly increases productivity for a wide range of tasks, from simple cashflow generation to complex computations with Libor market models, variable annuities or CVA adjustments," said Peter Decrem, director of Rates Products at Quantifi. "The Thrust C++ library has lowered the barrier of entry significantly by taking care of low-level functionality like memory access and allocation, allowing the financial engineer to focus on algorithm development in a GPU-enhanced environment."
The CUDA 4.0 architecture release includes a number of other key features and capabilities, including:
For more information on the features and capabilities of the CUDA Toolkit and on GPGPU applications, please visit: http://www.nvidia.com/cuda
- NVIDIA GPUDirect 2.0 Technology -- Offers support for peer-to-peer communication among GPUs within a single server or workstation, enabling easier and faster multi-GPU programming and better application performance.
- Unified Virtual Addressing (UVA) -- Provides a single, merged address space for main system memory and the GPU memories, enabling quicker and easier parallel programming (see the sketch after this list).
- Thrust C++ Template Performance Primitives Libraries -- Provides a collection of powerful open-source C++ parallel algorithms and data structures that ease programming for C++ developers. With Thrust, routines such as parallel sorting are 5X to 100X faster than with the Standard Template Library (STL) or Threading Building Blocks (TBB); an example follows below.
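To make the first two items concrete, here is a minimal sketch of how they surface in the CUDA runtime API, assuming two peer-capable GPUs; the device IDs and buffer size are illustrative, and error checking is omitted.

    #include <cuda_runtime.h>

    int main() {
        const size_t N = 1 << 20;   // illustrative buffer size
        float *d0 = 0, *d1 = 0;

        // Allocate a buffer on each GPU. Under UVA, both pointers live in
        // one unified address space, so the runtime can tell them apart.
        cudaSetDevice(0);
        cudaMalloc(&d0, N * sizeof(float));
        cudaSetDevice(1);
        cudaMalloc(&d1, N * sizeof(float));

        // GPUDirect 2.0 peer-to-peer: allow the current device (GPU 1)
        // to access GPU 0's memory directly.
        cudaDeviceEnablePeerAccess(0, 0);

        // With UVA, cudaMemcpyDefault lets the runtime infer the copy
        // direction from the pointers -- no explicit DeviceToDevice flag.
        cudaMemcpy(d1, d0, N * sizeof(float), cudaMemcpyDefault);

        cudaFree(d1);
        cudaSetDevice(0);
        cudaFree(d0);
        return 0;
    }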
"Having access to GPU computing through the standard template interface greatly increases productivity for a wide range of tasks, from simple cashflow generation to complex computations with Libor market models, variable annuities or CVA adjustments," said Peter Decrem, director of Rates Products at Quantifi. "The Thrust C++ library has lowered the barrier of entry significantly by taking care of low-level functionality like memory access and allocation, allowing the financial engineer to focus on algorithm development in a GPU-enhanced environment."
The CUDA 4.0 release also includes a number of other key features and capabilities:
- MPI Integration with CUDA Applications -- Modified MPI implementations automatically move data to and from GPU memory over InfiniBand when an application makes an MPI send or receive call.
- Multi-thread Sharing of GPUs -- Multiple CPU host threads can share contexts on a single GPU, making it easier for multi-threaded applications to share a single GPU.
- Multi-GPU Sharing by Single CPU Thread -- A single CPU host thread can access all GPUs in a system. Developers can easily coordinate work across multiple GPUs for tasks such as "halo" exchange in applications (see the sketch after this list).
- New NPP Image and Computer Vision Library -- A rich set of image transformation operations that enable rapid development of imaging and computer vision applications.
- New and Improved Capabilities:
o Auto performance analysis in the Visual Profiler
o New features in cuda-gdb and added support for MacOS
o Added support for C++ features like new/delete and virtual functions
o New GPU binary disassembler
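Putting the multi-GPU items together, below is a hedged sketch of the single-host-thread model and the "halo" exchange pattern named above: one CPU thread selects each GPU in turn with cudaSetDevice, runs a step of work, then pushes its boundary region to the next GPU with a peer-to-peer copy. The stepKernel, halo width, and slab layout are hypothetical stand-ins, and error checking is omitted. With a CUDA-aware MPI build, the same device pointers could be passed straight to MPI send/receive calls for the cross-node case.

    #include <cuda_runtime.h>

    #define HALO 256  // illustrative halo width, in elements

    // Hypothetical stencil step; stands in for the real computation.
    __global__ void stepKernel(float *buf, size_t n) {
        size_t i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) buf[i] += 1.0f;
    }

    int main() {
        int ngpu = 0;
        cudaGetDeviceCount(&ngpu);
        if (ngpu > 8) ngpu = 8;   // cap to the fixed-size array below

        const size_t N = 1 << 20;
        float *d[8] = {0};

        // One host thread allocates a slab on every GPU in the system.
        for (int g = 0; g < ngpu; ++g) {
            cudaSetDevice(g);
            cudaMalloc(&d[g], N * sizeof(float));
        }

        // Compute a step on every GPU from the same host thread.
        for (int g = 0; g < ngpu; ++g) {
            cudaSetDevice(g);
            stepKernel<<<(N + 255) / 256, 256>>>(d[g], N);
        }

        // "Halo" exchange: copy the last HALO elements of each GPU's slab
        // into the front of its neighbor's slab, GPU to GPU.
        for (int g = 0; g + 1 < ngpu; ++g) {
            cudaMemcpyPeer(d[g + 1], g + 1, d[g] + N - HALO, g,
                           HALO * sizeof(float));
        }

        for (int g = 0; g < ngpu; ++g) {
            cudaSetDevice(g);
            cudaFree(d[g]);
        }
        return 0;
    }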
For more information on the features and capabilities of the CUDA Toolkit and on GPGPU applications, please visit: http://www.nvidia.com/cuda
77 Comments on New CUDA 4.0 Release Makes Parallel Programming Easier
In 2011, I see CUDA solely as nVidia's "evil" commitment to try to keep GPGPU to themselves and closed-source.
Edit: oh, the newest is 3.2.
CUDA is easily portable to OpenCL, but there will be performance issues when compiling for AMD cards. NVIDIA has supported OpenCL just as long as AMD has. In fact, from personal experience, their implementation seems more solid than AMD's current SDK.
And even then, it's well known that OpenGL has been one, even two steps behind, and still is in many ways. It's also known how that has affected the market, and most people would agree that advancement in DX has been a good thing. Well, it works the other way around too, and that's the thing people don't seem to understand: OpenCL may be cross-platform, but its optimizations certainly aren't. Code optimized for Nvidia GPUs would be slow on AMD GPUs, and code optimized for AMD would be slow on Nvidia. Developers still have to code specifically for every platform, so what's so bad about Nvidia offering a much better and more mature solution? Should Nvidia deliberately hold back its development so that the open-to-all platform can catch up? Should the enterprise world (i.e. medical/geological imaging) wait two more years for what it could have now, just because you don't want to feel at a disadvantage in that little meaningless application or that stupid game? Come on...
"To hell the ability to best diagnose cancer or predict hearthquakes/tornados, I want this post process filter run as fast in my card as in that other one. That surely should be way up on their list, and to hell the rest. After all, I spend millions helping in the development of GPGPU and/or paying for the program afterwards... NO. Wait. That's the enterprises :banghead:, I'm actually the little whinny boy that demands that the FREE feature I get with my $200 GPU is "fair".
No matter what you say about CUDA being more developed than OpenCL, the truth is that nVidia works on CUDA in order to differentiate its GPUs, not just to help the computing community. DirectX and OpenGL aren't vendor-specific, either, and at least in their case all participants get a fighting chance through driver optimization.
What is so odd and stupid to you seems pretty simple to me: it only works on their GPUs. It's in all customers' interest to have competitive choices from various brands. No, they should redirect their CUDA efforts, because it is a vendor-specific API, and as such it has no long-term future. LOL, yeah, convince yourself that's the reason nVidia is pushing CUDA, in an era when a dozen GPU makers (nVidia, AMD, VIA, PowerVR, ARM Mali, Vivante, Broadcom, Qualcomm, etc.) are supporting OpenCL in their latest GPUs.
Vendor-specific is meaningless in the enterprise world and always has been; EVERYTHING is vendor-specific in the enterprise world. They compile their code, x86 code, for the specific CPU brand they chose for their servers, using the best compiler available for their needs, and they've been doing it for decades. But now it's bad because it's Nvidia...
SOOOO once again, what's wrong with Nvidia delivering the best API they can to those customers?
What you fail to understand is that Nvidia does not need to drop CUDA in order to support OpenCL. In fact, every single feature and every single optimization they make for CUDA can help develop and evolve OpenCL. And when the most competitive, robust and easy-to-use combo right now is an Nvidia GPU + CUDA, it is in customers' best interest to get that and not wait 2+ years until OpenCL is in the same state on either AMD or Nvidia... really, it's not that hard to understand... :shadedshu
"How about OpenCL then? Isn't it better to support an open source project?" Reply: I don't give a s**t as long as I can finish my work with the least amount of hassle, and CUDA supports that view.
Probably in the future OpenCL will be the leader, but for now CUDA does the job more efficiently than OpenCL.
If anything, AMD/ATI should've taken the offer to utilize CUDA in their GPUs back when NVIDIA was giving them the chance. With that kind of backing, it could've formed a true basis for OpenCL, especially since even Apple was thinking about using it as its foundation in the beginning, before the Khronos Group adopted it.
When OpenCL catches up, then we can talk about how CUDA might be a hindrance to the market.
You also fail to understand that this has been nVidia's strategy for quite some time.
As Jen-Hsun Huang said, "we're a software company". Sure, it's been there for longer.
And so was Glide, and look how that came down. lol, wrong: costs go way down if you adopt open-source software. x86 code can run on chips from all x86 CPU makers, hence why sometimes we see design wins for AMD and other times for Intel.
Well, there was this instruction-set-specific tryout from Intel in the server market. Look how well that went, lol. That's because there are non-vendor-exclusive alternatives, open source or not. And what you fail to understand is that nVidia could have done that same optimization in OpenCL to start with. 2 years?!? LOL. I just listed eight GPU vendors pushing OpenCL 1.1 compatibility in their latest GPUs right now.
Yes, CUDA has been around longer, receives more support, and is a better product in almost all ways than OpenCL. That alone should be enough reason why people choose CUDA: not everybody is bothered about "open source" and things like that, they just want to complete their work.
Initial costs for open source are low, but once you factor in support they go right back up. Also, I don't really see the difference between CUDA and OpenCL: both are "free", not in the traditional sense, but in the relative sense.
Intel tried to break away from x86, its own standard, and it failed hard. Not applicable here.
Yes, Nvidia could have done the same optimizations in OpenCL from the start, but on the other hand, OpenCL was still in its infancy when Nvidia started pushing CUDA. I think it's because Nvidia doesn't want to be bothered with external standards and prefers to have its own list of requirements.
You are not so vocal when it comes to OpenGL and DirectX, maybe because one GPU vendor runs OpenGL better than the other. It's right that CUDA is not open source, but as I understand it, it's royalty-free, and the only reason programs written for it can't run on AMD cards is that AMD did not want to come off its high horse and develop a CUDA driver for its cards; or maybe their software engineers could not do it, who knows? By the way, CUDA is portable to anywhere: if you like, you could tomorrow make a toaster that utilizes CUDA for its work. And the good point is that it's royalty-free, not like something such as DirectX that relies on bloatware to run.
Also, where is this info about DirectX being bloatware? The only bloat about it is that it requires Windows...
Users of Linux or Mac are not that welcome in the world of DirectX.
By the way, did ATI really need an offer? Isn't everyone free to develop their own CUDA hardware and software implementation?
No. CUDA does not allow that.
At most, you could compare it to Mac OS X, since it only supports whatever hardware Apple chooses to include in its computers at a given time. Regardless of how well regarded it is from a developer's point of view, it's just one more way for nVidia to try to sell more hardware with an exclusive computing API.
You have to pay, and get Nvidia's approval, to use CUDA in a commercial product. Hell, look what a tightarse they've been with hardware-accelerated PhysX, which runs on CUDA.
The only thing people are complaining about is the fact that CUDA is "locked" to NVIDIA cards only, which I wholeheartedly agree with. Personally, it's the only reason I have a GTX 460 768MB alongside my CrossFire setup.
What everyone is failing to understand is that optimization already exists in NVIDIA's implementation of OpenCL (they have 100% OpenCL 1.1 compatibility, just as AMD does); it's just that CUDA sees more use because of its wide array of functions and support (e.g. optimizations, direct video memory usage, static code analysis, etc.). Again, usage of CUDA is free, just like using *nix. A lot of open-source (and commercial) developers would not be using it if it wasn't.