NVIDIA Dramatically Simplifies Parallel Programming With CUDA 6

Cristian_25H · Nov 14, 2013

NVIDIA today announced NVIDIA CUDA 6, the latest version of the world's most pervasive parallel computing platform and programming model.

The CUDA 6 platform makes parallel programming easier than ever, enabling software developers to dramatically decrease the time and effort required to accelerate their scientific, engineering, enterprise and other applications with GPUs.

It offers new performance enhancements that enable developers to instantly accelerate applications up to 8X by simply replacing existing CPU-based libraries. Key features of CUDA 6 include:

Unified Memory -- Simplifies programming by enabling applications to access CPU and GPU memory without the need to manually copy data from one to the other, and makes it easier to add support for GPU acceleration in a wide range of programming languages.
Drop-in Libraries -- Automatically accelerates applications' BLAS and FFTW calculations by up to 8X by simply replacing the existing CPU libraries with the GPU-accelerated equivalents.
Multi-GPU Scaling -- Re-designed BLAS and FFT GPU libraries automatically scale performance across up to eight GPUs in a single node, delivering over nine teraflops of double precision performance per node, and supporting larger workloads than ever before (up to 512 GB). Multi-GPU scaling can also be used with the new BLAS drop-in library.
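
As a rough illustration of the Unified Memory item above, here is a minimal CUDA sketch. It assumes the `cudaMallocManaged` runtime call that shipped with CUDA 6 and a GPU that supports managed memory. The kernel `scale` and the helper `managed_scaled_sum` are hypothetical names made up for this example; they are not part of the announcement.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Multiplies every element by `factor`; one thread per element.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Fills a managed buffer on the CPU, scales it on the GPU, then sums it on
// the CPU again. No explicit cudaMemcpy appears anywhere.
// Returns -1.0f if a CUDA call fails (hypothetical error convention).
float managed_scaled_sum(int n, float factor) {
    float *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return -1.0f;

    // The CPU writes through the same pointer the GPU will use.
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(data, n, factor);

    // Wait for the kernel before the CPU touches the data again.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(data);
        return -1.0f;
    }

    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += data[i];

    cudaFree(data);
    return sum;
}

int main() {
    printf("sum = %.1f\n", managed_scaled_sum(1024, 2.0f));
    return 0;
}
```

On hardware of the CUDA 6 era the host should not touch managed data while a kernel is running, hence the `cudaDeviceSynchronize()` call before the CPU reads the results.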

"By automatically handling data management, Unified Memory enables us to quickly prototype kernels running on the GPU and reduces code complexity, cutting development time by up to 50 percent," said Rob Hoekstra, manager of Scalable Algorithms Department at Sandia National Laboratories. "Having this capability will be very useful as we determine future programming model choices and port more sophisticated, larger codes to GPUs."

"Our technologies have helped major studios, game developers and animators create visually stunning 3D animations and effects," said Paul Doyle, CEO at Fabric Engine, Inc. "They have been urging us to add support for acceleration on NVIDIA GPUs, but memory management proved too difficult a challenge when dealing with the complex use cases in production. With Unified Memory, this is handled automatically, allowing the Fabric compiler to target NVIDIA GPUs and enabling our customers to run their applications up to 10X faster."

In addition to the new features, the CUDA 6 platform offers a full suite of programming tools, GPU-accelerated math libraries, documentation and programming guides.

Version 6 of the CUDA Toolkit is expected to be available in early 2014. Members of the CUDA-GPU Computing Registered Developer Program will be notified when it is available for download. To join the program, register here.

For more information about the CUDA 6 platform, visit NVIDIA booth 613 at SC13, Nov. 18-21 in Denver, and the NVIDIA CUDA website.


Jorge · Nov 14, 2013

Unfortunately this is a desperation move as AMD's HuMA/HSA and APUs change the PC langscape for good and forever. CUDA is going to disappear in a few years as it's not the best solution.

RCoon · Nov 14, 2013

Jorge said:
...change the PC langscape for good and...

Making up words for the glory of AMD I see

shinkueagle · Nov 14, 2013

RCoon said:
Making up words for the glory of AMD I see

So I see you're stalking Jorge from the previous article about the AMD 12GB card... Hyuck hyuck hyuck... :laugh:

omnimodis78 · Nov 14, 2013

Jorge said:
...for good and forever...

What seems like "good and forever" at the moment can sometimes be lame and forgotten in no time. How you assume that NVIDIA is desperate with something that has been out now for 7 years is a bit creepy, sorry.

Frick · Nov 14, 2013

shinkueagle said:
So I see you're stalking Jorge from the previous article about the AMD 12GB card... Hyuck hyuck hyuck...

That would be like stalking a streaking skyscraper.

Recus · Nov 14, 2013

Jorge said:
Unfortunately this is a desperation move as AMD's HuMA/HSA and APUs change the PC langscape for good and forever. CUDA is going to disappear in a few years as it's not the best solution.

CUDA has already shown its worth in supercomputing. AMD showed PR slides and won? :rolleyes:

Sihastru · Nov 14, 2013

Jorge said:
Unfortunately this is a desperation move as AMD's HuMA/HSA and APUs change the PC langscape for good and forever. CUDA is going to disappear in a few years as it's not the best solution.

So basically CUDA 6 comes closer to hUMA/HSA than what AMD only dreams of in PowerPoint slides, and you call this a desperate move? Do you even know what hUMA/HSA stand for? If you're going to be a fanboy, be a better fanboy. Do some research before posting.

And while we're on the subject of stalking, Jorge my boy... the new AMD driver you praise so much in every other post made the cards scream even louder than before. They brought the cards closer in performance, though still with 10-15% variance, simply by making the fan spin faster...

Cheeseball · Nov 14, 2013

Jorge said:
Unfortunately this is a desperation move as AMD's HuMA/HSA and APUs change the PC langscape for good and forever. CUDA is going to disappear in a few years as it's not the best solution.

More like AMD playing catch up to NVIDIA's CUDA platform.

shinkueagle · Nov 14, 2013

Frick said:
That would be like stalking a streaking skyscraper.

Yeah I know what you mean... I'm just new here but a wise man told me what a troll Jorge is...

That troll has grown to such high heights because we are feeding it... Let's just ignore it and soon it will die... JORGE R.I.P....

15th Warlock · Nov 14, 2013

Don't feed the trolls. This guy in particular almost always posts degrading comments in all Intel and NVIDIA related threads, and then disappears without any facts to back his claims :shadedshu

BTW, CUDA has supported a unified pool of memory since release 4.0, if I remember correctly (and yes, it was released in 2011, way before hUMA was even mentioned by AMD). All version 6 does is remove the burden on programmers of managing access to this unified pool of memory when writing CUDA code. It also eliminates some of the overhead caused by writing from system memory to GPU memory, but not all of it.
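
To make that burden concrete, the pre-CUDA 6 style looks roughly like the sketch below: separate host and device buffers, with an explicit `cudaMemcpy` in each direction around the kernel launch. This is only an illustration using standard CUDA runtime calls; the kernel `add_one` and the helper `explicit_copy_round_trip` are hypothetical names invented for the example.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Adds 1.0f to every element; one thread per element.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Copies a zero-filled host array to the GPU, increments it there, and
// copies it back. Returns how many elements equal 1.0f afterwards,
// or -1 if a CUDA call fails (hypothetical error convention).
int explicit_copy_round_trip(int n) {
    std::vector<float> host(n, 0.0f);
    size_t bytes = n * sizeof(float);

    // Separate device allocation: the programmer owns two buffers.
    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return -1;

    // Manual copy, host to device.
    if (cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return -1;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add_one<<<blocks, threads>>>(dev, n);

    // Manual copy, device to host. This call waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return -1;

    int ok = 0;
    for (float v : host) {
        if (v == 1.0f) ++ok;
    }
    return ok;
}

int main() {
    printf("elements updated: %d\n", explicit_copy_round_trip(1000));
    return 0;
}
```

With managed memory the two `cudaMemcpy` calls and the second buffer go away, although the data still has to move between system memory and GPU memory behind the scenes.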

As for CUDA going the way of the dodo, I think Amazon, the NCSA and Oak Ridge National Laboratory, among many others, would beg to differ...

Rebel333 · Nov 14, 2013

Woooááá Nvidia :laugh: Forget it guys, it is nowhere to Mantle.

Death Star · Nov 14, 2013

OpenCL > CUDA.

Fluffmeister · Nov 14, 2013

Death Star said:
OpenCL > CUDA.

Well that's conclusive then.

nVidia can only dream to have someone with your expertise onboard. :respect:

Death Star · Nov 15, 2013

Fluffmeister said:
Well that's conclusive then.

nVidia can only dream to have someone with your expertise onboard.

Damn straight!

ensabrenoir · Nov 15, 2013

Rebel333 said:
Woooááá Nvidia. Forget it guys, it is nowhere to Mantle.

......did i miss something.....never mind.... it would be like screaming in outer space.......

nem · Nov 15, 2013

NVIDIA Roadmap

1. Tesla=CUDA
2. Fermi=FP64
3. Kepler=Dynamic Parallelism
4. Maxwell=Unified Virtual Memory
5. Volta=Stacked DRAM

This Unified Virtual Memory does look like hUMA, or does someone understand how Unified Virtual Memory actually works? :wtf:

And on another topic, Fermi was the FP64 phase, but in Kepler it was all removed, so was all that work a failure or what?

Serpent of Darkness · Nov 15, 2013

Re:

Jorge said:
Unfortunately this is a desperation move as AMD's HuMA/HSA and APUs change the PC langscape for good and forever. CUDA is going to disappear in a few years as it's not the best solution.

From a non-trolling, mature, civil point of view, I have to agree with you. NVidia's moves over the past month have been nothing but desperation. GTX 780 Ti is proof of that; G-Sync is another example. After the R9-290X toppled their former, over-priced king, the GTX Titan, NVidia looks like they went into panic mode. All the consumers QQing about competition not being around and not driving prices down smacked NVidia on its butt, because they didn't take what has been happening with AMD seriously. They scrapped Titan Ultra and Lite. They are currently researching stacked RAM, aka Volta, something AMD has already done in the past and refined... Not feeling confident about AMD Mantle, even though NVidia users can utilize it.

I have to agree that it seems like a copy of hUMA, but hUMA is for APUs. An example would be the Intel Haswell and Haswell-E SoCs. To say NVidia is copying it would require NVidia to produce APUs like AMD does for that statement to be more valid. Intel hasn't copied hUMA, and they produce APUs of their own but don't call them APUs like AMD does. For the most part, they are the same thing... NVidia GPUs utilizing system memory besides dedicated GPU RAM doesn't seem innovative, which is something NVidia has been synonymous with for a while. Now, if the rumors are true and NVidia eventually ventures into the server market besides continuing its push into the tablet/cellphone market, they will eventually have a copy-cat of hUMA. NVidia will be in competition with Intel again, besides AMD, in that market. This is another desperate NVidia move to produce more revenue...

Right now, G-Sync has Tegra4 chips on it, mainly to help NVidia liquidate their leftover inventory, since SHIELD and tablets containing those chips aren't selling like hot cakes. I suspect their Q4 revenue reports will start to show signs of decline... I strongly feel that GTX 780 Ti or GTX 780 Titan isn't selling well either, given 7% more CUDA cores, marginally improved core frequencies, and D3D11.2 / DirectCompute full support for another whopping $699.99 for a single unit. Especially when GTX Titan is still at $1000.00 and, according to third-party reviews, GTX 780 Ti in SLI has a tendency to drop frames on certain titles. Brain-dead NVidia fanboys won't admit it, but the kick to their privates after purchasing GTX Titan and 780--I bet it hurts. Pride + Stupidity = epic fails...

AMD right now reigns supreme in multi-GPU solutions and cost efficiency. CrossFireX through the PCIe bus seems to have fixed AMD's issues with multi-GPU computing. Since AMD won the console wars against NVidia (it sucks that NVidia doesn't produce APUs of their own), AMD in a way has the ability to call the shots on upcoming console games for the next 10 years. Star Citizen, a highly anticipated MMO space shooter, will be optimized for AMD GPUs with AMD Mantle supporting it... I suspect EQN will be optimized for AMD as well, besides the idea that they will be using MS Havoc. Elder Scrolls Online might be another title that's optimized for AMD GPUs, if the rumors I heard about it are true...

Sihastru · Nov 15, 2013

What planet are you guys from? Seriously...

SIGSEGV · Nov 15, 2013

Sihastru said:
What planet are you guys from? Seriously...

Earth

Cheeseball · Nov 15, 2013

Death Star said:
OpenCL > CUDA.

This would make sense if CUDA wasn't portable to OpenCL, but then again, it is. If anything, CUDA = OpenCL + OpenCL "Extensions", just like how OpenGL extensions work.

The only problem is that AMD cards work better with float4 which is not the best case in many GPGPU applications.

TheHunter · Nov 15, 2013

Yeah, it's already there at driver level since r331.

chinmi · Nov 15, 2013

Serpent of Darkness said:
Brain-dead NVidia fanboys won't admit it, but the kick to their privates after purchasing GTX Titan and 780--I bet it hurts. Pride + Stupidity = epic fails...

they will admit nothing !!! :banghead:

for them all hail :respect: NVIDIA

radrok · Nov 15, 2013

Many people don't realize how deeply CUDA is getting rooted in professional software. I suggest you take a look at the CUDA Developer Zone before crapping in threads and humiliating yourselves by showing how clueless you are.

CUDA can be used and shown; AMD's implementations are just on paper, so I don't get how people can draw conclusions lol.

NeoXF · Nov 15, 2013

radrok said:
AMDs implementations are just on paper so I don't get how people can draw conclusions lol.

http://sites.amd.com/us/business/promo/Pages/machete-movie-premiere.aspx
