
AMD Launches The "Boltzmann Initiative," Brings NVIDIA CUDA to FirePro

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,427 (7.51/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Building on its strategic investments in heterogeneous system architecture (HSA), AMD (NASDAQ: AMD) announced a suite of tools designed to ease development of high-performance, energy efficient heterogeneous computing systems. The "Boltzmann Initiative" leverages HSA's ability to harness both central processing units (CPU) and AMD FirePro graphics processing units (GPU) for maximum compute efficiency through software. The first results of the initiative are featured this week at SC15 and include the Heterogeneous Compute Compiler (HCC); a headless Linux driver and HSA runtime infrastructure for cluster-class, High Performance Computing (HPC); and the Heterogeneous-compute Interface for Portability (HIP) tool for porting CUDA-based applications to a common C++ programming model. The tools are designed to drive application performance across markets ranging from machine learning to molecular dynamics, and from oil and gas to visual effects and computer-generated imaging.

"AMD's Heterogeneous-compute Interface for Portability enables performance portability for the HPC community. The ability to take code that was written for one architecture and transfer it to another architecture without a negative impact on performance is extremely powerful," said Jim Belak, co-lead of the U.S. Department of Energy's Exascale Co-design Center in Extreme Materials and senior computational materials scientist at Lawrence Livermore National Laboratory. "The work AMD is doing to produce a high-performance compiler that sits below high-level programming models enables researchers to concentrate on solving problems and publishing groundbreaking research rather than worrying about hardware-specific optimizations."

New Compiler for Heterogeneous Computing
The promise of combining multi-core, serial-processing CPUs with parallel-processing GPUs to maximize compute efficiency is already being seen in the industry, driven by the Heterogeneous System Architecture (HSA) Foundation, which counts AMD as a founding member. One of the goals for HSA is easing the development of parallel applications through the use of higher-level languages. The new AMD "Boltzmann Initiative" suite includes an HCC compiler for C++ development, greatly expanding the field of programmers who can leverage HSA. The new HCC C++ compiler is a key tool in enabling developers to easily and efficiently apply the hardware resources in heterogeneous systems. The compiler simplifies development via single-source execution, with both the CPU and GPU code in the same file. The compiler automates the placement of code that executes on both processing elements for maximum execution efficiency.

"Just as our customers are excited about our hardware innovation, including the introduction of the first GPU with High Bandwidth Memory this year and our new x86 core architecture coming next year, our innovations in software development are equally as important to them," said Mark Papermaster, senior vice president and chief technology officer, AMD. "The challenge has always been to unlock the hardware's capabilities and make them easily accessible to developers working to solve difficult problems. AMD's newest offering provides the keys to more readily access our parallel computing engines -- both multicore CPUs and GPUs -- and to make these benefits available to the mainstream of developers across a broad spectrum of computing platforms, from embedded to supercomputing."

Linux Driver and Runtime Focused on the Needs of HPC Cluster-Class Computing
To complement the new compilation tools, AMD has developed a new HPC-focused driver and system runtime. This new headless Linux driver brings key capabilities to address core high-performance computing needs, including low latency compute dispatch and PCIe data transfers; peer-to-peer GPU support; Remote Direct Memory Access (RDMA) from InfiniBand that interconnects directly to GPU memory; and Large Single Memory Allocation support.

HIP-ifying CUDA Applications to Run on AMD GPUs
To bring applications written for CUDA onto AMD platforms, AMD announces the new HIP tool. AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++ by HIP with the final 10 percent converted manually in the widely popular C++ language. This greatly expands the installed hardware base available to run what were formerly exclusively CUDA-based applications. At SC15, AMD is demonstrating the potential for HIP, running the CUDA-generated Rodinia benchmark suite on AMD GPUs.
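The release does not describe how HIP performs the conversion. As a rough illustration only, the mechanical part of such a port can be pictured as renaming CUDA runtime calls to their HIP counterparts and flagging whatever is left for a human. The sketch below is a toy, not the HIP tool: the HIP names in the table follow HIP's published runtime API, but the table itself, the helper names `port_line` and `port_source`, and the sample snippet are all hypothetical and cover only a handful of calls.

```python
import re

# Illustrative subset of CUDA runtime names and their HIP counterparts.
# This table is an assumption made for the sketch; a real port covers
# far more of the API.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first, so cudaMemcpyHostToDevice is tried before cudaMemcpy.
_NAMES = sorted(CUDA_TO_HIP, key=len, reverse=True)
PATTERN = re.compile(r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\b")

# Identifiers that still look CUDA-specific after renaming (hypothetical rule).
LEFTOVER = re.compile(r"\bcu(?:da|blas|fft)\w*")


def port_line(line):
    """Rename known CUDA names in one line; return (new_line, leftovers)."""
    converted = PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], line)
    return converted, LEFTOVER.findall(converted)


def port_source(source):
    """Convert a whole snippet; return (text, [(line_number, name), ...])."""
    out_lines, manual = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        converted, leftovers = port_line(line)
        out_lines.append(converted)
        manual.extend((number, name) for name in leftovers)
    return "\n".join(out_lines), manual


# Hypothetical CUDA host code used only to exercise the sketch.
SAMPLE = """#include <cuda_runtime.h>
float *d_x;
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cublasSaxpy(handle, n, &alpha, d_x, 1, d_y, 1);
cudaFree(d_x);"""

ported, manual = port_source(SAMPLE)
print(ported)
print("needs manual attention:", manual)
```

In this toy run the runtime calls are renamed automatically, while the library call on line 5 is not in the table and is reported for manual follow-up, which is the kind of split between automatic and hand conversion the release describes.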

Availability
An early access program for the "Boltzmann Initiative" tools is planned for Q1 2016.

 
Joined
Feb 23, 2015
Messages
186 (0.05/day)
System Name $580 Ebay Deal Abomination
Processor Intel Xeon W3680 (3.33GHz 6-core)
Motherboard HP Z400
Cooling 92mm exhaust, 92mm intake
Memory 12GB ECC DDR3 (triple channel)
Video Card(s) EVGA GTX 760
Storage 2x Crucial M500 240GB (RAID 0)
Display(s) BenQ XL2720Z
Case HP Z400
Audio Device(s) X-Fi Titanium HD
Power Supply HP Z400 (475W)
Mouse Razer DeathAdder
Keyboard Logitech Illuminated
More of a problem of slow adoption of AMD standards, I think. If OpenCL had Nvidia's stamp it would be another story.

Instead of trying to make a sinking ship float AMD is at least putting a workaround for CUDA in place.
 
Joined
Oct 30, 2012
Messages
187 (0.04/day)
Processor Intel® Celeron® Processor G1101
Motherboard Supermicro® MBD-C7SIM-Q-B
Memory 8 GB Silicon Power SP004GBLTU133N02/W02
Video Card(s) Sapphire FirePro™ 2270 + AMD Radeon™ HD 8740
Storage 1000 GB Toshiba P300 HDWD110UZSVA
Display(s) 29" LG 29UM57-P
Case Chieftec LBX-02B-U3
Power Supply 650W XFX XXX Edition (P1-650X-XXB9)
Software Windows Server 2016
Apparently it's not if we still need proprietary crap...
I'd pick "proprietary crap" over free stuff any day. The tooling is just so much ahead, you wouldn't want to go back to OSS products even if somebody was pointing a gun at your head. At the end of the day, it's all about what makes you as a developer productive, not whether you follow Stallman's ideology or not.

There's C++ AMP though, which can be compiled on Linux (LLVM), so you're right about something: there is an alternative that works on, uhm, everything, yet the adoption is extremely low. Almost non-existent. And the reason for that is the fact that you could never achieve REAL performance when targeting multiple families of GPUs. The amount of work that profilers & related tools (like NVIDIA CUPTI) do for you is immeasurable. There are some design-time analysis services available, too, that will help with re-writing some of your code based on estimations performed by CUDA engineers (lower the size of this, get rid of that, you know).

Anyway. This is amazing news, although it would be great if they provided some information in regards to just how common it is to see FirePro S being used in servers. They never really talked about that, which is weird... I can get S10000 at a reasonable price here, and it's ready-to-use even (3 fans pre-installed).
 
Joined
Aug 20, 2007
Messages
21,633 (3.40/day)
Location
Olympia, WA
System Name Pioneer
Processor Ryzen 9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon, Phanteks and Corsair Maglev blower fans...
Memory 64GB (2x 32GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 5800X Optane 800GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
There would be no need for all this if OpenCL was up to the task. Apparently it's not if we still need proprietary crap...

My understanding is this directly converts the code into corresponding C++ (OpenCL?) code. So it IS up to the task, apparently.

It's not really a performance issue, it's an adoption issue. NVIDIA's CUDA has existed much longer, and thus has much much larger market penetration, than OpenCL.
 
Joined
Sep 6, 2013
Messages
3,466 (0.83/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 7600 / Ryzen 5 4600G / Ryzen 5 5500
Motherboard X670E Gaming Plus WiFi / MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2)
Cooling Aigo ICE 400SE / Segotep T4 / Νoctua U12S
Memory Kingston FURY Beast 32GB DDR5 6000 / 16GB JUHOR / 32GB G.Skill RIPJAWS 3600 + Aegis 3200
Video Card(s) ASRock RX 6600 / Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes / NVMes, SATA Storage / NVMe, SATA, external storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) / 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
There would be no need for all this if OpenCL was up to the task. Apparently it's not if we still need proprietary crap...

Based on the Anandtech article, what I understood was that OpenCL 2.1 is very close to being up to the task. But with Nvidia staying at OpenCL 1.2, no one cares if OpenCL 2.1 is good enough or if OpenCL 3-4-10 will be phenomenal.
 
Joined
Oct 30, 2008
Messages
1,768 (0.30/day)
System Name Lailalo
Processor Ryzen 9 5900X Boosts to 4.95Ghz
Motherboard Asus TUF Gaming X570-Plus (WIFI)
Cooling Noctua
Memory 32GB DDR4 3200 Corsair Vengeance
Video Card(s) XFX 7900XT 20GB
Storage Samsung 970 Pro Plus 1TB, Crucial 1TB MX500 SSD, Segate 3TB
Display(s) LG Ultrawide 29in @ 2560x1080
Case Coolermaster Storm Sniper
Power Supply XPG 1000W
Mouse G602
Keyboard G510s
Software Windows 10 Pro / Windows 10 Home
This is great news. Means competition is coming to a market nVidia has had a monopoly on. If anyone can buy an AMD card to run CUDA for less than a NV card, it's gonna make a dent.

I'm surprised nVidia hasn't unleashed their lawyers to try and stop this.
 
Joined
Aug 20, 2007
Messages
21,633 (3.40/day)
Location
Olympia, WA
System Name Pioneer
Processor Ryzen 9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon, Phanteks and Corsair Maglev blower fans...
Memory 64GB (2x 32GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 5800X Optane 800GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
This is great news. Means competition is coming to a market nVidia has had a monopoly on. If anyone can buy an AMD card to run CUDA for less than a NV card, it's gonna make a dent.

I'm surprised nVidia hasn't unleashed their lawyers to try and stop this.

It's not that simple. From how it reads, all applications will have to be compiled against AMD libraries, meaning you need source code access and time / work to make it happen.
 
Joined
Sep 1, 2015
Messages
152 (0.04/day)
"The work AMD is doing to produce a high-performance compiler that sits below high-level programming models enables researchers to concentrate on solving problems and publishing groundbreaking research rather than worrying about hardware-specific optimizations."
Does he really believe what he says? He should say: "Sorry scientists, we failed to offer you proper tools to write your applications, we failed to offer you any technical support, and we couldn't offer programming specialists to help you write code. These greedy bastards at nVidia offer all that for free or for little money just to make you buy their hardware, and because CUDA is open and expanding in every scientific and engineering field, we decided to port CUDA to our hardware, even if that means you need to do some coding work and a lot of debugging. So please buy some of our hardware, it's cheaper."
 
Joined
Apr 18, 2015
Messages
234 (0.07/day)
"AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++"

And who will do the rest of 10%?
From my experience, fixing bad code, especially when it amounts to 10% of the total, sometimes takes as much time as rewriting it from scratch, if not more.

" because CUDA is open and expanding in every scientific and engineering "

Not so sure CUDA is open. Otherwise they would implement support directly without the need to reconvert code.
 
Joined
Sep 1, 2015
Messages
152 (0.04/day)
"AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++"

And who will do the rest of 10%?
From my experience, fixing bad code, especially when it amounts to 10% of the total, sometimes takes as much time as rewriting it from scratch, if not more.
They don't have people to help the way nVidia does. In other words: we gave you our legendary sword, now go kill the dragon; we will sit here and watch you.

Not so sure CUDA is open. Otherwise they would implement support directly without the need to reconvert code.
nVidia says so
 
Joined
Jul 9, 2015
Messages
3,413 (0.98/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
There would be no need for all this if OpenCL was up to the task. Apparently it's not if we still need proprietary crap...

The DVORAK keyboard layout is vastly superior to QWERTY, but the world is stuck on the latter. How does that fit into your theory?
 
Joined
Dec 16, 2014
Messages
421 (0.11/day)
DVORAK keyboard layout is vastly superior to QWERTY, but world is stuck on the latter, how does it fit in your theory?
We do not use better layouts for the same reason that most countries have different keyboard layouts and that we still speak so many different languages, when the time spent learning other languages could be used for better things. People just do not want to get used to something new that would benefit them for the rest of their lives and, most importantly, future generations. But to answer your question:
There would be no need for all this if OpenCL was up to the task. Apparently it's not if we still need proprietary crap...
Proprietary software and capabilities that this software offers can be controlled and I think this is something NVIDIA wants!
You have to go through NVIDIA if you want something and why would NVIDIA invest into development of OpenCL if everyone else can use it also?
OpenCL vs. CUDA is the same as Free-Sync vs. G-Sync.
 
Joined
Oct 2, 2004
Messages
13,791 (1.86/day)
How do you define a keyboard layout as "better"? It doesn't matter what setup it has, it's how well you adapt to it. And the only thing that affects that is practice (or simply using it a lot). I can type blind without resting my fingers on the keyboard, meaning I can hit the right letters without keeping my fingers in what people would call the preferred touch-typing position. And I can still type faster than most people, with probably the same number of errors or even fewer. I'm so used to the QWERTZ layout I use that I can hit keys by knowing where they should be, without any reference points other than my mind. At this point, DVORAK would be absolutely inefficient for me...
 
Joined
Jul 9, 2015
Messages
3,413 (0.98/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
The same reason that we do not use better layouts is that most of the countries have different keyboard layout.
So, US doesn't use DVORAK because Russians use ЙЦУКЕН (which IS "dvorak" in a way, as you mostly use your strongest fingers to type)
Mind boggling.

Let me elaborate, as you seem to have missed the point: something that most people use is not necessarily the best thing. The same applies to proprietary CUDA vs standard OpenCL.

How do you define keyboard layout as "better"? It doesn't matter what setup it has, it's how well you adapt to it.
You might want to check what was the goal behind QWERTY.
 
Joined
Oct 30, 2012
Messages
187 (0.04/day)
Processor Intel® Celeron® Processor G1101
Motherboard Supermicro® MBD-C7SIM-Q-B
Memory 8 GB Silicon Power SP004GBLTU133N02/W02
Video Card(s) Sapphire FirePro™ 2270 + AMD Radeon™ HD 8740
Storage 1000 GB Toshiba P300 HDWD110UZSVA
Display(s) 29" LG 29UM57-P
Case Chieftec LBX-02B-U3
Power Supply 650W XFX XXX Edition (P1-650X-XXB9)
Software Windows Server 2016
because Russians use ЙЦУКЕН
...Which is exactly the same as QWERTY:
https://upload.wikimedia.org/wikipedia/commons/a/a5/KB_Eng-Rus_QWERTY(ЙЦУКЕН).svg

The same applies to proprietary CUDA vs standard OpenCL
It is true that CUDA is all about GPGPU (which hypothetically allows for a direct comparison of the two), yet OpenCL makes for a small speck when laid on top of CUDA. CUDA is an ecosystem of various tools, libraries and practices that enable you to do anything that people have been doing for years with both CPU and GPU, ranging from accelerated conversion of media content to fluid dynamics. It is supercomputer-first, again, just like OpenCL, but while CUDA gives you everything pre-baked (performance analysis, thousands of libraries, language bindings), the latter still only resembles a specification and nothing more.

Just my opinion: if everything was perfect with OpenCL, we wouldn't have had FireStream SDK, CUDA and (later) C++ AMP in the first place. Again, think about what enables developers to build stuff fast and with adequate maintainability, not what would be "great for free-minded people". I too would love it if Khronos provided an IDE-like experience for their solution, on par with tooling, a professional community hub and reference guides, but as of right now they just keep making stuff like WebCL that dies w/o seeing a single implementation at all. Vulkan's destiny is bright (and it'd definitely have compute shaders), but it only exists on paper right now, and I need a compiler. A book. A place where I could talk to their engineers. And lots, lots of other things. It's a "big" kind of programming, nothing like your generic "create object A, call method B". If you don't approach it like a crazy scientist, what's it worth?
 
Joined
Jul 9, 2015
Messages
3,413 (0.98/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.

What on PLANET EARTH did you expect, a new, second layer of keys? =)))
Or how would that change if you showed the Russian layout over the DVORAK layout?

I'm a touch typist on both layouts; when typing in Russian I rarely use the ring finger and the pinky even less frequently.
That's exactly what DVORAK does vs QWERTY.

But, apparently, the reasons why one is used over the other go well beyond "which one is better".
 