AMD Announces Exciting DirectX 12 Game Engine Developer Partnerships

the54thvoid

Super Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
13,197 (2.39/day)
Location
Glasgow - home of formal profanity
Processor Ryzen 7800X3D
Motherboard MSI MAG Mortar B650 (wifi)
Cooling be quiet! Dark Rock Pro 4
Memory 32GB Kingston Fury
Video Card(s) Gainward RTX4070ti
Storage Seagate FireCuda 530 M.2 1TB / Samsung 960 Pro M.2 512GB
Display(s) LG 32" 165Hz 1440p GSYNC
Case Asus Prime AP201
Audio Device(s) On Board
Power Supply be quiet! Pure Power M12 850w Gold (ATX3.0)
Software W10
Yeah, about async compute ... it's super easy on GCN because it is perfectly happy to accept compute commands in the 3D queue. There is no penalty for mixing draw calls and compute commands in the 3D queue.
With Maxwell you have performance penalties from using compute commands concurrently with draw calls, so compute queues are mostly used to offload and execute compute commands in batch.
Essentially if you want to use async compute efficiently on nvidia, you gotta cleanly separate the render pipeline into batches and even consider including CUDA.dll to fully use high priority jobs and independent scheduling (with GK110 and later, CUDA bypasses the graphics command processor and is handled by a dedicated function unit in hardware which runs uncoupled from the regular compute or graphics engine. It even supports multiple asynchronous queues in hardware). It's a complete mess, and all detailed here: http://ext3h.makegames.de/DX12_Compute.html

Nice read. Sort of.

It's possible for a dev to work with CUDA to make async work, then. That would require Nvidia to sponsor titles and help with coding for CUDA to prioritise the batches to suit the hardware. The article said that would be the worst case for AMD but give good gains for Nvidia, as the CUDA route lets the hardware handle async batches better. The reverse is the hardware-only approach that AMD designed GCN for, which is the worst case for Nvidia.

So, AMD can sponsor titles and Nvidia loses out, or Nvidia can sponsor titles and AMD loses out.

No change then! :rolleyes:
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,263 (4.42/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
To use CUDA is to disable the 3D pipeline. NVIDIA cards require switching between the two. That doesn't go over well in games because the 3D pipeline is far more important.

GCN doesn't care if it is compute or 3D, it queues it into the same pipeline.

NVIDIA needs to fix it otherwise games just won't use it.
 
Joined
Sep 17, 2014
Messages
22,959 (6.07/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
Getting worked up about perf differences with current gen cards in DX12 mode:

1. Pointless
2. Predictable
3. Rather naive

I have underlined this every single time since the release of Maxwell and the PR about current gen being DX12 ready: NONE of the current gen cards are really ready, and buying into them while expecting otherwise is pretty short-sighted. We all knew big changes were close and here they are. New arch/next gen GPUs will be ready for it, and current gen is already bottlenecked in many other ways, most notably CPU load and VRAM.

Stop worrying for nothing because it only underlines how uneducated your purchase has been. And it displays a lack of insight in the way the industry works.

Next gen selling points are going to be bigger leaps in DX12 performance; that's how they get us to buy new product.
 
Joined
Apr 30, 2012
Messages
3,881 (0.83/day)
Straight console ports will benefit AMD. Nvidia would have to sponsor to null (DX12 to DX11) and implement its beneficial code like the new Tomb Raider.
 

the54thvoid

Super Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
13,197 (2.39/day)
Location
Glasgow - home of formal profanity
Straight console ports will benefit AMD. Nvidia would have to sponsor to null (DX12 to DX11) and implement its beneficial code like the new Tomb Raider.

Straight console ports are abominations. Frame locked, key bindings screwed and PC graphics settings not there.
 
Joined
Apr 10, 2012
Messages
1,400 (0.30/day)
Location
78°55' N, 11°56' E
System Name -aLiEn beaTs-
Processor Intel i7 11700kf @ 5.055Ghz
Motherboard MSI Z490 Unify
Cooling Corsair H115i Pro RGB
Memory G.skill Royal Silver 4400 cl17 @ 4403mhz
Video Card(s) Zotac GTX 980TI AMP!Omega Factory OC 1418MHz
Storage Intel SSD 330, Crucial SSD MX300 & MX500
Display(s) Samsung C24FG73 144HZ
Case CoolerMaster HAF 932 USB3.0
Audio Device(s) X-Fi Titanium HD @ 2.1 Bose acoustimass 5
Power Supply CoolerMaster 850W v2 gold atx 2.52
Mouse Razer viper 8k
Keyboard Logitech G19s
Software Windows 11 Pro 21h2 64Bit
Benchmark Scores ► ♪♫♪♩♬♫♪♭
My system runs all my games at max possible settings at all times, it's this specific shit that always fucks up everything. And I'm not even blaming AMD here. They've done DX12 right, it's NVIDIA that was lazy. But I love the Deus Ex franchise, that's why I'm worrying.

Then again, Deus Ex Human Revolution looked amazing so if I get that level of graphics I'm fine with it anyway. So yeah, chilling...
Deus Ex: Human Revolution was an AMD Evolved game too and it ran perfectly fine on Nvidia.. actually it later ran much better than on AMD.
I have no doubt it will be the same now.

As for this games list.. nothing but a yawn fest.. only this new Deus Ex looks interesting, the rest isn't worth mentioning.
 
Joined
May 4, 2012
Messages
985 (0.21/day)
Location
Ireland
Wake me up when devs switch to Vulkan and drop DX completely.
DirectX 12 performs better than Vulkan according to the first tests, BUT... Vulkan works on most available platforms, and for that reason alone DirectX should be ditched for good!
 
Joined
Feb 8, 2012
Messages
3,014 (0.64/day)
Location
Zagreb, Croatia
System Name Windows 10 64-bit Core i7 6700
Processor Intel Core i7 6700
Motherboard Asus Z170M-PLUS
Cooling Corsair AIO
Memory 2 x 8 GB Kingston DDR4 2666
Video Card(s) Gigabyte NVIDIA GeForce GTX 1060 6GB
Storage Western Digital Caviar Blue 1 TB, Seagate Baracuda 1 TB
Display(s) Dell P2414H
Case Corsair Carbide Air 540
Audio Device(s) Realtek HD Audio
Power Supply Corsair TX v2 650W
Mouse Steelseries Sensei
Keyboard CM Storm Quickfire Pro, Cherry MX Reds
Software MS Windows 10 Pro 64-bit
To use CUDA is to disable the 3D pipeline. NVIDIA cards require switching between the two. That doesn't go over well in games because the 3D pipeline is far more important.
It is doable, Just Cause 2 had a water simulation done in pure CUDA.
Quoted from the article:
Ask your personal Nvidia engineer for how to share GPU side buffers between DX12 and CUDA.
NVIDIA needs to fix it otherwise games just won't use it.
Agreed. After all, Just Cause 2 is the only game I can think of that used CUDA.
 
Joined
Dec 22, 2014
Messages
101 (0.03/day)
Processor i5-4670k
Motherboard MSI Z87-GD65
Cooling Magicool G2 Slim 240+360, Watercool Heatkiller 3.0, Alphacool GPX 290 M05
Memory G.Skill RipjawsX DDR3 2x4GB 2133MHz CL9
Video Card(s) Gigabyte Radeon R9 290
Storage Samsung 850 EVO 250GB, 320GB+1TB HDDs
Case SilentiumPC Aquarius X90 Pure Black
Power Supply Chieftec Navitas GPM-1000C
Great, async compute for Deus Ex. Which means it'll run like shit on GTX 900 cards. Thanks NVIDIA for your "complete" DX12 support.
I don't think you understand how AC works.
For example if it wasn't used at all in AoS, the results for nvidia would be EXACTLY THE SAME, not better. It simply helps the performance on GCN cards. It doesn't make cards that are not able to do it perform worse.
 
Joined
Apr 25, 2012
Messages
3 (0.00/day)
Location
Israel
Processor Intel® Core™ i5-2500K
Motherboard GIGABYTE Z68XP-UD3-iSSD
Cooling Scythe Mugen 3
Memory G.Skill RipjawsX 2x4GB DDR3 1600mhz 7-8-8-24
Video Card(s) Galaxy Geforce GTX 680
Storage Western Digital VelociRaptor WD3000GLFS 300GB Accelerated via SRT w/ Intel SSD 311 Series 20GB mSATA
Display(s) LG W2363D Full HD 120Hz 3D
Case Gigabyte 3D Aurora 570 + 2x Scythe S-FLEX
Audio Device(s) Creative X-Fi XtremeMusic
Power Supply SeaSonic S12II-620w
Software Microsoft Windows 7 64-bit
It is doable, Just Cause 2 had a water simulation done in pure CUDA.

Agreed. After all, Just Cause 2 is the only game I can think of that used CUDA.

If by "pure CUDA" you mean game devs leveraging CUDA directly (no middleware), add these:
CUDA/GPU Transcode, used in Rage & Wolfenstein (maybe available to other id Tech 5 titles via an ini tweak)
Nascar '14 uses CUDA to accelerate in-house particle effects.

However, some ~45 other games use CUDA for advanced PhysX acceleration. Middleware designed by the creators of the architecture itself is "pure" enough for me.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,263 (4.42/day)
Location
IA, USA
I don't think you understand how AC works.
For example if it wasn't used at all in AoS, the results for nvidia would be EXACTLY THE SAME, not better. It simply helps the performance on GCN cards. It doesn't make cards that are not able to do it perform worse.
Async on causes NVIDIA cards to lose performance across the board: http://www.anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta/6


Developers need to enable async compute on GCN cards and disable it on NVIDIA cards to get the best framerate for their respective platforms.
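
A rough sketch of what such a per-vendor switch could look like, assuming the engine can read the adapter's PCI vendor ID. The function name and the override parameter are made up for illustration; only the two vendor IDs are real values, and the default policy is simply the one argued for in this post.

```python
from typing import Optional

AMD_VENDOR_ID = 0x1002     # PCI vendor ID assigned to AMD/ATI
NVIDIA_VENDOR_ID = 0x10DE  # PCI vendor ID assigned to NVIDIA


def use_async_compute(vendor_id: int, user_override: Optional[bool] = None) -> bool:
    """Decide whether a (hypothetical) renderer submits to a compute queue.

    An explicit user or developer override always wins. Otherwise the
    default follows the post above: on for AMD, off for everyone else.
    """
    if user_override is not None:
        return user_override
    return vendor_id == AMD_VENDOR_ID


# Default policy per vendor, then a forced override for testing.
amd_default = use_async_compute(AMD_VENDOR_ID)
nvidia_default = use_async_compute(NVIDIA_VENDOR_ID)
nvidia_forced = use_async_compute(NVIDIA_VENDOR_ID, user_override=True)
```

The override is there because a blanket per-vendor default is only a starting point; profiling on the actual card should have the last word.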
 
Joined
Oct 2, 2004
Messages
13,791 (1.86/day)
It's the same situation as with tessellation. If you use it to gain performance, you get gains on better hardware and no penalty on "unsupported" hardware. But when you use a feature to cram more "details" into a game, that simply isn't true anymore. The same goes for async compute. It's not there solely to boost identical graphics quality on all graphics cards; I bet they'll use it to cram more details into the game thanks to those gains, and in that case performance just won't be the same.

What I'm saying is that they won't be using async compute to achieve insane framerates across the board; they'll use it to add more details and sacrifice performance for it. It has always been like this. Instead of making the current game more accessible to more players, they want to make it more appealing to the rich crowd with beefed-up PCs. And then they wonder why sales aren't up there. Lol...
 
Joined
Dec 22, 2014
Messages
101 (0.03/day)
Joined
Feb 8, 2012
Messages
3,014 (0.64/day)
Location
Zagreb, Croatia
If by "pure CUDA" you mean game devs leveraging CUDA directly (no middleware), add these:
CUDA/GPU Transcode, used in Rage & Wolfenstein (maybe available to other id Tech 5 titles via an ini tweak)
Nascar '14 uses CUDA to accelerate in-house particle effects.

However, some ~45 other games use CUDA for advanced PhysX acceleration. Middleware designed by the creators of the architecture itself is "pure" enough for me.
I meant "pure" CUDA in the sense that one part of the rendering pipeline (e.g. simulated water geometry in JC2) that would normally be done via a geometry shader is done via shared side buffers using CUDA only.
Incidentally, that code in JC2 was written by Nvidia themselves, as was probably most of the CUDA-specific code from your examples and half of the code from the 45 instances of middleware integration. The other half came from SDK examples, also written by them. So yes, it's "pure" that way too: untainted by all the different middleware modifications or independent CUDA+DirectX engine implementations throughout the world.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,263 (4.42/day)
Location
IA, USA
Afaik Nvidia cards lost some performance there because they tried to emulate AC (and we see how that went).
They did not. Async shaders are part of the DirectX 12 API. AMD cards handle async workloads asynchronously, whereas NVIDIA cards handle async workloads synchronously; hence, a performance boost for the former and a performance hit for the latter. If those 2-4% frames are important, the only solution on NVIDIA is to not use async shaders at all.
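
A toy model of that claim, for anyone who wants to see the arithmetic. The millisecond figures and the switch cost are invented, the function is hypothetical, and perfect overlap is an idealisation; real hardware shares resources between queues.

```python
def frame_time_ms(graphics_ms, compute_ms, overlapped, switch_ms=0.0):
    """Toy frame cost for one graphics pass plus one compute pass.

    overlapped=True  models hardware that runs both at once (ideal case).
    overlapped=False models hardware that runs them back to back and pays
    a context-switch cost between them.
    """
    if overlapped:
        return max(graphics_ms, compute_ms)
    return graphics_ms + compute_ms + switch_ms


# Invented numbers, for illustration only.
GRAPHICS_MS, COMPUTE_MS, SWITCH_MS = 10.0, 2.0, 0.3

# Async off: compute simply runs in line with the graphics work.
no_async = frame_time_ms(GRAPHICS_MS, COMPUTE_MS, overlapped=False)
# Async on, hardware overlaps the two workloads.
async_overlapped = frame_time_ms(GRAPHICS_MS, COMPUTE_MS, overlapped=True)
# Async on, hardware serializes them and pays for the switch.
async_serialized = frame_time_ms(GRAPHICS_MS, COMPUTE_MS, overlapped=False,
                                 switch_ms=SWITCH_MS)
```

In this model the overlapped case comes out ahead of async off, and the serialized case comes out slightly behind it, which is the boost-versus-hit pattern described above.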
 
Joined
Feb 8, 2012
Messages
3,014 (0.64/day)
Location
Zagreb, Croatia
NVIDIA cards handle async workloads synchronously ... the only solution on NVIDIA is to not use async shaders at all.
That's way too oversimplified. Both architectures benefit from async shaders, and they will be used simply because they're in DirectX. The difference is that GCN is more efficient when 3D and compute commands are mixed, which is the worst case for Nvidia. Nvidia's arch likes them separated (in the lowest number of batches possible), which is the worst case for AMD.
All serious engines will commit the workload for a single frame differently on different GPU architectures to achieve peak efficiency.
GCN is great because it allows a simpler, more flexible approach where you can just keep brute-force saturating the command queues. With Nvidia currently you gotta treat async task invokes like draw calls, managing and batching them because of the time overhead from context switches. And it's not only time overhead: the synchronicity here is a side effect of the context switch, which may have to wait for longer-running async tasks to finish.
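
To put the batching point in concrete terms, here is a minimal sketch that counts how often a command stream flips between draw and compute work, before and after grouping. The function names and the command list are hypothetical, and the reordering assumes the tasks have no dependencies on each other, which a real frame would have to verify.

```python
from itertools import groupby


def count_switches(commands):
    """Count transitions between command kinds in submission order."""
    runs = [kind for kind, _ in groupby(commands)]
    return max(len(runs) - 1, 0)


def batch_by_kind(commands):
    """Group all draws first, then all compute, keeping relative order.

    Assumption: no cross dependencies between the tasks, so reordering
    them is safe.
    """
    return sorted(commands, key=lambda kind: kind != "draw")


# A mixed stream of the sort GCN is said to be happy with.
mixed = ["draw", "compute", "draw", "compute", "draw"]
batched = batch_by_kind(mixed)

mixed_switches = count_switches(mixed)
batched_switches = count_switches(batched)
```

Fewer transitions means fewer places where a context switch can stall behind a long-running task, which is the whole argument for batching on hardware that pays for each switch.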
 