AMD Announces Exciting DirectX 12 Game Engine Developer Partnerships

Yeah, about async compute ... it's super easy on GCN because it is perfectly happy to accept compute commands in the 3D queue. There is no penalty for mixing draw calls and compute commands in the 3D queue.
With Maxwell you have performance penalties from using compute commands concurrently with draw calls, so compute queues are mostly used to offload and execute compute commands in batch.
Essentially, if you want to use async compute efficiently on NVIDIA, you gotta cleanly separate the render pipeline into batches and even consider including CUDA.dll to fully use high priority jobs and independent scheduling (with GK110 and later, CUDA bypasses the graphics command processor and is handled by a dedicated function unit in hardware which runs uncoupled from the regular compute or graphics engine; it even supports multiple asynchronous queues in hardware). It's a complete mess, and it's all detailed here: http://ext3h.makegames.de/DX12_Compute.html
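
To make the queue distinction concrete, here's a minimal D3D12 sketch (mine, not from the linked article): one direct queue that takes both draws and dispatches, and one dedicated compute queue for the async path. The helper name and the bare `device` parameter are just for illustration; error handling is omitted.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Hypothetical helper: create the two queue types a DX12 renderer juggles.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& directQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    // Direct (3D) queue: accepts draw *and* dispatch commands.
    // GCN is happy to have compute interleaved right here.
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    // Dedicated compute queue: work submitted here may overlap the 3D queue
    // ("async compute"). On Maxwell this is where you'd batch compute to keep
    // it cleanly separated from the draw calls.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
}
```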

Nice read. Sort of.

It's possible for a dev to work with CUDA to make async work, then. That would require Nvidia to sponsor titles and help with coding for CUDA to prioritise the batches to suit the hardware. The article said that would mean the worst case for AMD but good gains for Nvidia, as the CUDA route lets the hardware handle async batches better. Vice versa is the hardware-only solution GCN was designed for, which is the worst case for Nvidia.

So, AMD can sponsor titles and Nvidia lose out or Nvidia can sponsor titles and AMD can lose out.

No change then! :rolleyes:
 
To use CUDA is to disable the 3D pipeline. NVIDIA cards require switching between the two. That doesn't go over well in games because the 3D pipeline is far more important.

GCN doesn't care if it is compute or 3D, it queues it into the same pipeline.

NVIDIA needs to fix it otherwise games just won't use it.
 
Getting worked up about perf differences with current gen cards in DX12 mode:

1. Pointless
2. Predictable
3. Rather naive

I have underlined this every single time since Maxwell's release and the PR about current gen being DX12-ready: NONE of the current gen cards are really ready, and buying into them expecting otherwise is pretty short sighted. We all knew big changes were close, and here they are. New arch/next gen GPUs will be ready for it, and current gen is already bottlenecked in many other ways, most notably CPU load and VRAM.

Stop worrying over nothing, because it only underlines how uneducated your purchase has been, and it displays a lack of insight into the way the industry works.

Next gen selling points are going to be bigger leaps in DX12 performance; that's how they get us to buy new product.
 
Straight console ports will benefit AMD. Nvidia would have to sponsor titles to null that out (DX12 back to DX11) and implement its own beneficial code, like in the new Tomb Raider.
 
Straight console ports will benefit AMD. Nvidia would have to sponsor titles to null that out (DX12 back to DX11) and implement its own beneficial code, like in the new Tomb Raider.

Straight console ports are abominations. Frame rate locked, key bindings screwed up, and PC graphics settings not there.
 
My system runs all my games at max possible settings at all times; it's this specific shit that always fucks everything up. And I'm not even blaming AMD here. They've done DX12 right, it's NVIDIA that was lazy. But I love the Deus Ex franchise, that's why I'm worrying.

Then again, Deus Ex Human Revolution looked amazing so if I get that level of graphics I'm fine with it anyway. So yeah, chilling...
Deus Ex: Human Revolution was an AMD Gaming Evolved game too and it ran perfectly fine on Nvidia... actually it later ran much better than on AMD.
I have no doubt it will be the same now.

As for this list of games... nothing but a yawn fest. Only the new Deus Ex looks interesting; the rest aren't worth mentioning.
 
Wake me up when devs switch to Vulkan and drop DX completely.
DirectX 12 performs better than Vulkan according to the first tests, BUT... Vulkan works on most available platforms, and for that reason alone DirectX should be ditched for good!
 
To use CUDA is to disable the 3D pipeline. NVIDIA cards require switching between the two. That doesn't go over well in games because the 3D pipeline is far more important.
It is doable, Just Cause 2 had a water simulation done in pure CUDA.
Quoted from the article:
Ask your personal Nvidia engineer for how to share GPU side buffers between DX12 and CUDA.
NVIDIA needs to fix it otherwise games just won't use it.
Agreed. After all, Just Cause 2 is the only game I can think of that used CUDA.
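
For what it's worth, on the "ask your personal Nvidia engineer how to share GPU side buffers between DX12 and CUDA" point: later CUDA toolkits (10.x and up, well after this thread) do expose a public external-memory interop for exactly that. A minimal sketch, assuming sharedHandle comes from ID3D12Device::CreateSharedHandle on a committed buffer resource; the helper name is made up and error handling is mostly omitted.

```cpp
#include <cuda_runtime.h>
#include <windows.h>

// Hypothetical helper: map a shared D3D12 buffer so a CUDA kernel can use it.
void* MapD3D12BufferIntoCuda(HANDLE sharedHandle, size_t sizeInBytes)
{
    // Import the D3D12 resource as a CUDA external memory object.
    cudaExternalMemoryHandleDesc memDesc = {};
    memDesc.type = cudaExternalMemoryHandleTypeD3D12Resource;
    memDesc.handle.win32.handle = sharedHandle;
    memDesc.size = sizeInBytes;
    memDesc.flags = cudaExternalMemoryDedicated;

    cudaExternalMemory_t extMem = nullptr;
    if (cudaImportExternalMemory(&extMem, &memDesc) != cudaSuccess)
        return nullptr;

    // Get a device pointer CUDA kernels can read/write. DX12 sees the same
    // memory, so cross-API synchronization (fences) is still on the app.
    cudaExternalMemoryBufferDesc bufDesc = {};
    bufDesc.offset = 0;
    bufDesc.size = sizeInBytes;

    void* devPtr = nullptr;
    cudaExternalMemoryGetMappedBuffer(&devPtr, extMem, &bufDesc);
    return devPtr;
}
```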
 
Great, async compute for Deus Ex. Which means it'll run like shit on GTX 900 cards. Thanks NVIDIA for your "complete" DX12 support.
I don't think you understand how AC works.
For example if it wasn't used at all in AoS, the results for nvidia would be EXACTLY THE SAME, not better. It simply helps the performance on GCN cards. It doesn't make cards that are not able to do it perform worse.
 
It is doable, Just Cause 2 had a water simulation done in pure CUDA.

Agreed. After all, Just Cause 2 is the only game I can think of that used CUDA.

if by "pure CUDA" you mean game devs leverage CUDA directly (no middle-ware) - add these:
CUDA/GPU Transcode used in Rage & Wolfenstein (maybe available to other ID Tech 5 titles by ini tweak)
Nascar '14 use CUDA to accelerate in-house particle effects.

however other ~45 games are using CUDA for advanced PhysX acceleration. a middle-ware designed by the creators of the architecture itself is "pure" enough for me.
 
I don't think you understand how AC works.
For example if it wasn't used at all in AoS, the results for nvidia would be EXACTLY THE SAME, not better. It simply helps the performance on GCN cards. It doesn't make cards that are not able to do it perform worse.
Async on causes NVIDIA cards to lose performance across the board: http://www.anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta/6

Developers need to enable async compute on GCN cards and disable it on NVIDIA cards to get the best framerate for their respective platforms.
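
If a developer really did want to toggle the path per vendor like that, the usual way is a DXGI adapter check. A rough sketch under my own assumptions (the return values here are just the policy this post argues for, not anyone's shipping logic):

```cpp
#include <dxgi.h>

// Hypothetical helper: decide whether to route work to the compute queue.
bool ShouldUseAsyncCompute(IDXGIAdapter1* adapter)
{
    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);

    switch (desc.VendorId) {          // standard PCI vendor IDs
    case 0x1002: return true;         // AMD: GCN gains from overlapping compute
    case 0x10DE: return false;        // NVIDIA (Maxwell era): keep it off
    default:     return false;        // unknown/integrated: play it safe
    }
}
```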
 
It's the same situation as with tessellation. If you use it to gain performance, you get gains on better hardware and no penalty on "unsupported" hardware. But when you use a feature to cram more "details" into a game, that simply isn't true anymore. And the same goes for async compute. It's not there solely to boost identical graphics quality on all graphics cards; I bet they'll use it to cram more details into the game thanks to those gains, and in that case performance just won't be the same.

What I'm saying is that they won't be using async compute to achieve insane framerates across the board, they'll use it to add more detail and sacrifice performance with it. It has always been like this. Instead of making the current game accessible to more players, they want to make it more appealing to the rich crowd with beefed-up PCs. And then they wonder why sales aren't up there. Lol...
 
if by "pure CUDA" you mean game devs leverage CUDA directly (no middle-ware) - add these:
CUDA/GPU Transcode used in Rage & Wolfenstein (maybe available to other ID Tech 5 titles by ini tweak)
Nascar '14 use CUDA to accelerate in-house particle effects.

however other ~45 games are using CUDA for advanced PhysX acceleration. a middle-ware designed by the creators of the architecture itself is "pure" enough for me.
I meant "pure" CUDA in a sense that one part of rendering pipeline (e.g. simulated water geometry in JC2) that would normally be done via geometry shader is done via shared side buffers using CUDA only.
Incidentally that code in JC2 was written by nvidia themselves and probably most of the cuda specific code from your examples and half of the code from 45 instances of middleware integration. The other half came from SDK examples also written by them. So, yes, it's also "pure" that way too. Untainted by all the different middleware modifications or independent CUDA+DirectX engine implementations throughout the world.
 
Afaik Nvidia cards lost some performance there because they tried to emulate async compute (and we see how that went).
They did not. Async shaders are part of the DirectX 12 API. AMD cards handle async workloads asynchronously while NVIDIA cards handle async workloads synchronously; hence a performance boost for the former and a performance hit for the latter. If those 2-4% of frames are important, the only solution on NVIDIA is to not use async shaders at all.
 
NVIDIA cards handle async workloads synchronously ... the only solution on NVIDIA is to not use async shaders at all.
That's way too oversimplified. Both architectures benefit from async shaders, and they will be used simply because it's in DirectX. The difference is that GCN is more efficient when 3D and compute commands are mixed, which is the worst case for NVIDIA. The NVIDIA arch likes them separated (in the lowest number of batches possible), which is the worst case for AMD.
All serious engines will commit the workload for a single frame differently on different GPU architectures to achieve peak efficiency.
GCN is great because it allows a simpler/more flexible approach where you can just keep brute-force saturating the command queues. With NVIDIA you currently gotta treat async task invokes like draw calls, managing and batching them because of the time overhead from the context switch. And it's not only a time overhead: the synchronicity here is a side effect of a context switch that may have to wait on longer-running async tasks to finish.
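
A rough sketch of that "treat async invokes like draw calls and batch them" idea in D3D12 terms: one big compute submission per frame, with a GPU-side fence so the 3D queue only waits once. The function and parameter names are mine, and the compute command list is assumed to be recorded and closed already.

```cpp
#include <d3d12.h>

// Hypothetical per-frame submit: one batched compute list instead of many
// small ones, so architectures that pay for context switches pay only once.
void SubmitBatchedAsyncCompute(ID3D12CommandQueue* computeQueue,
                               ID3D12CommandQueue* directQueue,
                               ID3D12GraphicsCommandList* computeList, // already Close()d
                               ID3D12Fence* fence,
                               UINT64& fenceValue)
{
    // One big submission for the whole frame's async compute work.
    ID3D12CommandList* lists[] = { computeList };
    computeQueue->ExecuteCommandLists(1, lists);

    // Signal when the whole compute batch is done...
    computeQueue->Signal(fence, ++fenceValue);

    // ...and make the 3D queue wait on it (GPU-side wait, the CPU never
    // blocks) before any later draws that consume the compute results.
    directQueue->Wait(fence, fenceValue);
}
```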
 