First, Vulkan works inherently better on AMD's Graphics Core Next GPU architecture because it's been largely derived from Mantle.
This is just PR BS from AMD.
First, there is no evidence supporting the claim that it works inherently better on AMD hardware. In fact, the only "evidence" is games specifically targeting AMD hardware which have later been ported.
Secondly, Vulkan is not based on Mantle. As you can read in the specs, Vulkan is built on SPIR-V. SPIR-V is the compiler infrastructure and intermediate representation for shaders which is the basis for both OpenCL (2.1) and Vulkan. The features of Vulkan are built on top of this, and this architecture has nothing in common with either Mantle or Direct3D*. What Vulkan has inherited from Mantle is not the underlying architecture, but some aspects of the front end. To claim that one piece of software is based on another because it implements similar features is obviously gibberish, just like no one claims that Chrome is based on IE for implementing similar features. Any coder will understand this.
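To make this concrete, here's a minimal sketch of how a shader actually enters Vulkan: as a buffer of SPIR-V words handed to vkCreateShaderModule, not as Mantle- or HLSL-style source. (Assumes a valid VkDevice and an .spv file produced by a tool like glslangValidator; the helper names are just illustrative.)

```cpp
// Minimal sketch: Vulkan consumes shaders as SPIR-V words, never as vendor-
// specific source. Assumes a valid VkDevice and a compiled .spv file;
// helper names are illustrative, not part of any real engine.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <vector>

// Read a compiled .spv file into 32-bit SPIR-V words.
std::vector<uint32_t> loadSpirv(const char* path) {
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) throw std::runtime_error("cannot open SPIR-V file");
    const size_t bytes = static_cast<size_t>(file.tellg());
    std::vector<uint32_t> words(bytes / sizeof(uint32_t));
    file.seekg(0);
    file.read(reinterpret_cast<char*>(words.data()), static_cast<std::streamsize>(bytes));
    return words;
}

// Hand the SPIR-V binary to the driver; this is the only shader input Vulkan takes.
VkShaderModule createShaderModule(VkDevice device, const std::vector<uint32_t>& spirv) {
    VkShaderModuleCreateInfo info{};
    info.sType    = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
    info.codeSize = spirv.size() * sizeof(uint32_t);  // size in bytes
    info.pCode    = spirv.data();                     // SPIR-V words, not GLSL/HLSL text
    VkShaderModule module = VK_NULL_HANDLE;
    if (vkCreateShaderModule(device, &info, nullptr, &module) != VK_SUCCESS)
        throw std::runtime_error("vkCreateShaderModule failed");
    return module;
}
```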
AMD has no real advantage on Vulkan compared to its rivals. Nvidia was in fact the first vendor to demonstrate a working Vulkan driver, and the first to release one (on both PC and Android). AMD was the last to get certification, and had to write a driver from scratch like everyone else.
*) In fact, the next Shader Model of Direct3D will adopt a similar architecture. I would expect that you knew this, since you actually covered it on this news site.
AMD has already "opened" up much of its GPU IP to game developers through its GPUOpen initiative. Here, developers will find detailed technical resources on how to take advantage of not just AMD-specific GPU IP, but also some industry standards. Vulkan is among the richly differentiated resources AMD is giving away through the initiative.
Nvidia has been doing the same for more than a decade. Contrary to popular belief, most of GameWorks is actually open, and it's the most extensive collection of examples, tutorials and best practices for graphics development.
Do not believe everything a PR spokesman says.
A lot will also depend on NVIDIA, which holds about 70% in PC discrete GPU market share, to support the API. Over-customizing Vulkan would send it the way of OpenGL. Too many vendor-specific extensions to keep up drove game developers to Direct3D in the first place.
Nvidia is already offering excellent Vulkan support on all platforms.
Extensions have never been a problem for OpenGL; the problem has been the slow standardization process.
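For the record, this is roughly how a renderer probes for an extension and falls back when it's missing; keeping up with this is bookkeeping, not a design flaw. (Sketch only; assumes a current GL 3.0+ context and that a loader such as GLAD has already been initialized.)

```cpp
// Sketch of a typical extension probe (assumes a current GL 3.0+ context and a
// loader such as GLAD already initialized; context creation omitted).
#include <glad/glad.h>
#include <cstring>

bool hasExtension(const char* name) {
    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; ++i) {
        const char* ext = reinterpret_cast<const char*>(
            glGetStringi(GL_EXTENSIONS, static_cast<GLuint>(i)));
        if (ext && std::strcmp(ext, name) == 0)
            return true;  // vendor/ARB path available
    }
    return false;         // take the core-profile fallback instead
}

// Usage: prefer the extension path when present, otherwise fall back.
// const bool bindless = hasExtension("GL_ARB_bindless_texture");
```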
-----
OpenGL was always problematic on Linux, for example. Even now with their new, open source driver, OpenGL performance is still poor.
Nvidia has offered superb OpenGL support on Linux for more than a decade, but they've been the only one. You are talking about the "hippie" drivers; nobody who cares about stability, features or performance uses those. The new "open" drivers are based on Gallium, which is a generic GPU abstraction layer, so just forget about optimized support for any API on those.
-----
What Nvidia has done is ignore the new APIs until they become an issue for their customers.
Despite the planned roadmap for Volta in 2017, which will probably scale with DX12 and Vulkan, they released an unscheduled "new" architecture in Pascal, which is really Maxwell 3.0 and doesn't improve with these APIs.
Nvidia's philosophy is simply to sell their customers a whole new architecture when the deficiencies become too problematic, making the previous generation obsolete in a very short time.
But as long as their loyal fans slavishly buy their products on command, they will continue to be short-sighted about building their hardware for upcoming technical developments.
I don't know which fantasy world you live in, but since AMD released their last major architecture, Nvidia has released Maxwell and Pascal.
Pascal was introduced because Nvidia was unable to complete Volta by 2016; it brings forward some of Volta's features. This was done primarily for the compute-oriented customers (Tesla).
There is no major disadvantage with Nvidia's architectures vs. GCN in terms of modern APIs.
-----
Meanwhile, with even the latest and greatest AAA title shipping as DX11 and with DX12 support patched in, there is no rush to buy a DX12 card right now. Given the programmed visuals on DX11 are the same as DX12 in Deus Ex MD, why would anyone need to move to DX12?
Yes, the good Direct3D 12 titles will come in a while, perhaps early next year. It always takes 2-3 years before the "good" games arrive.
Does anyone remember Crysis?
Yes, an 8+ TFlop card matching another 8+ TFlop card..... Fury X should perform this well. This is the whole point of everything I post. It's the most over-specced and (in DX11) under-performing card. Nvidia cards do what they are meant to do in their given time frame. Fury X needs DX12 and Vulkan to work, but those APIs aren't yet the normal scene. By the time DX12 and/or Vulkan is the norm and DX11 is long forgotten, we will be on what? Navi and Volta?
The under-performance of the GCN cards has nothing to do with the APIs.
We all know Nvidia's architectures are much more advanced, and one of their advantages is more flexible compute cores and a very powerful scheduler. AMD has a simpler approach: more, but simpler, cores and a simple scheduler. When you compare the GTX 980 Ti to the Fury X you'll see that Nvidia is able to saturate its GPU while the Fury is more than 1/3 unutilized. So AMD typically has ~50% more resources for comparable performance.

But are there workloads which benefit from AMD's simpler brute-force approach? Yes, of course. A number of compute workloads actually perform very well on GCN, namely workloads which are more of a stream of independent data. AMD clearly has more computational power, so when their GPUs are saturated they can perform very well. The problem is that rendering typically has a lot of internal dependencies. E.g. resources (textures, meshes) are reused several times in a single frame, and if 5 cores request the same data they have to wait in turn. That's why scheduling is essential to saturate a GPU during rendering.

I would actually draw a parallel with AMD Bulldozer vs. Intel Sandy Bridge and newer: AMD clearly has more computational power in competing products, but is only able to utilize it in certain (uncommon) workloads. AMD is finally bringing the necessary improvements with Zen, and they need to do a similar thing with GCN.
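A rough back-of-the-envelope calculation shows where the "~50% more resources" figure comes from. (The shader counts and clocks are the commonly quoted reference specs, so treat the exact numbers as assumptions rather than measurements.)

```cpp
// Back-of-the-envelope check of the "~50% more resources" claim, using the
// commonly cited reference specs: peak FP32 = cores * 2 FLOPs per FMA * clock.
#include <cstdio>

int main() {
    auto peakTflops = [](double cores, double clockGHz) {
        return cores * 2.0 * clockGHz / 1000.0;  // result in TFLOPS
    };

    const double furyX    = peakTflops(4096.0, 1.050);  // Fury X: 4096 shaders @ ~1050 MHz -> ~8.6 TFLOPS
    const double gtx980Ti = peakTflops(2816.0, 1.000);  // 980 Ti: 2816 cores  @ ~1000 MHz -> ~5.6 TFLOPS

    std::printf("Fury X : %.1f TFLOPS\n", furyX);
    std::printf("980 Ti : %.1f TFLOPS\n", gtx980Ti);
    std::printf("Fury X has %.0f%% more theoretical throughput\n",
                (furyX / gtx980Ti - 1.0) * 100.0);  // ~53% on paper, yet similar real-world performance
    return 0;
}
```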
In addition, Nvidia does a number of smart things in its rendering implementation. E.g. Maxwell and Pascal rasterize and process fragments in tiles, while AMD processes in screen space. This allows Nvidia to use less memory bandwidth and keep the important data in L2 cache, ensuring the GPU stays saturated. With AMD, on the other hand, the data has to travel back and forth between GPU memory and L2 cache, causing bottlenecks and cache misses. For those who are not familiar with programming GPUs: fragment shading easily takes up 60-80% or more of rendering time, so a bottleneck here makes a huge impact. This is one of the primary reasons why Nvidia can perform better with much lower memory bandwidth.
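If you want a CPU-side analogy for why tiling helps (an analogy for the locality principle only, not how the GPU hardware literally works), compare a naive matrix transpose with a blocked one:

```cpp
// CPU-side analogy only: blocked ("tiled") transpose keeps both the source and
// destination tiles resident in cache, while the naive version misses on nearly
// every strided write. N and TILE are arbitrary illustration values.
#include <cstddef>
#include <vector>

constexpr std::size_t N    = 4096;
constexpr std::size_t TILE = 64;   // chosen so two TILE x TILE blocks fit in cache

// Naive transpose: dst is written with stride N, so the working set never stays in cache.
void transposeNaive(const std::vector<float>& src, std::vector<float>& dst) {
    for (std::size_t y = 0; y < N; ++y)
        for (std::size_t x = 0; x < N; ++x)
            dst[x * N + y] = src[y * N + x];
}

// Tiled transpose: each TILE x TILE block is finished while its data is still hot,
// the same locality idea as processing fragments in tiles instead of screen space.
void transposeTiled(const std::vector<float>& src, std::vector<float>& dst) {
    for (std::size_t ty = 0; ty < N; ty += TILE)
        for (std::size_t tx = 0; tx < N; tx += TILE)
            for (std::size_t y = ty; y < ty + TILE; ++y)
                for (std::size_t x = tx; x < tx + TILE; ++x)
                    dst[x * N + y] = src[y * N + x];
}
```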
We also know Nvidia has a much more powerful tessellation engine, etc.
-----
At this moment, people are using exposed parts by DX12 to better optimize for AMD because frankly, there's a lot of optimizing to do compared to their DX11 renderer. There is some valid argument that async compute IS better supported on AMD's side, but it's not a valid argument for the way you are using it as NVIDIA also supports several things AMD doesn't:
More games are optimized for AMD this time around because of the major gaming consoles.
Async compute is fully supported by Nvidia, but the advantage depends on there being unutilized GPU resources. In many cases games try to utilize the same resources for the two queues, and since Nvidia is already better at saturating its GPUs, it will see "smaller improvements".
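For reference, this is roughly how an engine asks Vulkan for a dedicated compute queue family so compute submissions can overlap graphics; whether that overlap buys anything depends on how much of the GPU the graphics queue leaves idle. (Sketch only; assumes a valid VkPhysicalDevice, error handling trimmed.)

```cpp
// Sketch: find a queue family that supports compute but not graphics, so compute
// work can be scheduled alongside the graphics queue. Assumes a valid
// VkPhysicalDevice; error handling trimmed for brevity.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <optional>
#include <vector>

std::optional<uint32_t> findDedicatedComputeFamily(VkPhysicalDevice gpu) {
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
    std::vector<VkQueueFamilyProperties> families(count);
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, families.data());

    for (uint32_t i = 0; i < count; ++i) {
        const VkQueueFlags flags = families[i].queueFlags;
        // Compute-capable but not the graphics family: the driver is free to
        // schedule this queue alongside graphics work ("async compute").
        if ((flags & VK_QUEUE_COMPUTE_BIT) && !(flags & VK_QUEUE_GRAPHICS_BIT))
            return i;
    }
    // No dedicated family: submit compute on the graphics queue instead; any
    // overlap then depends entirely on what the scheduler can squeeze in.
    return std::nullopt;
}
```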