Friday, March 22nd 2019
AMD Ryzen 3000 "Zen 2" BIOS Analysis Reveals New Options for Overclocking & Tweaking
AMD will launch its 3rd generation Ryzen 3000 Socket AM4 desktop processors in 2019, with a product unveiling expected mid-year, likely on the sidelines of Computex 2019. AMD is keeping its promise of making these chips backwards compatible with existing Socket AM4 motherboards. To that effect, motherboard vendors such as ASUS and MSI began rolling out BIOS updates with AGESA-Combo 0.0.7.x microcode, which adds initial support for the platform to run and validate engineering samples of the upcoming "Zen 2" chips.
At CES 2019, AMD unveiled more technical details and a prototype of a 3rd generation Ryzen Socket AM4 processor. The company confirmed that it will implement a multi-chip module (MCM) design even for its mainstream-desktop processor, in which it will use one or two 7 nm "Zen 2" CPU core chiplets, which talk to a 14 nm I/O controller die over Infinity Fabric. The two biggest components of the I/O die are the PCI-Express root complex and the all-important dual-channel DDR4 memory controller. We bring you never-before-reported details of this memory controller.
AMD has two big reasons to take the MCM route for even its mainstream desktop platform. The first is that it lets the company mix and match silicon production technologies. AMD bean-counters reckon that it's more economical to build on a shrunk 7 nanometer production process only those components that can benefit from the shrink, namely the CPU cores. Other components like the memory controller can continue to be built on existing 14 nm technologies, which by now are highly mature (= cost-efficient). AMD is also competing with other companies for its share of 7 nanometer allocation at TSMC.
The 14 nm I/O controller die could, in theory, be sourced from GlobalFoundries to honor the wafer-supply agreement. The second big reason is the economics of downscaling. AMD is expected to increase CPU core counts beyond 8, and cramming 12-16 cores on a single 7 nm slab would make carving out cheaper SKUs by disabling cores costly, because AMD isn't always harvesting dies with faulty cores. These mid-range SKUs sell in higher volumes, and beyond a point AMD is forced to disable perfectly functional cores. It makes more sense to build 8-core or 6-core chiplets, and on SKUs with 8 cores or fewer, physically deploy only one chiplet. This way AMD is maximizing its utilization of precious 7 nm wafers.
The downside of this approach is that the memory controller is no longer physically integrated with the processor cores. The 3rd generation Ryzen processor (and all other Zen 2 CPUs) hence has an "integrated-discrete" memory controller. The memory controller is physically located inside the processor package, but is not on the same piece of silicon as the CPU cores. AMD isn't the first to come up with such a contraption. Intel's 1st generation Core "Clarkdale" processor took a similar route, with CPU cores on a 32 nm die, and the memory controller plus an integrated GPU on a separate 45 nm die.
Intel used its Quick Path Interconnect (QPI), which was cutting-edge at the time. AMD is tapping into Infinity Fabric, its latest high-bandwidth scalable interconnect that's heavily implemented on "Zen" and "Vega" product lines. We have learned that with "Matisse," AMD will be introducing a new version of Infinity Fabric that offers twice the bandwidth compared to the first generation, or up to 100 GB/s. AMD needs this because a single I/O controller die must now interface with up to two 8-core CPU dies, and up to 64 cores in their "EPYC" server line SKU.
Our resident Ryzen Memory Guru Yuri "1usmus" Bubliy took a really close look at one of these BIOS updates with AGESA 0.0.7.x and found several new controls and options that will be exclusive to "Matisse," and possibly the next-generation Ryzen Threadripper processors. AMD has changed the CBS section title from "Zen Common Options" to "Valhalla Common Options." We have seen this codename on the web quite a bit over the past few days, associated with "Zen 2." We have learned that "Valhalla" could be the codename of the platform consisting of a 3rd generation Ryzen "Matisse" AM4 processor and its companion AMD 500-series chipset based motherboard, specifically the successor to X470 which is being developed in-house by AMD as opposed to sourcing from ASMedia.
When doing serious memory overclocking, the Infinity Fabric sometimes can't handle the increased memory speed. Remember, Infinity Fabric runs at a frequency synchronized to memory. For example, with DDR4-3200 memory (which runs at a 1600 MHz clock), Infinity Fabric will operate at 1600 MHz. This is the default on Zen, Zen+ and also Zen 2. Unlike earlier generations, the new BIOS offers UCLK options for "Auto", "UCLK==MEMCLK" and "UCLK==MEMCLK/2". The last option is new and will come in handy when overclocking your memory: it helps achieve stability, but at the cost of some Infinity Fabric bandwidth.
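The clock relationship described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `fabric_clocks` and its return format are hypothetical, the mode strings are taken from the BIOS option labels, and "Auto" is left out because the firmware's choice in that mode is not known.

```python
def fabric_clocks(ddr_rating: int, uclk_mode: str = "UCLK==MEMCLK") -> dict:
    """Return MEMCLK and UCLK in MHz for a DDR4 speed rating (hypothetical helper).

    ddr_rating is the marketing number, e.g. 3200 for DDR4-3200.
    DDR memory transfers data twice per clock, so MEMCLK is half the rating.
    """
    memclk = ddr_rating / 2
    if uclk_mode == "UCLK==MEMCLK":
        uclk = memclk          # synchronized 1:1 with the memory clock
    elif uclk_mode == "UCLK==MEMCLK/2":
        uclk = memclk / 2      # new divider option: half the memory clock
    else:
        # "Auto" is decided by firmware; this sketch does not guess at it
        raise ValueError(f"unsupported UCLK mode: {uclk_mode}")
    return {"memclk_mhz": memclk, "uclk_mhz": uclk}


if __name__ == "__main__":
    for mode in ("UCLK==MEMCLK", "UCLK==MEMCLK/2"):
        print(mode, fabric_clocks(3200, mode))
```

With the divider enabled, a DDR4-3200 kit would leave UCLK at 800 MHz instead of 1600 MHz, which is where the bandwidth cost mentioned above comes from.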
Precision Boost Overdrive will receive more fine-grained control at the BIOS level, and AMD is making significant changes to this feature to make the boost setting more flexible and improve the algorithm. Early adopters of AGESA Combo 0.0.7.x on AMD 400-series chipset motherboards noticed that PBO broke or became buggy on their machines. This is because of poor integration of the new PBO algorithm with the existing one compatible with "Pinnacle Ridge." AMD also implemented "Core Watchdog", a feature that resets the system in case address or data errors destabilize the machine.
The "Matisse" processor will also provide users with finer control over active cores. Since the AM4 package has two 8-core chiplets, you will have the option to disable an entire chiplet, or adjust the core-count in decrements of 2, since each 8-core chiplet consists of two 4-core CCX (compute complexes), much like existing AMD designs. At the chiplet-level you can dial down core counts from 4+4 to 3+3, 2+2, and 1+1, but never asymmetrically, such as 4+0 (which was possible on first-generation Zen). AMD is synchronizing CCX core counts for optimal utilization of L3 cache and memory access. For the 64-core Threadripper that has eight 8-core chiplets, you will be able to disable chiplets as long as you have at least two chiplets enabled.
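To make the symmetric core-count rule concrete, here is a minimal Python sketch that enumerates the configurations the paragraph above describes. The function `core_layouts` and its parameters are hypothetical names for illustration; the sketch assumes every enabled chiplet keeps both CCXs active with the same number of cores per CCX, which is how we read the BIOS options.

```python
def core_layouts(chiplets: int = 2,
                 ccx_per_chiplet: int = 2,
                 max_cores_per_ccx: int = 4,
                 min_active_chiplets: int = 1) -> list:
    """List (active_chiplets, cores_per_ccx, total_cores) tuples.

    Assumption: core counts are always symmetric across CCXs (4+4, 3+3,
    2+2, 1+1), so asymmetric layouts such as 4+0 are never generated.
    """
    layouts = []
    for active in range(chiplets, min_active_chiplets - 1, -1):
        for per_ccx in range(max_cores_per_ccx, 0, -1):
            total = active * ccx_per_chiplet * per_ccx
            layouts.append((active, per_ccx, total))
    return layouts


if __name__ == "__main__":
    # Two-chiplet AM4 package
    for active, per_ccx, total in core_layouts():
        print(f"{active} chiplet(s), {per_ccx}+{per_ccx} per chiplet: {total} cores")
```

Calling `core_layouts(chiplets=8, min_active_chiplets=2)` models the eight-chiplet case with the "at least two chiplets enabled" restriction.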
CAKE, or "coherent AMD socket extender," has received an additional setting, namely "CAKE CRC performance Bounds". AMD is implementing IFOP (Infinity Fabric On Package), the non-socketed version of IF, in three places on the "Matisse" MCM. The I/O controller die has 100 GB/s IFOP links to each of the two 8-core chiplets, and another 100 GB/s IFOP link connects the two chiplets to each other. For multi-socket implementations of "Zen 2," AMD will provide NUMA node controls, namely "NUMA nodes per socket," with options including "NPS0", "NPS1", "NPS2", "NPS4" and "Auto".
With "Zen 2," AMD is introducing a couple of major new DCT-level features. The first one is called "DRAM Map Inversion," with options including "Disabled", "Enabled" and "Auto". The motherboard vendor's description of this option reads: "Properly utilize the parallelism within a channel and DRAM device. Bits that flip more frequently should be used to map resources of greater parallelism within the system." Another is "DRAM Post Package Repair," with options including "Enabled", "Disabled", and "Auto." This new special mode (which is a JEDEC standard) lets the memory manufacturer increase DRAM yields by selectively disabling bad memory cells and automatically replacing them with working ones from a spare area, similar to how storage devices map out bad sectors. We're not sure why such a feature is being exposed to end-users, especially in the client segment. Perhaps it will be removed on production motherboards.
We've also come across an interesting option related to the I/O controller that lets you select PCI-Express generation up to "Gen 4.0". This could indicate that some existing 400-series chipset motherboards could receive PCI-Express Gen 4.0, given that we're examining a 400-series chipset motherboard's firmware. We've heard through credible sources that AMD's PCIe Gen 4.0 implementation involves the use of external re-driver devices on the motherboard. These don't come cheap. Texas Instruments sells Gen 3.0 redrivers for $1.50 apiece in 1,000-unit reel quantities. Motherboard vendors will have to fork out at least $15-20 on Socket AM4 motherboards with Gen 4.0 slots, given that you need 20 of these redrivers, one per lane. We've come across several other common controls, including "RCD Parity" and "Memory MBIST" (a new memory self-test program).
One of the firmware setup program pages is titled "SoC Miscellaneous Control," and includes the following settings, many of which are industry-standard:
- DRAM Address Command Parity Retry
- Max Parity Error Replay
- Write CRC Enable
- DRAM Write CRC Enable and Retry Limit
- Max Write CRC Error Replay
- Disable Memory Error Injection
- DRAM UECC Retry
- ACPI Settings:
o ACPI SRAT L3 Cache As NUMA Domain
o ACPI SLIT Distance Control
o ACPI SLIT remote relative distance
o ACPI SLIT virtual distance
o ACPI SLIT same socket distance
o ACPI SLIT remote socket distance
o ACPI SLIT local SLink distance
o ACPI SLIT remote SLink distance
o ACPI SLIT local inter-SLink distance
o ACPI SLIT remote inter-SLink distance
- CLDO_VDDP Control
- Efficiency Mode
- Package Power Limit Control
- DF C-states
- Fixed SOC P-state
- CPPC
- 4-link xGMI max speed
- 3-link xGMI max speed
73 Comments on AMD Ryzen 3000 "Zen 2" BIOS Analysis Reveals New Options for Overclocking & Tweaking
As you move down to the 100fps range the differences become much less worthwhile, especially in 1440p or 4K. The only way you will see 1440p or 4K showing a bigger difference is if the GPU can basically run the game with ease and max it out to where the CPU starts to become the bottleneck. Though it's important to note that this gap widens rather gradually as you move to higher FPS, so that's another aspect people tend to ignore. It's not like a cutoff where the load shifts to the CPU and the higher IPC advantage suddenly appears.
But to address your main point, yes, 700 MHz is not a small amount by any means. That's 16% higher than the 2700X (4.3 GHz) and on an extremely refined 14++++++ process that can sustain high clock speeds closer to advertised max. Also account for the other 4-5% IPC advantage that Intel still has, and there you have at least a 20% single thread/per core advantage.
So I made a clear point that you can, regardless of Refresh Rate, regardless if its 1080p, 1440p or 2160p. What ever is possible on Intel CPUs are possible on Ryzen CPUs. You may not get identical frame rates, but surely they are more than playable on Ryzen CPUs. That is the Misconception I am talking about. I can care less about metric ton of data available, because I speak of facts.
I took myself as an example, because people (Intel Fanboys) called me a liar in a couple other forums for claiming my particular setup (1700X & RX 580 8GB) was incapable of running 1440p even at moderate to low picture quality settings. :roll: When I told them I averaged around 70 to 144 FPS on Ultra High Settings, the trolling started of course.
NOW Do you see the utter nonsense being spread across the internet?
You probably know this site, quite popular to compare stuff.
UserBenchmark
The reason you may see that a lot is because of the places you visit; enthusiast forums and especially gamers will favor Intel. Don't mistake a niche for mainstream. In mainstream, people barely have an idea what they want, and if someone they trust tells them Ryzen is good, they go with it, just as they would go with someone telling them Intel's the way.
There is also of course the brand awareness, that is overall a bit higher for Intel. Uphill battle sure, but not the way you put it.
But another thing. There are also examples, and they are not few and far between, where you actually need that fast Intel CPU with an OC. This goes for games like Total War, many strategy games that go towards endgame, but also builder/survival games with huge worlds and any (older) title that leans heavily on single thread. MMOs are another good example of games that really like every MHz you throw at them. It pays off big time and Ryzen actually does fall short: it will dive under 60 FPS faster and more readily. This is further exaggerated by the problem of the last decade, which is that GPUs have progressively become faster every gen by as much as 30%, while CPU performance stagnated entirely on single thread. This increases the need for a top-end CPU to support fast GPUs. Another aspect people tend to forget is that you also get higher FPS across the board out of your GPU at all levels of performance. It's minor, but it's there. While your average FPS may well be above 100, the minimums never are, and having a lot of headroom counts in those situations where the FPS takes a nosedive. This effect gets greater with faster GPUs, so even at 4K you will see an advantage from more CPU grunt.
Socket 754 was a weird one though, like it was a test for the 1st CPU to integrate the Memory Controller. Or the budget socket which serves people quite well due to the lower prices.
Vega 56 has 70% usage, while my cpu has 50 ~ 60 usage.
I want to get 240fps constant for sync and low inputlag with my 240hz monitor.
I am hopeful that this zen2 R7 3700X will push millions of data to my Vega, and I can get 240fps fluid!
Socket 754 didn't have an appealing selection of processors after socket 939 arrived.
Anyhow, since we are on the topic of ZEN2, I found this rather silly image off Google Images when I searched.
And Found This too. What can we confirm about this one? Is there any truth to it. The interesting part is that the PCIe will be on each chiplet it seems and with a 100-150 GB/s for Infinity Fabric 2.
It is the same thing as " Nvidia has STABLE graphics driver".
And something like:
" Memory problems on Intel platform? Memory problem.
Memory problems on AMD platform? AMD problem. "
These kinds of misconceptions are floating around and never seem to stop.
- DDR4 controllers, IF2 are confirmed.
- Latency issues are claimed to be resolved, we will have to see.
- IF2 is tied to memory clock but it does have divider now.
Chiplet:
- IF2 yes.
- 4 cores per CCX, not 8.
- PCI-e per CCX is unlikely.
Edit:
The first image is messy and there are too many IF links. 5 links per die? I would expect them to still go with three.
I don't see why they would need to have it tied if there is a cache in the I/O chip. It could be for latency/sync reasons I suppose...
I don't think sheer bandwidth numbers will do the trick here if they mean more latency at the same time.
They've got 15% to make up to the 8600K/9600K in gaming performance, and they're not gonna do that by going 8c->12c and 4.2 GHz->4.7/4.8 GHz.
Different workloads have different performance characteristics, and therefore also different effects from architectural improvements. Zen 2 offers several improvements, including front-end changes, doubling of float throughput (AVX workloads), and more. Gaming performance will probably not benefit a lot from improvements in Infinity Fabric, nothing from the increased float throughput, perhaps some from cache improvements, but a lot from front-end changes (if they are substantial). For branching-heavy code like gaming, front-end changes can sometimes even help more than the average performance gain.
There is one important performance characteristic about gaming performance though; the CPU just needs to be fast enough not to bottleneck the GPU, so scaling forever here is actually pointless. We see it clearly with Skylake-based CPUs; for every 100 MHz beyond 4 GHz, the gains are decreasing. And even some of the lower clocked Coffee Lakes do very well vs. higher clocked Zen+ in gaming. But this is actually good news for AMD, as they don't have to be completely on par with Intel to do a good job. I would say if they get the performance gap in gaming down to 2-3% it will be perfectly fine for normal enthusiasts, but if it's in the range of 8-10%, then combining these CPUs with an expensive GPU will quickly be a waste of money.
One quick note on core count. When Zen 2 arrives, I expect we get another wave of "but this have more cores, so it must be better (in the long run)". For synchronous workloads like gaming, more cores will not compensate for slower cores, and that's not going to change anytime soon.
Do a little Google searching, you’ll find it’s up to 20-30% in some games, esp min FPS
wccftech.com/gigabyte-bios-update-pcie-4-am4-motherboards-amd-ryzen-3000-cpus/
I see 1.0.0.1 or 2 appearing on other boards, so I think folks should wait for that one at least.