• Welcome to TechPowerUp Forums, Guest! Please check out our forum guidelines for info related to our community.

AMD's Ryzen Cache Analyzed - Improvements; Improveable; CCX Compromises

Joined
Sep 15, 2007
Messages
3,946 (0.63/day)
Location
Police/Nanny State of America
Processor OCed 5800X3D
Motherboard Asucks C6H
Cooling Air
Memory 32GB
Video Card(s) OCed 6800XT
Storage NVMees
Display(s) 32" Dull curved 1440
Case Freebie glass idk
Audio Device(s) Sennheiser
Power Supply Don't even remember
I am going to wait to see Ryzen 5 in action. We have not concrete information about those chips and how they OC. Overclocking 8 cores is a different animal than overclocking 4 cores...historically at least. And I don't think the limit in OC is entirely the architecture, but we will find out.

I assume the arch is fine. It's the LPP process that wasn't intended for such clocks.
 
Joined
Dec 21, 2015
Messages
42 (0.01/day)
AIDA64 Build 5.80.4089 shows much better reults for Ryzen now.
https://forums.aida64.com/topic/3768-aida64-compatibility-with-amd-ryzen-processors/

 
Joined
Jan 17, 2006
Messages
932 (0.14/day)
Location
Ireland
System Name "Run of the mill" (except GPU)
Processor R9 3900X
Motherboard ASRock X470 Taich Ultimate
Cooling Cryorig (not recommended)
Memory 32GB (2 x 16GB) Team 3200 MT/s, CL14
Video Card(s) Radeon RX6900XT
Storage Samsung 970 Evo plus 1TB NVMe
Display(s) Samsung Q95T
Case Define R5
Audio Device(s) On board
Power Supply Seasonic Prime 1000W
Mouse Roccat Leadr
Keyboard K95 RGB
Software Windows 11 Pro x64, insider preview dev channel
Benchmark Scores #1 worldwide on 3D Mark 99, back in the (P133) days. :)
Is this some beta d/l as my AIDA is not offering an update from 5.08.40 at the moment? Thanks!

And "2 x Octal core" doesn't seem right. When a 4790k says "quadcore" so maybe it needs some more fixing. :)
 
Last edited:
Joined
Jan 17, 2006
Messages
932 (0.14/day)
Location
Ireland
System Name "Run of the mill" (except GPU)
Processor R9 3900X
Motherboard ASRock X470 Taich Ultimate
Cooling Cryorig (not recommended)
Memory 32GB (2 x 16GB) Team 3200 MT/s, CL14
Video Card(s) Radeon RX6900XT
Storage Samsung 970 Evo plus 1TB NVMe
Display(s) Samsung Q95T
Case Define R5
Audio Device(s) On board
Power Supply Seasonic Prime 1000W
Mouse Roccat Leadr
Keyboard K95 RGB
Software Windows 11 Pro x64, insider preview dev channel
Benchmark Scores #1 worldwide on 3D Mark 99, back in the (P133) days. :)
That's a bit cheeky to not put it out for Extreme!

Thanks Dave.
 
Joined
Jan 17, 2006
Messages
932 (0.14/day)
Location
Ireland
System Name "Run of the mill" (except GPU)
Processor R9 3900X
Motherboard ASRock X470 Taich Ultimate
Cooling Cryorig (not recommended)
Memory 32GB (2 x 16GB) Team 3200 MT/s, CL14
Video Card(s) Radeon RX6900XT
Storage Samsung 970 Evo plus 1TB NVMe
Display(s) Samsung Q95T
Case Define R5
Audio Device(s) On board
Power Supply Seasonic Prime 1000W
Mouse Roccat Leadr
Keyboard K95 RGB
Software Windows 11 Pro x64, insider preview dev channel
Benchmark Scores #1 worldwide on 3D Mark 99, back in the (P133) days. :)
The issue is I'm perfectly up for beta testing (and do some in other areas too), maybe they should add an opt-in for that.

In other news, my Asrock X370 (pro gaming) ships today.
 

Enlightnd

New Member
Joined
Mar 17, 2017
Messages
2 (0.00/day)
Question, I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPU's on the chipset PCIe lanes vs GPU's on the CPU embedded PCIe lanes?

(EDIT: Fix CPU lanes with PCIe lanes)
 
Last edited:
Joined
Sep 2, 2011
Messages
1,019 (0.21/day)
Location
Porto
System Name No name / Purple Haze
Processor Phenom II 1100T @ 3.8Ghz / Pentium 4 3.4 EE Gallatin @ 3.825Ghz
Motherboard MSI 970 Gaming/ Abit IC7-MAX3
Cooling CM Hyper 212X / Scythe Andy Samurai Master (CPU) - Modded Ati Silencer 5 rev. 2 (GPU)
Memory 8GB GEIL GB38GB2133C10ADC + 8GB G.Skill F3-14900CL9-4GBXL / 2x1GB Crucial Ballistix Tracer PC4000
Video Card(s) Asus R9 Fury X Strix (4096 SP's/1050 Mhz)/ PowerColor X850XT PE @ (600/1230) AGP + (HD3850 AGP)
Storage Samsung 250 GB / WD Caviar 160GB
Display(s) Benq XL2411T
Audio Device(s) motherboard / Creative Sound Blaster X-Fi XtremeGamer Fatal1ty Pro + Front panel
Power Supply Tagan BZ 900W / Corsair HX620w
Mouse Zowie AM
Keyboard Qpad MK-50
Software Windows 7 Pro 64Bit / Windows XP
Benchmark Scores 64CU Fury: http://www.3dmark.com/fs/11269229 / X850XT PE http://www.3dmark.com/3dm05/5532432
Question, I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPU's on the chipset PCIe lanes vs GPU's on the CPU embedded CPU lanes?

That would be a cool thing to test!
 
Joined
Mar 17, 2017
Messages
5 (0.00/day)
this is windows load balancing working like it id on nehalems and first gen skylakes

basicly windows treats ryzen as a massive 16 core cpu instead of 8c 16t

The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.
 
Joined
Sep 2, 2011
Messages
1,019 (0.21/day)
Location
Porto
System Name No name / Purple Haze
Processor Phenom II 1100T @ 3.8Ghz / Pentium 4 3.4 EE Gallatin @ 3.825Ghz
Motherboard MSI 970 Gaming/ Abit IC7-MAX3
Cooling CM Hyper 212X / Scythe Andy Samurai Master (CPU) - Modded Ati Silencer 5 rev. 2 (GPU)
Memory 8GB GEIL GB38GB2133C10ADC + 8GB G.Skill F3-14900CL9-4GBXL / 2x1GB Crucial Ballistix Tracer PC4000
Video Card(s) Asus R9 Fury X Strix (4096 SP's/1050 Mhz)/ PowerColor X850XT PE @ (600/1230) AGP + (HD3850 AGP)
Storage Samsung 250 GB / WD Caviar 160GB
Display(s) Benq XL2411T
Audio Device(s) motherboard / Creative Sound Blaster X-Fi XtremeGamer Fatal1ty Pro + Front panel
Power Supply Tagan BZ 900W / Corsair HX620w
Mouse Zowie AM
Keyboard Qpad MK-50
Software Windows 7 Pro 64Bit / Windows XP
Benchmark Scores 64CU Fury: http://www.3dmark.com/fs/11269229 / X850XT PE http://www.3dmark.com/3dm05/5532432
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

I think NUMA would require a separate memory controller for each CCX, which is shared between ccx's on ryzen. But yeah, somewhat of an hybrid thing would be the real deal. For now lets hope that 4000MHz memory support gets there...
 
Joined
Mar 20, 2017
Messages
13 (0.00/day)
Question, I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPU's on the chipset PCIe lanes vs GPU's on the CPU embedded PCIe lanes?

(EDIT: Fix CPU lanes with PCIe lanes)

GPUs can only use the lanes on the Ryzen CPUs, they don't connect to the Southbridge. So 16x or 8x/8x, off the CPU.

The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

I'm wondering if the higher speed of copy operations on the L3 was specifically tweaked to speed up copies between the two L3s, allowing both CCXs to work from the same data after copying things over, if that would even help... but looks like the new version of AIDA makes this whole CCX intercommunication "bug" a non-issue.

Naples has a ton of PCIe lanes connecting two sockets together on dual socket configs. Somewhere at AMD there must have been people who worked on intercommunication between the 2 CCXs. I don't buy the theory that AMD simply dropped the ball and put out a chip with a glaring architectural flaw. If there are limitations of Ryzen I expect to find compromises that were made after intense discussion. Although they don't have a foundry, they do have the ability to do limited production in house for testing and research purposes. It really feels like people are way underestimating AMD and the quality of their product.
 
Last edited by a moderator:

Enlightnd

New Member
Joined
Mar 17, 2017
Messages
2 (0.00/day)
I wonder if that is accurate (about the PCIe lanes). I'm in conversation on IRC with several people using pass-trough (for virtualization) and they are explicitly speaking about the issues they have between GPU's on the CPU based bus and ones on a chipset hosted PCIe slot. Seems some boards have crappy IOMMU groupings causing weirdness with GPUs.
 
Joined
Mar 20, 2017
Messages
13 (0.00/day)
I wonder if that is accurate (about the PCIe lanes). I'm in conversation on IRC with several people using pass-trough (for virtualization) and they are explicitly speaking about the issues they have between GPU's on the CPU based bus and ones on a chipset hosted PCIe slot. Seems some boards have crappy IOMMU groupings causing weirdness with GPUs.

edit: Sorry, I didn't read your post carefully enough. I'll leave the pic up though, maybe someone will find it useful. But, yea, I have no idea what those guys on IRC are talking about. Aren't they mistaken in thinking that one of their GPUs is running off the chipset?



Taken from
https://rog.asus.com/articles/techn...platform-and-its-x370-b350-and-a320-chipsets/
 
Joined
Mar 23, 2005
Messages
4,088 (0.57/day)
Location
Ancient Greece, Acropolis (Time Lord)
System Name RiseZEN Gaming PC
Processor AMD Ryzen 7 5800X @ Auto
Motherboard Asus ROG Strix X570-E Gaming ATX Motherboard
Cooling Corsair H115i Elite Capellix AIO, 280mm Radiator, Dual RGB 140mm ML Series PWM Fans
Memory G.Skill TridentZ 64GB (4 x 16GB) DDR4 3200
Video Card(s) ASUS DUAL RX 6700 XT DUAL-RX6700XT-12G
Storage Corsair Force MP500 480GB M.2 & MP510 480GB M.2 - 2 x WD_BLACK 1TB SN850X NVMe 1TB
Display(s) ASUS ROG Strix 34” XG349C 180Hz 1440p + Asus ROG 27" MG278Q 144Hz WQHD 1440p
Case Corsair Obsidian Series 450D Gaming Case
Audio Device(s) SteelSeries 5Hv2 w/ Sound Blaster Z SE
Power Supply Corsair RM750x Power Supply
Mouse Razer Death-Adder + Viper 8K HZ Ambidextrous Gaming Mouse - Ergonomic Left Hand Edition
Keyboard Logitech G910 Orion Spectrum RGB Gaming Keyboard
Software Windows 11 Pro - 64-Bit Edition
Benchmark Scores I'm the Doctor, Doctor Who. The Definition of Gaming is PC Gaming...
AMD will tighten up this L3 Latence. It will get better and better.
 
Joined
Jan 17, 2006
Messages
932 (0.14/day)
Location
Ireland
System Name "Run of the mill" (except GPU)
Processor R9 3900X
Motherboard ASRock X470 Taich Ultimate
Cooling Cryorig (not recommended)
Memory 32GB (2 x 16GB) Team 3200 MT/s, CL14
Video Card(s) Radeon RX6900XT
Storage Samsung 970 Evo plus 1TB NVMe
Display(s) Samsung Q95T
Case Define R5
Audio Device(s) On board
Power Supply Seasonic Prime 1000W
Mouse Roccat Leadr
Keyboard K95 RGB
Software Windows 11 Pro x64, insider preview dev channel
Benchmark Scores #1 worldwide on 3D Mark 99, back in the (P133) days. :)
edit: Sorry, I didn't read your post carefully enough. I'll leave the pic up though, maybe someone will find it useful. But, yea, I have no idea what those guys on IRC are talking about. Aren't they mistaken in thinking that one of their GPUs is running off the chipset?

No, they are not mistaken, you could for example have 3 GPUs in there.

2 from the CPU and one from the chipset (with the associated latency).

In fact what a lot of the folks using VM want to do is have all 3 cards in separate I/O groups so you can e.g. have one card for your host O/S and the others each dedicated to a VM.
If the groups/UEFI are right, you could have a slower card off the chipset and have that as the host OSes' card (boot graphics) and then two powerful cards connected to the VMs or whatever.
 
Joined
Sep 26, 2006
Messages
475 (0.07/day)
The day there is 4GHz ram, 4GHz chip and a nice high capacity (64GB sounds nice) I will be throwing cash at AMD.
 
Joined
Mar 23, 2005
Messages
4,088 (0.57/day)
Location
Ancient Greece, Acropolis (Time Lord)
System Name RiseZEN Gaming PC
Processor AMD Ryzen 7 5800X @ Auto
Motherboard Asus ROG Strix X570-E Gaming ATX Motherboard
Cooling Corsair H115i Elite Capellix AIO, 280mm Radiator, Dual RGB 140mm ML Series PWM Fans
Memory G.Skill TridentZ 64GB (4 x 16GB) DDR4 3200
Video Card(s) ASUS DUAL RX 6700 XT DUAL-RX6700XT-12G
Storage Corsair Force MP500 480GB M.2 & MP510 480GB M.2 - 2 x WD_BLACK 1TB SN850X NVMe 1TB
Display(s) ASUS ROG Strix 34” XG349C 180Hz 1440p + Asus ROG 27" MG278Q 144Hz WQHD 1440p
Case Corsair Obsidian Series 450D Gaming Case
Audio Device(s) SteelSeries 5Hv2 w/ Sound Blaster Z SE
Power Supply Corsair RM750x Power Supply
Mouse Razer Death-Adder + Viper 8K HZ Ambidextrous Gaming Mouse - Ergonomic Left Hand Edition
Keyboard Logitech G910 Orion Spectrum RGB Gaming Keyboard
Software Windows 11 Pro - 64-Bit Edition
Benchmark Scores I'm the Doctor, Doctor Who. The Definition of Gaming is PC Gaming...
The day there is 4GHz ram, 4GHz chip and a nice high capacity (64GB sounds nice) I will be throwing cash at AMD.
Seeing how Ram Speed makes a huge performance difference in Ryzen, yes Agreed.
 
Joined
Sep 26, 2014
Messages
68 (0.02/day)
Location
sydney australia
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

What an interesting suggestion.

Your paradigm of splitting, for coding purposes, the 8 cores into discrete 4 core ccxS & 8MB L3 cache blocks. & then minimising interaction between them, could speed some apps considerably.

I am a newb~, but i mused similarly in the context of a poor mans vega pro ssg (a 16GB $5000+ Vega w/ an onboard 4x 960 pro raid array).

if you install an Affordable 8 lane vega and an 8 lane 2x nvme adapter, so both link to the same 16 lane ccx (as a 16 lane card does e.g.) , then the gpu and the 2x nvme raid array may be able to talk very directly, and ~share the same 8MB cpu L3 cache. It doesnt bypass the shared pcie bus like Vega SSG, but it could be minimal latency, and enhanced by specialised large block size formatting for; swapping, workspace, temp files and graphics.

Vega 56/64 of course, have a dedicated HBCC subsystem for such gpu cache extension using nvme arrays. Done right, it promises a pretty good illusion of ~unlimited gpu memory/address space. Cool indeed.

As you see, a belated post from me. We now have evidence in the perf figures of single ccx zen/vega apuS. Yes, inter ccx interconnects have dragged Ryzen ~IPC down.
 
Joined
Sep 15, 2007
Messages
3,946 (0.63/day)
Location
Police/Nanny State of America
Processor OCed 5800X3D
Motherboard Asucks C6H
Cooling Air
Memory 32GB
Video Card(s) OCed 6800XT
Storage NVMees
Display(s) 32" Dull curved 1440
Case Freebie glass idk
Audio Device(s) Sennheiser
Power Supply Don't even remember
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

Devs probably won't have a choice. It's only a matter of time before intel announces their copy of Ryzen.
 
Top