
DDR5 CUDIMM Explained & Benched

That makes sense. I have the hypervisor enabled because I have Docker running, and there doesn't seem to be a workaround to run without it. Dropping from 100+ ns to the mid-70s without real effort makes me question why Intel doesn't set D2D speeds higher, unless I missed stability issues in my testing.

Even with these tweaks it will not be faster than the 9800X3D.
 
@tfp While I only have one CPU to test this on, I've found that the highest possible D2D ratio changes with memory speed and can also require extra voltage on the VNNAON rail. For example, I can run 34x at 6400 without additional voltage, 32x at 8800, and 28x at 9600. Default is 26x.

Raising the voltage to 0.9 V gives me 2x more, and 1.0 V causes boot issues.

I wouldn't be surprised if the max for each CPU is different, so 26x is what Intel deems safe for all CPUs.
 
The bandwidth increase of 46% over 6400 MT/s is nice. But the benchmarks are missing the elephant in the room: inferencing. The AI age is here (well, it's just algorithms, not actual AI, but it's a nice marketing term).
Power consumption measurements would have been nice. If I remember correctly, I very recently saw an article here on TPU that did exactly this, measuring only the RAM power consumption.
 
Testing the TeamGroup 8800 MT/s CUDIMM kit now in bypass mode on an AM5 X670E Gene. I have two stable baseline profiles and will tighten timings and lower/adjust voltages over the next few days.

The CUDIMM kits work very well on AM5 in bypass mode.

View attachment 392633

View attachment 392634

View attachment 392635
@Braegnok
What is the performance increase with bypass mode?
Did you have to do a lot of tweaks to get it stable?
I am considering building a system with an MSI 870E Carbon WIFI and using CUDIMM, but I don't know if it will be stable.
MSI says they support CUDIMM in bypass mode.


Hello crypto9, welcome to TPU.

I did not see any performance increase running a DDR5 CUDIMM kit in bypass mode vs. a DDR5 UDIMM kit on my AM5 system, as both kits rely on the CPU for clock signals. However, on an Intel ARL system, a CUDIMM kit in pass-through mode can offer potential for higher speeds due to improved signal integrity, and perhaps improved stability at high frequency vs. a UDIMM kit.

The CUDIMM kit was fully stable on my AM5 system and was plug and play: on auto/default it runs at 3200 MT/s, 1.10 V. Overclocking in bypass mode does require some skill, tweaking, and a strong CPU IMC on AM5 for a stable profile, the same as with UDIMM, with no advantage to using the CUDIMM kit.

I bought the 8800 MT/s CUDIMM kit thinking it would have highly binned ICs and OC very well in 1:1 profiles on AM5, but it turns out the ICs are good, not great.

You will be much better off buying the G.SKILL Trident Z5 Neo 6000CL26 1.40v kit for your MSI 870 Carbon WIFI system. The Hynix A-Die ICs on the 6000CL26 1.40v kit are great! :)

 
Forgive my ignorance, but what's the difference?
When the clock signal arrives at the destination, that is, the DRAM chips, it's already a bit corrupted because it's been affected by other high frequency signals on the PCB and the connectors. As a consequence, the DRAM chips can't precisely recognise the zero-one-zero transitions every 125 ps (if the clock is 4000 MHz and the speed is 8000 MT/s). They sometimes see, say, 120 ps or 130 ps. Doesn't seem much but when you have the entire system running close to its limits, it makes reading/writing/bus transfers less reliable. An amplifier can't solve that problem - certainly not completely. A CKD chip can. It re-generates (generates again) the 4000 MHz clock, which is precisely synchronised (locked, that's what phase-locked loop or PLL does) to the incoming clock, but different because it's clean. DRAM chips can then recognise the transitions every 125 ps with very little variation.
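To make the 125 ps figure concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the 120 ps sample is just the example value from the post above, not a measurement.

```python
def clock_mhz(data_rate_mts: float) -> float:
    """DDR transfers data on both clock edges, so the clock is half the data rate."""
    return data_rate_mts / 2.0


def unit_interval_ps(data_rate_mts: float) -> float:
    """Time between consecutive clock transitions (one bit time), in picoseconds."""
    # MT/s -> transfers per second, then invert and convert seconds to ps.
    return 1e12 / (data_rate_mts * 1e6)


def timing_error_fraction(observed_ps: float, data_rate_mts: float) -> float:
    """How far an observed edge-to-edge time is from ideal, as a fraction of one bit time."""
    ideal = unit_interval_ps(data_rate_mts)
    return abs(observed_ps - ideal) / ideal


if __name__ == "__main__":
    for rate in (6400, 8000, 8800, 9600):
        print(f"{rate} MT/s: clock {clock_mhz(rate):.0f} MHz, "
              f"bit time {unit_interval_ps(rate):.1f} ps")
    # The 120 ps example from the post, at 8000 MT/s.
    print(f"120 ps edge at 8000 MT/s is off by "
          f"{timing_error_fraction(120.0, 8000) * 100:.1f}% of a bit time")
```

The same 5 ps of error eats a larger share of the bit time as the data rate goes up, which is why it matters more at 8000+ MT/s than at 6400.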
 
Ok. Doesn't explain how CKD can figure out the correct timing from the garbled signal, but it explains the nuance I was missing (amplification vs regeneration).
 
@bug this might help. From my understanding, the CKD isn't really in control of the clock, just taking the signal and making a new one. A fancy redriver, not just a dumb amplifier you get in the basic redriver.

From JEDEC CKD
The CKD is provided a clock by the entity that is controlling it, i.e., memory controller, or any test equipment. This clock is used by the CKD to generate all the IO specific timing, and is the same clock used by the controlling entity as the one and only deterministic source of all the response timings to and from the CKD. This allows the memory controller to have a deterministic control of every event in the CKD.

Spread Spectrum Clocking (SSC) Capability

The system platform uses a reference clock, which is used to synthesize the CKD clock. Spread Spectrum Clock (SSC) with up to 0.5% down-spread in frequency must be supported by the clocking system. The frequency of the reference clock, and therefore bit rate, can be modulated from 0% to –0.5% of the nominal data rate/frequency at a modulation rate in the range of 30 KHz to 33 KHz. The modulation profile of SSC must provide optimal or close to optimal EMI reduction. Typical profiles include a triangular profile. The CKD must ensure that it functions normally even in the presence of SSC and truthfully lets SSC related components pass through to its output signals.
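For anyone wondering what a triangular down-spread looks like in numbers, here is a rough Python sketch. It only models the ideal profile described in the quote; the 4000 MHz nominal clock and the 31.5 kHz modulation rate are assumptions picked for illustration (the quote only gives the 30 to 33 kHz range), and the function name is hypothetical.

```python
def ssc_frequency_mhz(t_us: float, nominal_mhz: float,
                      spread: float = 0.005, mod_khz: float = 31.5) -> float:
    """Instantaneous clock frequency under an ideal triangular down-spread.

    spread=0.005 is the 0.5% maximum down-spread from the quote.
    mod_khz=31.5 is an assumed value inside the quoted 30-33 kHz range.
    """
    period_us = 1000.0 / mod_khz              # one modulation cycle in microseconds
    phase = (t_us % period_us) / period_us    # position within the cycle, 0..1
    # Triangle wave: 0 at the start, 1 at mid-cycle, back to 0 at the end.
    tri = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)
    return nominal_mhz * (1.0 - spread * tri)


if __name__ == "__main__":
    period_us = 1000.0 / 31.5
    for frac in (0.0, 0.25, 0.5, 0.75):
        f = ssc_frequency_mhz(frac * period_us, 4000.0)
        print(f"{frac:.2f} of a modulation cycle: {f:.1f} MHz")
```

The clock never goes above nominal, it only dips by up to 0.5% and comes back, and per the quote the CKD has to track that slow wander and pass it through rather than filter it out.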
 
I went Gemini on this.
To answer my own question: this is not generating proper clocks from a garbled signal at all. It is just a redriver, taking a signal that is not garbled, but too weak/noisy to reach the individual DRAM chips in good condition, and regenerating it.
 
@bug I suggest taking a look at the JEDEC white paper yourself instead of trusting AI for an answer. My interpretation is slightly different. But I am not an electrical engineer either...

The quoted part is from the paper and indicates that it is generating a new clock from what it receives. However, I wouldn't consider it "garbled" either, like Wirko suggested. It's not a magic chip, though it has to have some tolerance for what it's receiving in order to lock on.
 
That is what a redriver does. The original signal is "driven", the redriver "drives" it again.
I am (supposed to be) an electrical engineer, but I didn't get that from the bit you selected. I also don't trust Gemini/GPT/Claude blindly, but that answer made sense. Normally my first stop would be the official docs, but I didn't know whether they're available for free. And tbh, I don't have a lot of time to spare atm either.
 
A driver is just an amplifier of the signal. A redriver makes a new signal from a previously generated one.

I think the question you're asking is: how bad can the signal be before it can no longer "lock on"?

The answer is beyond my technical understanding of this stuff. Still, the clock signal cannot be completely "garbled" or the lock-on fails. The CKD in its basic form is generating a new signal from a weaker one it has received.

It's a bit funny to me that memory vendors' PR teams can't get this right. Maybe it's a mistranslation; the Kingston website is a good example of making it confusing to consumers. The CKD is making a new clock signal, but it is NOT the origin...

Google the JEDEC CKD doc and just register an email to download it. It's free, and that's how I got it.


 
My original question was with respect to @Wirko's assertion:
When the clock signal arrives at the destination, that is, the DRAM chips, it's already a bit corrupted because it's been affected by other high frequency signals on the PCB and the connectors. As a consequence, the DRAM chips can't precisely recognise the zero-one-zero transitions every 125 ps

The nuance that was missing is that the signal reaching the DIMM is usable enough for it to be "redriven", but not in good enough shape to make its way to the actual DRAM chips in this usable form.

Side note: for those not in the know, at those frequencies a 1 cm trace acts the same as kilometers of wire do for regular 50/60 Hz electrical current. Capacitance, attenuation, jitter, echo... the whole enchilada.
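A back-of-the-envelope way to check that comparison is to express the trace length as a fraction of the signal's wavelength and ask how long a wire would have to be to cover the same fraction at mains frequency. The Python sketch below does that. Everything numeric in it is an assumption for illustration: a 4 GHz clock, 50 Hz mains, an effective dielectric constant of 3.5 for the PCB trace, and free-space propagation for the mains wire. The function names are hypothetical too.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def wavelength_m(freq_hz: float, eps_eff: float = 1.0) -> float:
    """Wavelength of a signal in a medium with effective dielectric constant eps_eff."""
    return C_M_PER_S / (freq_hz * eps_eff ** 0.5)


def electrical_length(length_m: float, freq_hz: float, eps_eff: float = 1.0) -> float:
    """Physical length expressed as a fraction of one wavelength."""
    return length_m / wavelength_m(freq_hz, eps_eff)


def equivalent_length_m(length_m: float, f_high_hz: float, f_low_hz: float,
                        eps_high: float = 3.5, eps_low: float = 1.0) -> float:
    """Wire length at f_low_hz that spans the same wavelength fraction
    as length_m does at f_high_hz. eps_high=3.5 is an assumed PCB value."""
    fraction = electrical_length(length_m, f_high_hz, eps_high)
    return fraction * wavelength_m(f_low_hz, eps_low)


if __name__ == "__main__":
    frac = electrical_length(0.01, 4e9, 3.5)
    eq_km = equivalent_length_m(0.01, 4e9, 50.0) / 1000.0
    print(f"1 cm at 4 GHz is {frac:.2f} of a wavelength")
    print(f"Same fraction at 50 Hz: about {eq_km:.0f} km of wire")
```

Under those assumptions, 1 cm of trace is already around a quarter of a wavelength, which at 50 Hz corresponds to a wire on the order of a thousand kilometers or more. That is the regime where a conductor stops behaving like a simple wire and starts behaving like a transmission line.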
 
V-Color is releasing a 64 GB CUDIMM memory kit.
G.Skill announced one as well. Note that they all blank out the IC manufacturer. I want to say Micron, but the 2x64GB kit I have barely goes above 6400 as is. We will find out eventually.
 
Your numbers look good here. With your tREFI at 65535, how are your memory temps? That setting is very temperature sensitive; I believe at anything over 60°C you may start to see errors. You are using SR sticks and not dual rank, so you can push them a little bit more.
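For context on why tREFI at 65535 is temperature sensitive: the value is normally entered in memory clock cycles, so it translates to a real-time gap between refresh commands that depends on the memory speed. A minimal Python sketch of that conversion, assuming tREFI is counted in memory clock cycles (half the MT/s figure) and using example data rates that are not taken from the post:

```python
def trefi_us(trefi_cycles: int, data_rate_mts: float) -> float:
    """Average interval between refresh commands, in microseconds.

    Assumes tREFI is expressed in memory clock cycles, with the
    memory clock in MHz equal to half the data rate in MT/s.
    """
    mem_clock_mhz = data_rate_mts / 2.0
    return trefi_cycles / mem_clock_mhz   # cycles / (cycles per microsecond)


if __name__ == "__main__":
    for rate in (6000, 8000, 8800):       # example data rates only
        print(f"tREFI 65535 at {rate} MT/s: {trefi_us(65535, rate):.2f} us between refreshes")
```

Longer gaps between refreshes leave more time for the cells to leak charge, and leakage gets worse as the DRAM heats up, which is the reason for the temperature warning.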

I mostly don't run Windows; I use it only for Windows gaming.

spd5118-i2c-1-53
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +38.2°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)

spd5118-i2c-1-51
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +38.0°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)

Reference: https://www.renesas.com/en/products...solutions/spd5118-spd-hub-ddr5-memory-modules
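If you want to keep an eye on these readings while stress testing, a small script can pull the temperatures out of text shaped like the output above and flag anything over a limit. This is only a sketch: the parser is written against the exact layout pasted in this post, the 60 °C default comes from the rule of thumb quoted earlier in the thread, and the function names are made up.

```python
import re

# Sample text copied from the readings posted above.
SAMPLE = """\
spd5118-i2c-1-53
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +38.2°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)

spd5118-i2c-1-51
Adapter: SMBus PIIX4 adapter port 0 at 0b00
temp1: +38.0°C (low = +0.0°C, high = +55.0°C)
(crit low = +0.0°C, crit = +85.0°C)
"""

TEMP_RE = re.compile(r"temp1:\s*([+-]?\d+(?:\.\d+)?)")


def parse_spd5118_temps(text: str) -> dict:
    """Map each spd5118 chip name to its temp1 reading in degrees Celsius."""
    temps = {}
    chip = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("spd5118-"):
            chip = line                      # start of a new sensor block
        elif chip is not None:
            match = TEMP_RE.match(line)
            if match:
                temps[chip] = float(match.group(1))
    return temps


def over_limit(temps: dict, limit_c: float = 60.0) -> list:
    """Return the chips whose reading is above limit_c."""
    return [name for name, value in temps.items() if value > limit_c]


if __name__ == "__main__":
    readings = parse_spd5118_temps(SAMPLE)
    print(readings)
    print("Over limit:", over_limit(readings))
```

Both DIMMs here sit around 38 °C, well under that 60 °C rule of thumb.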
 
What would the best CUDIMMs be for a 285K? I would like to get 64 GB (2x32 GB sticks). What do you guys recommend? Right now I have 64 GB (2x32 GB) of 6400 MT/s UDIMM memory.
 