AMD Ryzen Memory Tweaking & Overclocking Guide

on Mar 20th, 2019, in Memory.

Manufacturer: AMD

Relations Between Memory Timings

DRAM access latency has become a critical bottleneck for system performance, because modern computers use DRAM as their main memory. While DRAM capacity has increased thanks to manufacturing process scaling, access latency has not decreased significantly for decades. Due to a combination of increasing core counts, increasingly data-intensive modern applications, and inherent limitations to increasing memory bandwidth, DRAM access latency is becoming a growing obstacle to improving overall system performance.

Fundamental DRAM Operations

There are five fundamental operations that need to be performed when accessing data in DRAM.

Activation opens one of the DRAM rows in a bank, and copies the data in the opened row into the row buffer.
Restoration ensures that the charge that is drained from each cell in the DRAM row during activation is restored to its full level, to prevent data loss.
Reads and writes can be performed once the data of an activated row is copied to the row buffer.
Precharge releases the data from the row buffer when the memory controller is done issuing reads and writes to the activated row, and prepares the bank to activate a different row.

Out of these, DRAM access latency is predominantly composed of the latency of three operations: activation, restoration, and precharge.
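The operations listed above can be sketched as a toy model of a single bank. This is only an illustration of the order in which things must happen, not of any real controller: the class name Bank, its methods, and the sizes used are hypothetical, and restoration is folded into activation because it has no command of its own.

```python
class Bank:
    """Toy model of one DRAM bank: an array of rows plus one row buffer."""

    def __init__(self, num_rows, row_size):
        self.rows = [[0] * row_size for _ in range(num_rows)]
        self.open_row = None    # index of the activated row, None when precharged
        self.row_buffer = None  # copy of the activated row's data

    def activate(self, row):
        # ACT: open a row and copy its data into the row buffer.
        if self.open_row is not None:
            raise RuntimeError("bank must be precharged before another ACT")
        self.open_row = row
        self.row_buffer = list(self.rows[row])

    def read(self, col):
        # READ: only possible while a row is open.
        if self.open_row is None:
            raise RuntimeError("no row is activated")
        return self.row_buffer[col]

    def write(self, col, value):
        # WRITE: update the row buffer and, through it, the cells of the open row.
        if self.open_row is None:
            raise RuntimeError("no row is activated")
        self.row_buffer[col] = value
        self.rows[self.open_row][col] = value

    def precharge(self):
        # PRE: release the row buffer so that a different row can be activated.
        self.open_row = None
        self.row_buffer = None


bank = Bank(num_rows=4, row_size=8)
bank.activate(2)
bank.write(3, 0xAB)
print(bank.read(3))
bank.precharge()
```

Accessing a different row always costs a precharge followed by a new activation, which is why these operations dominate access latency.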

The picture above shows a timeline of the commands issued to perform a read (top) or a write (bottom) to a single cache line of data. The memory controller issues four commands: (1) ACT (activate), (2) READ or (3) WRITE, and (4) PRE (precharge). Note that restoration does not have an explicit command, and is instead triggered automatically after an ACT command. The time spent on each operation is dictated by a set of timing parameters that are determined by DRAM vendors. While each command operates at row granularity, for simplicity, we describe how the DRAM operations affect a single DRAM cell.

In the initial precharged state (1), the bitline is held at a voltage level of VDD/2, where VDD is the full DRAM supply voltage. The wordline is at 0 V, and therefore the bitline is disconnected from the capacitor. After the memory controller issues an ACT command (2), the wordline is raised to Vh, thereby connecting the DRAM cell capacitor to the bitline. As in this example the voltage of the capacitor is higher than that of the bitline, charge flows into the bitline, raising its voltage level up to VDD/2 + δ. This process is called charge sharing. The sense amplifier then measures the deviation on the bitline and amplifies that deviation correspondingly (3). This phase, referred to as sense amplification, eventually drives the voltage level of the bitline and the cell back to the original voltage state of the cell (VDD in this example).
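The size of δ follows from charge conservation between the cell capacitor and the bitline. Here is a minimal sketch of that calculation; the capacitance and voltage values are illustrative assumptions, not figures from this article or from any datasheet.

```python
def bitline_after_charge_sharing(v_cell, vdd, c_cell, c_bitline):
    """Bitline voltage once the cell capacitor is connected to a bitline
    precharged to VDD/2 (total charge is conserved)."""
    v_precharge = vdd / 2
    total_charge = c_cell * v_cell + c_bitline * v_precharge
    return total_charge / (c_cell + c_bitline)


# Assumed example values: 1.2 V supply, 24 fF cell, 144 fF bitline.
VDD = 1.2
C_CELL = 24e-15
C_BITLINE = 144e-15

v_full = bitline_after_charge_sharing(VDD, VDD, C_CELL, C_BITLINE)
delta_full = v_full - VDD / 2

# A cell that has leaked part of its charge produces a smaller deviation.
v_leaky = bitline_after_charge_sharing(0.9, VDD, C_CELL, C_BITLINE)
delta_leaky = v_leaky - VDD / 2

print(f"delta for a fully charged cell: {delta_full * 1000:.1f} mV")
print(f"delta for a partly leaked cell: {delta_leaky * 1000:.1f} mV")
```

The smaller the deviation, the longer the sense amplifier needs to amplify it, which is the link between a cell's charge level and the timings discussed later.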

As soon as the sense amplifier has sufficiently amplified the data on the bitline (e.g., the voltage level reaches 3VDD/4), the memory controller can issue a READ or WRITE command to access the cell's data in the row buffer. The time taken to reach this state (3) after the ACT command is specified by the timing parameter tRCD, as shown in the first picture. After the READ or WRITE command is issued, the sense amplification phase continues to drive the voltage on the bitline (4) until the voltage level of the bitline and the cell reaches VDD. In other words, the charge level of the cell is fully restored to its original value for a READ, or correctly updated to the new value for a WRITE.

For DRAM read requests, the latency for the cell to be fully restored after ACT is determined by the timing parameter tRAS. For DRAM write requests, the time taken to fully update the cell is determined by tWR. After restoration, the bitline can be precharged using the PRE command to prepare the subarray for a future access to a different row. This process disconnects the cell from the bitline by lowering the voltage on the wordline. It then resets the voltage of the bitline to VDD/2. The time to complete the precharge operation is specified by the timing parameter tRP.
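To get a feel for what these parameters mean in wall-clock time, the read timeline can be converted from clock cycles to nanoseconds. The sketch below covers the read case only; the function names and the DDR4-3200 16-16-36 values are assumptions chosen purely for illustration.

```python
def cycles_to_ns(cycles, data_rate_mts):
    """Convert memory clock cycles to nanoseconds.

    DDR transfers data twice per clock, so the clock period in ns
    is 2000 / data rate (in MT/s).
    """
    tck_ns = 2000.0 / data_rate_mts
    return cycles * tck_ns


def read_timeline(trcd, tras, trp, data_rate_mts):
    """Key intervals of a single read access, in nanoseconds."""
    return {
        "act_to_read_ns": cycles_to_ns(trcd, data_rate_mts),       # ACT -> READ allowed
        "act_to_pre_ns": cycles_to_ns(tras, data_rate_mts),        # ACT -> PRE allowed
        "pre_to_act_ns": cycles_to_ns(trp, data_rate_mts),         # PRE -> next ACT
        "row_cycle_ns": cycles_to_ns(tras + trp, data_rate_mts),   # ACT -> next ACT
    }


# Hypothetical example: tRCD = 16, tRAS = 36, tRP = 16 at 3200 MT/s.
for name, value in read_timeline(16, 36, 16, 3200).items():
    print(f"{name}: {value:.3f} ns")
```

The same cycle counts at a higher data rate give shorter absolute times, so timings should always be compared in nanoseconds rather than in cycles.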

The tRCD and tRAS timings can be significantly lower than in datasheets. How?

Conventional DRAM chips perform activation and restoration operations using a fixed latency, which is determined by the value of the timing parameters shown in the first picture. However, ways exist in which latencies for activation and restoration can be reduced by exploiting the current charge level of a cell. If a cell has a high charge level, the corresponding voltage perturbation process on the bitline during activation is faster, and consequently, the sense amplifier needs less time to reach states 3 and 4 in the second picture. "ChargeCache" is a state-of-the-art mechanism that uses this insight to safely reduce the tRCD and tRAS timing parameters for a highly-charged cell.

ChargeCache keeps track of rows that have been recently accessed, which means their cells have a high charge level, as only a short amount of time has elapsed since the cells were last restored to full charge level. Therefore, if a recently accessed row is activated again within a short time interval (e.g., 1 ms), ChargeCache uses lower tRCD and tRAS values for the row, which reduces overall DRAM access latency. A similar approach can be applied to reduce the restoration latency. In a conventional DRAM chip, each ACT command triggers a restoration operation that fully restores the charge level of the cells in the activated row. Likewise, each refresh operation fully restores the charge level of a cell at a fixed time interval (every 64 ms in DDRx DRAM).
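The idea of tracking recently accessed rows can be sketched as a small lookup table. This is a simplified illustration and not the actual ChargeCache design: the class name, the table capacity, and both sets of (tRCD, tRAS) values are hypothetical, and only the 1 ms window comes from the text above.

```python
from collections import OrderedDict


class ChargeCacheSketch:
    """Remembers recently restored rows and returns lower (tRCD, tRAS)
    values when such a row is activated again within the time window."""

    def __init__(self, capacity=128, window_ns=1_000_000,
                 default=(16, 36), reduced=(12, 28)):
        self.capacity = capacity    # assumed number of tracked rows
        self.window_ns = window_ns  # 1 ms, as in the example above
        self.default = default      # assumed standard (tRCD, tRAS)
        self.reduced = reduced      # assumed timings for highly charged rows
        self._recent = OrderedDict()  # (bank, row) -> time of last restore

    def timings_for_activate(self, bank, row, now_ns):
        key = (bank, row)
        last = self._recent.get(key)
        if last is not None and now_ns - last <= self.window_ns:
            return self.reduced
        # Unknown or stale row: drop any old entry and use standard timings.
        self._recent.pop(key, None)
        return self.default

    def record_restore(self, bank, row, now_ns):
        key = (bank, row)
        self._recent.pop(key, None)
        self._recent[key] = now_ns
        if len(self._recent) > self.capacity:
            self._recent.popitem(last=False)  # evict the oldest entry


cc = ChargeCacheSketch()
cc.record_restore(bank=0, row=5, now_ns=0)
print(cc.timings_for_activate(0, 5, now_ns=500_000))    # within 1 ms
print(cc.timings_for_activate(0, 5, now_ns=2_000_000))  # too late
```

A row that falls out of the table or out of the window simply falls back to the standard timings, so a miss never affects correctness, only latency.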

There is also the Restore Truncation mechanism, which restores the cell's charge level only partially, just enough to retain correct data until the next refresh of the cell. The tWR and tRAS timings are among the controls for this mechanism.

Some presets that have been published in my article use these mechanisms, so I advise you to forget about the typical formulas that you can find on the Internet.

Conclusions

Because a DRAM cell is made up of a capacitor, the cell leaks charge even when it is not accessed. In order to prevent data loss, the DRAM has to issue periodic refresh operations to all cells. A refresh operation brings the charge level of a cell back to its full value.
Modern memory chips allow you to set aggressive timings thanks to the Restore Truncation mechanism and ChargeCache.
In a sense, SDRAM chips allow you to perform the third and fourth operations in parallel. To be precise, the PRECHARGE command can be sent a certain number of clock cycles x before the last data element of the requested burst is output, without the risk of cutting the burst short (this will happen if the PRECHARGE command is sent after the READ command with an interval shorter than x).
To prevent data loss in the cells, you can increase the DRAM voltage or change the timings that are responsible for precharging and refreshing. Adjusting tRP and tRFC will have the greatest impact; tWR and tRTP can also help. I do not advise raising tWR above 12.
tRC >= tRAS + tRP. For most cases this should be the optimal formula.
tRAS = tRCD + tCL. I do not have a clear definition for this timing: it can be equal to tRCD + tCL, but it is sometimes significantly lower due to the mechanisms listed above. Also, do not forget about the margin, whose limits can only be determined experimentally, since cell characteristics differ from chip to chip.
For high frequencies, I use the formula from the first picture. tRAS = tRCD + tBL + tWR (tuned, 12 or 10). tBL for DDR4 = 4 or 2.
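As a worked example, the formulas above can be put into a few helper functions. The function names and the sample tRCD, tCL and tRP values are hypothetical; tBL = 4 and tWR = 12 are taken from the text. The results are starting points only, since the usable margin has to be found experimentally.

```python
def tras_from_cl(trcd, tcl):
    """tRAS = tRCD + tCL."""
    return trcd + tcl


def tras_high_frequency(trcd, tbl, twr):
    """tRAS = tRCD + tBL + tWR, the variant used for high frequencies."""
    return trcd + tbl + twr


def trc_minimum(tras, trp):
    """Smallest tRC that satisfies tRC >= tRAS + tRP."""
    return tras + trp


# Hypothetical kit: tRCD = 16, tCL = 16, tRP = 16.
tras_a = tras_from_cl(trcd=16, tcl=16)
tras_b = tras_high_frequency(trcd=16, tbl=4, twr=12)

print("tRAS (tRCD + tCL):", tras_a)
print("tRAS (tRCD + tBL + tWR):", tras_b)
print("minimum tRC:", trc_minimum(tras_a, trp=16))
```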