# Post your Tiny Memory Benchmark results



## acraft (Oct 22, 2020)

Hi,

I recently bought an AMD Ryzen 9 3950X system with 32 GB of RAM and found that some of my applications run 2x slower on it than on an Intel i7-8750H (6-core) system with 32 GB. The 3950X's performance results are consistent with what I see on the website; for example, the AIDA64 cache and memory results are consistent. I tried to figure out why the 3950X performs 2x slower and found that some memory operations are much slower than on Intel.

I used Tiny Memory Benchmark to figure out what is going on. The results of the 6-core i7-8750H with 32 GB of RAM are much higher than the 3950X's. I sometimes wonder why the AMD 3950X is much cheaper than Intel, and it looks like they cut the power of some of the instructions to reduce the cost.

Here is how you can compile and run Tiny Memory Benchmark under Linux (it also works with WSL2 under Windows). Instructions:

```
$ sudo apt update
$ sudo apt install clang make git
$ mkdir tmb
$ cd tmb
$ git clone https://github.com/ssvb/tinymembench .
$ CC=clang CFLAGS="-no-integrated-as" make
$ ./tinymembench
```


AMD Ryzen 9 3950X - 32 GB RAM results:






Here are the Intel i7-8750H - 32 GB RAM results:





Can you please run it and post your results here?

*AMD Ryzen 9 3950X System Spec:*



















*Intel i7-8750H System Spec:*


















Thanks!


----------



## mstenholm (Oct 22, 2020)

If you provide the memory settings (speed and timings) for your two systems, I will consider it. In general, no post about performance will be answered unless you fill in your system specifications.


----------



## phill (Oct 22, 2020)

It does seem that AMD's 3000 series wasn't the best at handling memory timings etc., so the results aren't much of a shock. Intel's CPU clock speed puts them in first place for that...

You'll need to give a little more information about the cooling, RAM, and motherboards you use, because there are so many variables that the test would be pointless unless we know what your timings etc. were. Can you post the CPU-Z CPU, Memory, and Mainboard tabs along with the results for better clarity?


----------



## lowrider_05 (Oct 22, 2020)

3600XT + 32 GB @ 3753 MHz CL16 ... Read will be full speed, but all write operations should be half of a 3900X's because of the cut-off CCX.


----------



## acraft (Oct 22, 2020)

I have added all the details about the two systems.


----------



## Vya Domus (Oct 22, 2020)

This is to be expected: each CCX can read 32 bytes/clock or write 16 bytes/clock. In total, two CCXs can achieve the same write performance as any Intel CPU, just not with a single thread, because then the instructions are issued from just one CCX and are therefore limited to 16 bytes/clock. I suspect AIDA64 is multi-threaded and this benchmark isn't; you need multiple threads (scheduled on different CCXs) to get the full throughput. It wasn't really a cost-saving measure, it's just how the I/O was configured.
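
To put rough numbers on those per-CCX limits, here is a minimal sketch that converts bytes/clock into GB/s. The 1600 MHz clock is an assumed value (the thread does not say which clock applies or how fast it runs), and `peak_gbs` is just a hypothetical helper name.

```shell
#!/bin/sh
# peak_gbs BYTES_PER_CLOCK CLOCK_MHZ
# Prints the theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes).
peak_gbs() {
    awk -v b="$1" -v mhz="$2" 'BEGIN { printf "%.1f\n", b * mhz / 1000 }'
}

# Assumed clock of 1600 MHz; substitute the value for your own system.
peak_gbs 32 1600   # read path, 32 bytes/clock
peak_gbs 16 1600   # write path, 16 bytes/clock
```

With these assumed inputs the script prints 51.2 and 25.6, so a single CCX would top out at half the write bandwidth it has for reads.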


----------



## acraft (Oct 22, 2020)

That's correct. The benchmark is not multi-threaded.
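
One possible way to approximate a multi-threaded run with a single-threaded benchmark is to start several copies at once, each pinned to a core on a different CCX, and add up the reported bandwidths by hand. The sketch below assumes `taskset` (from util-linux) is installed; `run_on_cores` is a hypothetical helper, and the core IDs in the example are guesses that need to be checked against `lscpu -e` on the actual machine.

```shell
#!/bin/sh
# run_on_cores "CORE_LIST" COMMAND [ARGS...]
# Starts one copy of COMMAND per listed core, each pinned with taskset,
# writes each copy's output to core_<N>.log, then waits for all copies.
run_on_cores() {
    cores=$1
    shift
    for core in $cores; do
        taskset -c "$core" "$@" > "core_$core.log" 2>&1 &
    done
    wait
}

# Example (core IDs 0 and 4 are assumptions about the CCX layout):
# run_on_cores "0 4" ./tinymembench
```

Separate log files keep the outputs of the concurrent copies from interleaving.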


----------



## harm9963 (Oct 22, 2020)

Try this!


----------



## acraft (Oct 23, 2020)

harm9963 said:


> Try this! View attachment 172970




Can you please also post the CPU, Caches, and Mainboard details from CPU-Z?


----------



## Arctucas (Oct 23, 2020)




----------



## acraft (Oct 23, 2020)

Vya Domus said:


> This is to be expected: each CCX can read 32 bytes/clock or write 16 bytes/clock. In total, two CCXs can achieve the same write performance as any Intel CPU, just not with a single thread, because then the instructions are issued from just one CCX and are therefore limited to 16 bytes/clock. I suspect AIDA64 is multi-threaded and this benchmark isn't; you need multiple threads (scheduled on different CCXs) to get the full throughput. It wasn't really a cost-saving measure, it's just how the I/O was configured.



The application I used (not Tiny Memory Benchmark) is multi-threaded. I tried it on an Intel 6-core system and an Intel 32-core system, and it is very scalable on Intel-based systems. The application is 100% scalable: no disk operations, no waiting on synchronization, etc. The i7-8750H runs it 2x faster than the AMD 3950X. It scales as expected on Intel systems without any issues.

Here is the perf stat result, which could be helpful:




Just look at how bad the IPC is on AMD.
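
For anyone who wants to compare their own numbers: IPC is simply retired instructions divided by CPU cycles. Below is a small sketch of that arithmetic; `ipc` is a hypothetical helper, `./my_app` is a placeholder for the real application, and the counts in the example are made up rather than taken from the attachment above.

```shell
#!/bin/sh
# ipc INSTRUCTIONS CYCLES
# Prints instructions per cycle to two decimal places; fails if CYCLES is 0.
ipc() {
    awk -v i="$1" -v c="$2" 'BEGIN { if (c == 0) exit 1; printf "%.2f\n", i / c }'
}

# Collect the two counters (./my_app is a placeholder):
#   perf stat -e cycles,instructions -- ./my_app
# Then plug in the reported counts, here with made-up numbers:
ipc 3000000000 2000000000
```

With the made-up counts this prints 1.50. Comparing the value between the two machines for the same workload is what matters, not the absolute number.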


----------



## harm9963 (Oct 23, 2020)

Arctucas said:


> View attachment 172992


----------



## 111frodon (Oct 23, 2020)

You should try to compare your memory using the same timings (ideally the same sticks). The Intel system is running tighter timings, which could explain at least part of the difference...


----------



## Nuckles56 (Oct 23, 2020)

acraft said:


> The application I used (not Tiny Memory Benchmark) is multi-threaded. I tried it on an Intel 6-core system and an Intel 32-core system, and it is very scalable on Intel-based systems. The application is 100% scalable: no disk operations, no waiting on synchronization, etc. The i7-8750H runs it 2x faster than the AMD 3950X. It scales as expected on Intel systems without any issues.
> 
> Here is the perf stat result, which could be helpful:
> View attachment 172991
> ...


The IPC isn't bad on 3000 series Ryzen at all. It seems more likely that this particular benchmark uses an instruction that Ryzen doesn't have, and that's what makes all the difference here.
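
One way to check this guess on Linux is to compare the CPU feature flags the kernel reports on each machine. The sketch below assumes the usual `flags` line in `/proc/cpuinfo`; `cpu_flags` and the file names are hypothetical. It only shows which instruction set extensions each CPU advertises, not which ones the application actually uses.

```shell
#!/bin/sh
# cpu_flags [FILE]
# Prints the CPU feature flags from FILE (default /proc/cpuinfo),
# one per line, sorted and de-duplicated.
cpu_flags() {
    awk -F': ' '/^flags/ { print $2; exit }' "${1:-/proc/cpuinfo}" |
        tr -s ' ' '\n' | LC_ALL=C sort -u
}

# On each machine (file names are placeholders):
#   cpu_flags > flags_intel.txt      # on the Intel system
#   cpu_flags > flags_amd.txt        # on the AMD system
# Then list the flags that appear in only one of the two files:
#   LC_ALL=C comm -3 flags_intel.txt flags_amd.txt
```

If the two lists differ only in flags the application never touches, a missing instruction is unlikely to be the explanation.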


----------

