Cool, I see PhotoWorxx is actually memory-bandwidth bound. It can't all be L3 cache, as I got 90 MB, and they're only slightly slower too.
Also, if you turn SMT off, can you leave NUMA on? It would serve you better in situations that need quick RAM access.
Can you post a Cinebench R15 result?
Regardless of SMT status I have 8 NUMA nodes, forcing me to run Windows Server.
I don't think the Cinebench score is anything special. I optimized for memory bandwidth, not compute. This scores less than a single 32C workstation Threadripper, right? I estimated that bumping from 32 to 64 total cores would only give us about 20% more performance in our CFD app, but would double the total cost of the server, bumping each CPU from $1k to $4k. I have yet to benchmark our target app, so we'll see how real-world scaling works.
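To put that trade-off in numbers, here is a rough sketch using only the two estimates above (about 20% more CFD performance for about 2x the total server cost). The function name is made up for illustration, and both inputs are estimates, not benchmark results.

```python
def perf_per_dollar_ratio(perf_gain: float, cost_ratio: float) -> float:
    """Performance per dollar of the upgrade relative to the baseline server."""
    return perf_gain / cost_ratio

# Estimates from the post, not measurements:
perf_gain = 1.20   # ~20% more CFD performance going from 32 to 64 total cores
cost_ratio = 2.0   # total server cost roughly doubles

ratio = perf_per_dollar_ratio(perf_gain, cost_ratio)
print(f"Relative performance per dollar: {ratio:.2f}")
```

So on these estimates the 64-core build would deliver roughly 60% of the performance per dollar of the 32-core build.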
When Rome gets released, I'll check out their 32-core dual-socket offerings to re-evaluate 64 total cores. I've heard they are doubling cache, doubling cores, and will be pushing near 4 GHz all-core turbo on 7nm.
(32/16)*(4.0/2.7) would be about 3x our current compute power. Useful for ray tracing, which is actually a minor, occasional use case.
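Spelling out that back-of-envelope estimate as a quick sketch: it assumes compute scales linearly with core count and clock speed, with no IPC change and no memory-bandwidth limit, which is optimistic. The function name is hypothetical, and the 4.0 GHz figure is the rumored clock, not a confirmed spec.

```python
def compute_scaling(cores_new: int, cores_old: int,
                    ghz_new: float, ghz_old: float) -> float:
    """Naive compute ratio: linear in cores per CPU and in clock speed."""
    return (cores_new / cores_old) * (ghz_new / ghz_old)

# 16 -> 32 cores per CPU, 2.7 GHz -> rumored ~4.0 GHz all-core
scale = compute_scaling(32, 16, 4.0, 2.7)
print(f"Estimated compute scaling: {scale:.2f}x")  # 2.96x
```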
CFD time is very expensive for us, so this server should pay off quickly.