
Linux Performance of AMD Rome vs Intel Cascade Lake, 1 Year On

Raevenlord

News Editor
Joined
Aug 12, 2016
Messages
3,755 (1.23/day)
Location
Portugal
System Name The Ryzening
Processor AMD Ryzen 9 5900X
Motherboard MSI X570 MAG TOMAHAWK
Cooling Lian Li Galahad 360mm AIO
Memory 32 GB G.Skill Trident Z F4-3733 (4x 8 GB)
Video Card(s) Gigabyte RTX 3070 Ti
Storage Boot: Transcend MTE220S 2TB, Kintson A2000 1TB, Seagate Firewolf Pro 14 TB
Display(s) Acer Nitro VG270UP (1440p 144 Hz IPS)
Case Lian Li O11DX Dynamic White
Audio Device(s) iFi Audio Zen DAC
Power Supply Seasonic Focus+ 750 W
Mouse Cooler Master Masterkeys Lite L
Keyboard Cooler Master Masterkeys Lite L
Software Windows 10 x64
Michael Larabel over at Phoronix has posted an extremely comprehensive analysis of the performance differential between AMD's Rome-based EPYC and Intel's Cascade Lake Xeons one year after release. The battery of tests, comprising more than 116 benchmark results, pits a Xeon Platinum 8280 2P system against an EPYC 7742 2P one. Both systems were benchmarked under the Ubuntu 19.04 release, which was chosen as the "one year ago" baseline, and again under the newer Linux software stack (Ubuntu 20.10 daily + GCC 10 + Linux 5.8).

The benchmark conclusions are interesting. For one, Intel gained more ground than AMD over the course of the year, with the Xeon platform gaining 6% performance across releases, while AMD's EPYC gained just 4% over the same period of time. Even so, AMD's system is still an average of 14% faster than the Intel platform across all tests, which speaks to AMD's silicon superiority. Check some benchmark results below, but follow the source link for the full rundown.
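As a rough back-of-the-envelope check, the figures above also imply what the gap looked like on the older software stack. This is only a sketch: it assumes the 6%, 4% and 14% figures are all multiplicative averages over the same set of tests, which the summary here does not spell out, and the variable names are made up for illustration.

```python
# Figures quoted in the post above.
intel_gain = 0.06    # Xeon improvement from the old stack to the new one
amd_gain = 0.04      # EPYC improvement from the old stack to the new one
amd_lead_now = 0.14  # EPYC advantage over the Xeon on the new stack

# Assumption: all three are ratios of comparable averages, so they
# compose multiplicatively. Undo each platform's gain to get the old ratio.
amd_lead_before = (1 + amd_lead_now) * (1 + intel_gain) / (1 + amd_gain) - 1

print(f"Implied EPYC lead on the older stack: {amd_lead_before:.1%}")
```

Under those assumptions the lead a year earlier works out to roughly 16%, so the gap narrowed by only about two percentage points.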



 
Joined
Mar 21, 2016
Messages
2,508 (0.79/day)
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.
 
Joined
Jun 19, 2010
Messages
409 (0.08/day)
Location
Germany
Processor Ryzen 5600X
Motherboard MSI A520
Cooling Thermalright ARO-M14 orange
Memory 2x 8GB 3200
Video Card(s) RTX 3050 (ROG Strix Bios)
Storage SATA SSD
Display(s) UltraHD TV
Case Sharkoon AM5 Window red
Audio Device(s) Headset
Power Supply beQuiet 400W
Mouse Mountain Makalu 67
Keyboard MS Sidewinder X4
Software Windows, Vivaldi, Thunderbird, LibreOffice, Games, etc.
When comparing Intel to AMD over the years, in a scenario where the Intel chip gets good code while the AMD chip gets inferior code, yeah, the gap is bigger.
Intel's marketing and Intel-friendly software/hardware companies try to fool people who don't keep their glasses as clean as they should.
Phoronix's Michael Larabel does a great job every time, keeping his benchmarks as realistic as possible.
 
Joined
Mar 18, 2008
Messages
5,717 (0.93/day)
System Name Virtual Reality / Bioinformatics
Processor Undead CPU
Motherboard Undead TUF X99
Cooling Noctua NH-D15
Memory GSkill 128GB DDR4-3000
Video Card(s) EVGA RTX 3090 FTW3 Ultra
Storage Samsung 960 Pro 1TB + 860 EVO 2TB + WD Black 5TB
Display(s) 32'' 4K Dell
Case Fractal Design R5
Audio Device(s) BOSE 2.0
Power Supply Seasonic 850watt
Mouse Logitech Master MX
Keyboard Corsair K70 Cherry MX Blue
VR HMD HTC Vive + Oculus Quest 2
Software Windows 10 P
For programs that can leverage AVX-512, Intel chips still reign supreme.
 
Joined
Feb 21, 2006
Messages
2,240 (0.33/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Prime X570-Pro BIOS 5013 AM4 AGESA V2 PI 1.2.0.Cc.
Cooling Corsair H150i Pro
Memory 32GB GSkill Trident RGB DDR4-3200 14-14-14-34-1T (B-Die)
Video Card(s) XFX Radeon RX 7900 XTX Magnetic Air (24.12.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 20TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c 5800X3D https://valid.x86.fr/b7d
For programs that can leverage AVX-512, Intel chips still reign supreme.

Like the 10 pieces of software that actually use AVX-512, sure :)
 
Joined
Oct 2, 2015
Messages
3,152 (0.94/day)
Location
Argentina
System Name Ciel / Akane
Processor AMD Ryzen R5 5600X / Intel Core i3 12100F
Motherboard Asus Tuf Gaming B550 Plus / Biostar H610MHP
Cooling ID-Cooling 224-XT Basic / Stock
Memory 2x 16GB Kingston Fury 3600MHz / 2x 8GB Patriot 3200MHz
Video Card(s) Gainward Ghost RTX 3060 Ti / Dell GTX 1660 SUPER
Storage NVMe Kingston KC3000 2TB + NVMe Toshiba KBG40ZNT256G + HDD WD 4TB / NVMe WD Blue SN550 512GB
Display(s) AOC Q27G3XMN / Samsung S22F350
Case Cougar MX410 Mesh-G / Generic
Audio Device(s) Kingston HyperX Cloud Stinger Core 7.1 Wireless PC
Power Supply Aerocool KCAS-500W / Gigabyte P450B
Mouse EVGA X15 / Logitech G203
Keyboard VSG Alnilam / Dell
Software Windows 11
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.
 
Joined
Mar 23, 2016
Messages
4,844 (1.52/day)
Processor Core i7-13700
Motherboard MSI Z790 Gaming Plus WiFi
Cooling Cooler Master RGB something
Memory Corsair DDR5-6000 small OC to 6200
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500GB,,WD850N 2TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse Logitech G502 Hero
Keyboard Logitech G G413 Silver
Software Windows 11 Professional v23H2
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.
Yeah, Intel is all over the place with their product segmentation strategy across all the CPU categories.
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
8,178 (2.36/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃X570 Impact
Cooling NH-U12A + T30┃AXP120-x67
Memory 64GB 6400CL32┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Case Caselabs S3┃Lazer3D HT5
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.

They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.

Take a look at the horrendously fragmented list of products supporting scattered bits and pieces of AVX-512 and you'll see why it's not even remotely worth AMD's time right now.
 
Joined
Aug 20, 2007
Messages
21,530 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.

Michael Larabel deals in Linux. There is no compiler advantage there. Heck, I don't even think anyone there USES ICC... I know you can't even build the kernel with it, for starters.
 
Joined
Oct 2, 2015
Messages
3,152 (0.94/day)
Location
Argentina
System Name Ciel / Akane
Processor AMD Ryzen R5 5600X / Intel Core i3 12100F
Motherboard Asus Tuf Gaming B550 Plus / Biostar H610MHP
Cooling ID-Cooling 224-XT Basic / Stock
Memory 2x 16GB Kingston Fury 3600MHz / 2x 8GB Patriot 3200MHz
Video Card(s) Gainward Ghost RTX 3060 Ti / Dell GTX 1660 SUPER
Storage NVMe Kingston KC3000 2TB + NVMe Toshiba KBG40ZNT256G + HDD WD 4TB / NVMe WD Blue SN550 512GB
Display(s) AOC Q27G3XMN / Samsung S22F350
Case Cougar MX410 Mesh-G / Generic
Audio Device(s) Kingston HyperX Cloud Stinger Core 7.1 Wireless PC
Power Supply Aerocool KCAS-500W / Gigabyte P450B
Mouse EVGA X15 / Logitech G203
Keyboard VSG Alnilam / Dell
Software Windows 11
They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.

Take a look at the horrendously fragmented list of products supporting scattered bits and pieces of AVX-512 and you'll see why it's not even remotely worth AMD's time right now.
Sales numbers can and will change that. They are still in 14nm hell, and it seems like they will be for at least another 6 months. Stupid decisions like these are costing them their credibility.
 
Joined
Mar 23, 2016
Messages
4,844 (1.52/day)
Processor Core i7-13700
Motherboard MSI Z790 Gaming Plus WiFi
Cooling Cooler Master RGB something
Memory Corsair DDR5-6000 small OC to 6200
Video Card(s) XFX Speedster SWFT309 AMD Radeon RX 6700 XT CORE Gaming
Storage 970 EVO NVMe M.2 500GB,,WD850N 2TB
Display(s) Samsung 28” 4K monitor
Case Phantek Eclipse P400S
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse Logitech G502 Hero
Keyboard Logitech G G413 Silver
Software Windows 11 Professional v23H2
Have any of the Celerons picked up HyperThreading, or are they still limited to dual-cores without HT? With Comet Lake they should be two cores, four threads.
 
Joined
Jan 2, 2018
Messages
289 (0.11/day)
The C++ application I am programming takes 35 seconds to compile using 8 threads on a 9900K (stock).
And it takes 18 seconds to compile using 8 threads on a 3700X (stock).

Now that's what I call a productive CPU.



And it takes 25 minutes to compile using 2 threads on an Allwinner A20 ARM CPU lol
 
Joined
Feb 3, 2017
Messages
3,810 (1.33/day)
Processor Ryzen 7800X3D
Motherboard ROG STRIX B650E-F GAMING WIFI
Memory 2x16GB G.Skill Flare X5 DDR5-6000 CL36 (F5-6000J3636F16GX2-FX5)
Video Card(s) INNO3D GeForce RTX™ 4070 Ti SUPER TWIN X2
Storage 2TB Samsung 980 PRO, 4TB WD Black SN850X
Display(s) 42" LG C2 OLED, 27" ASUS PG279Q
Case Thermaltake Core P5
Power Supply Fractal Design Ion+ Platinum 760W
Mouse Corsair Dark Core RGB Pro SE
Keyboard Corsair K100 RGB
VR HMD HTC Vive Cosmos
For these over 100 tests run, the AMD EPYC 7742 2P on the latest Linux software packages yielded 14% better performance over Intel's top-end non-AP Xeon Platinum 8280 dual socket server.
What is kind of weird is the only 14% difference in geomean of test results. I guess there are just too many tests that do not rely on many threads. The systems are 128 vs 56 cores, after all.
 
Joined
Jan 6, 2013
Messages
350 (0.08/day)
So, 128 cores AMD vs 56 cores Intel and AMD wins by 14%????
LE: Now I see it. The tests are a mix of ST and lightly/heavily MT scenarios. In any case, with well-threaded software you'll see a bigger difference, but I guess given these are the current workloads in the server space, Intel is not that far off.
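To make that concrete, here is a small sketch of how a geometric mean over a mixed suite can shrink a big core-count advantage. The per-test ratios below are invented purely for illustration (they are not Phoronix's numbers), and the split between single-threaded and multi-threaded tests is an assumption.

```python
from statistics import geometric_mean

# Hypothetical AMD/Intel performance ratios per test (>1 means AMD is faster).
# These values are made up for illustration, not taken from the article.
single_threaded = [0.9] * 6   # tests that cannot use the extra cores
lightly_threaded = [1.1] * 2  # tests that scale a little
heavily_threaded = [2.2] * 2  # tests that scale with core count

ratios = single_threaded + lightly_threaded + heavily_threaded
overall = geometric_mean(ratios)

print(f"Best single test ratio: {max(ratios):.2f}x")
print(f"Geomean across all {len(ratios)} tests: {overall:.2f}x")
```

Even with a couple of tests where the higher-core-count system is more than twice as fast, the overall geomean lands much closer to 1x, because most of the suite in this toy example does not scale with threads.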
 
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.
This is only a myth.
Pretty much all software today is compiled with either GCC, LLVM or MSVC, none of which is biased.
Of all the compiler optimizations that GCC and LLVM offer, most are generic. There are a few exceptions, like if you target Zen 2 vs. Skylake, but those are minimal and the majority of optimizations are all the same.

We can't optimize for the underlying microarchitectures, as the CPUs share a common ISA. The CPUs from Intel and AMD also behave very similarly, so in order to optimize significantly for either one, we need some significant ISA differences. As of right now, Skylake and Zen 2 are very comparable in ISA features (while Skylake-X and Ice Lake have some new features like AVX-512 and a few other instructions). So when the ISA and general behavior are the same, the possibility of targeted optimizations to favor one of them is pretty much non-existent. So whenever you hear people claim games are "Skylake optimized" etc., that's 100% BS; they have no idea what they're talking about.

They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.
You are clearly way off base here.
The core functionality of AVX-512 is known as AVX-512F, the others are optional extensions.
The various "AI" features are marketed as AVX-512 because they use the AVX-512 vector units, unlike other single instructions running through the integer units.
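If you want to see which of these pieces a given machine actually exposes, a minimal sketch on Linux is to look at the `flags` line in `/proc/cpuinfo`, where the kernel reports the foundation as `avx512f` and the extensions under names like `avx512dq` or `avx512_vnni`. The helper names below are my own, and the sample flag string is a made-up subset, not the output of any particular CPU.

```python
def avx512_features(flags_line):
    """Return the sorted AVX-512 related flags found in a cpuinfo-style flags string."""
    return sorted(flag for flag in flags_line.split() if flag.startswith("avx512"))


def has_avx512_foundation(flags_line):
    """True if the core AVX-512F subset is reported."""
    return "avx512f" in flags_line.split()


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the first 'flags' line from /proc/cpuinfo (Linux on x86 only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    # Hypothetical flag subset, for illustration only.
    sample = "fpu sse2 avx avx2 avx512f avx512dq avx512_vnni"
    print(avx512_features(sample))
    try:
        flags = read_cpu_flags()
        print("AVX-512F on this machine:", has_avx512_foundation(flags))
    except FileNotFoundError:
        print("No /proc/cpuinfo here, so nothing to check.")
```

On a CPU without AVX-512 the feature list simply comes back empty, which is the fragmentation problem in a nutshell: software has to check each subset it wants before using it.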

As an additional note:
I'm not a fan of application-specific instructions. Those never get widespread use and quickly become obsolete, and software relying on them is no longer forward-compatible.
 
Joined
Jul 19, 2017
Messages
75 (0.03/day)
This is only a myth.
...
So whenever you hear people claim games are "Skylake optimized" etc., that's 100% BS, they have no idea what they're talking about.
Isn't that the usual thing, though? People (myself included) carry around plenty of old and irrelevant information/data, mainly because it's almost impossible to keep up to date with it all.
 
Joined
Feb 19, 2009
Messages
1,162 (0.20/day)
Location
I live in Norway
Processor R9 5800x3d | R7 3900X | 4800H | 2x Xeon gold 6142
Motherboard Asrock X570M | AB350M Pro 4 | Asus Tuf A15
Cooling Air | Air | duh laptop
Memory 64gb G.skill SniperX @3600 CL16 | 128gb | 32GB | 192gb
Video Card(s) RTX 4080 |Quadro P5000 | RTX2060M
Storage Many drives
Display(s) AW3423dwf.
Case Jonsbo D41
Power Supply Corsair RM850x
Mouse g502 Lightspeed
Keyboard G913 tkl
Software win11, proxmox
So, 128 cores AMD vs 56 cores Intel and AMD wins by 14%????
LE: Now I see it. The tests are a mix of ST and lightly/heavily MT scenarios. In any case, with well-threaded software you'll see a bigger difference, but I guess given these are the current workloads in the server space, Intel is not that far off.

In datacenter loads, MT mostly rules, because you don't run just one application on a server.
You run virtualized workloads, Docker, and so on.
 
Joined
Oct 31, 2013
Messages
187 (0.05/day)
For modern supercomputers and AI, don't you just use a GPU for highly parallelized stuff like AVX-512?

AMD had the HSA stuff, but it never got adopted with the APUs.
 
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
For modern supercomputers and AI, don't you just use a GPU for highly parallelized stuff like AVX-512?

AMD had the HSA stuff, but it never got adopted with the APUs.
You raise a very valid question which many in here might be wondering, and there is an explanation.
AVX, multithreading and GPU acceleration are all different types of parallelism, but they work on different scopes.
  • AVX works mixed in with other instructions, and has a negligible overhead cost. AVX is primarily parallelization on the data level, not the logic level, which means repeated logic can be eliminated. One AVX operation costs the same as a single FP operation, so with AVX-512 you can do 16 32-bit floats at the same cost as a single float. And the only "cost" is the normal transfer between CPU registers. So this is parallelization on the finest level, typically a few lines of code or inside a loop.
  • Multithreading is on a coarser level than AVX. When using multiple threads, there are much higher synchronization costs, ranging from sending simple signals to sending larger pieces of data. Data hazards can also very quickly lead to stalls and inefficiency, so the proper way to scale with threads is to divide the workload into independent work chunks given to each worker thread. Multiple threads also have to deal with the OS scheduler, which can cause latencies of several ms. Work chunks for threads generally range from ms to seconds, while AVX works in the nanosecond range.
  • GPU acceleration has even larger synchronization costs than multithreading, but the GPU also has more computational power, so if the balance is right, GPU acceleration makes sense. The GPU is very good at computational density, while current GPUs still rely on the CPU to control the workflow on a higher level.
It's worth mentioning that many productive applications use two or all three types of parallelization, as they complement each other.
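A minimal sketch of the first two levels, with helper names I made up for illustration: the lane count is just register width divided by element width, and the threaded part shows the "independent work chunks" structure described above. Note that in standard CPython the GIL keeps these threads from running CPU-bound work in parallel, so this only illustrates the structure, not the speedup you would get in C or C++.

```python
from concurrent.futures import ThreadPoolExecutor


def simd_lanes(register_bits, element_bits):
    """How many elements fit in one vector register, e.g. 512 / 32 = 16 floats."""
    return register_bits // element_bits


def split_into_chunks(data, n_chunks):
    """Split data into at most n_chunks independent, contiguous pieces."""
    if not data:
        return []
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_sum_of_squares(chunk):
    """Fine-grained level: a tight loop of the kind a compiler can vectorize."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Coarse-grained level: one independent chunk per worker, combined at the end."""
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(chunk_sum_of_squares, chunks))


if __name__ == "__main__":
    print("32-bit lanes in a 512-bit register:", simd_lanes(512, 32))
    print("32-bit lanes in a 256-bit register:", simd_lanes(256, 32))
    print("Sum of squares:", parallel_sum_of_squares(list(range(1000))))
```

The only synchronization point is the final sum over the per-chunk results, which is why the chunks need to be independent and reasonably large compared to the cost of handing them out.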

But when it comes to "AI" for supercomputers, this will soon be accelerated by ASICs. I see no reason why general purpose CPUs should include such features.
 
Joined
Feb 3, 2017
Messages
3,810 (1.33/day)
Processor Ryzen 7800X3D
Motherboard ROG STRIX B650E-F GAMING WIFI
Memory 2x16GB G.Skill Flare X5 DDR5-6000 CL36 (F5-6000J3636F16GX2-FX5)
Video Card(s) INNO3D GeForce RTX™ 4070 Ti SUPER TWIN X2
Storage 2TB Samsung 980 PRO, 4TB WD Black SN850X
Display(s) 42" LG C2 OLED, 27" ASUS PG279Q
Case Thermaltake Core P5
Power Supply Fractal Design Ion+ Platinum 760W
Mouse Corsair Dark Core RGB Pro SE
Keyboard Corsair K100 RGB
VR HMD HTC Vive Cosmos
I know of one person that doesn't like AVX-512 very much
He makes two points:
- AVX-512 support is fragmented. Which it is, and he sounds like he would be OK with this part if all (or at least all Intel) CPUs had it.
- He dislikes FP in general. This may or may not be a reasonable stance.
 
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I know of one person that doesn't like AVX-512 very much

We already have a discussion about that, you are welcome to join it here: https://www.techpowerup.com/forums/...mmick-to-invent-and-win-at-benchmarks.269770/

- He dislikes FP in general. This may or may not be a reasonable stance.
FP is used a lot, in video, rendering, photo editing, games etc.
And AVX can do integers too, which is why I often refer to them as vector units, since they can do both integers and floats. Integer operations in AVX are used heavily in things like file compression.
 
Joined
Jan 8, 2017
Messages
9,499 (3.27/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.

I hope not; very wide SIMD is a fallacy in modern computer architecture design. SIMD was introduced in the days when other massively parallel compute hardware didn't exist and everyone thought frequency and transistor counts would just scale forever with ever lower power consumption. That didn't hold up: the tension created by simultaneously trying to make a CPU that has the fastest possible single-core performance while adding more and more cores and wider SIMD is too great. GPUs make CPU SIMD redundant. I can't think of a single application that couldn't be scaled up from x86 AVX to CUDA/OpenCL; in fact, the latter are way more robust anyway.
 