AMD Zen Features Double the Per-core Number Crunching Machinery to Predecessor

Joined
Sep 15, 2007
Messages
3,947 (0.62/day)
Location
Police/Nanny State of America
Processor OCed 5800X3D
Motherboard Asucks C6H
Cooling Air
Memory 32GB
Video Card(s) OCed 6800XT
Storage NVMees
Display(s) 32" Dull curved 1440
Case Freebie glass idk
Audio Device(s) Sennheiser
Power Supply Don't even remember
Well they could if they want.
They have a Geekbench single-thread score of 2500 at 1.8 GHz, and in a very power-restricted environment.

http://cdn.arstechnica.net/wp-content/uploads/2015/09/charts.0011.png

An i5 4440 at 3.1 GHz scores ~2900 in the same test.

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=i5+4440

And the FX8350 is around 2400 :)

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=fx+8350
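
The per-clock comparison implied by the numbers above can be worked out directly. This is a minimal sketch using only the two scores and clock speeds quoted above (the FX8350 is left out because no clock is given for it). `score_per_ghz` is an illustrative helper, not anything from Geekbench, and points per GHz is a crude measure that ignores turbo, memory and power limits.

```python
# Geekbench 3 single-thread scores and clocks as quoted in the post above.
chips = {
    "ARM chip @ 1.8 GHz": (2500, 1.8),
    "i5 4440 @ 3.1 GHz": (2900, 3.1),
}

def score_per_ghz(score, clock_ghz):
    """Benchmark points per GHz of clock (a crude per-clock measure)."""
    return score / clock_ghz

for name, (score, clock) in chips.items():
    print(f"{name}: {score_per_ghz(score, clock):.0f} points per GHz")
```

On these figures the ARM chip gets more out of each clock cycle, while the i5 still wins on absolute score thanks to its higher clock.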

They are definitely competitive, and that is for sure a desktop-class CPU. If they could push ARM this far, I'm sure others will soon follow, and there are big, heavy names there: Qualcomm, Samsung, NVIDIA ...

You're high if you think ARM can do general processing with the power that x86 has.

Go ahead and run a real app on one core of x86 and on that pitiful ARM chip. Guess what is going to happen? Synthetic benchmarks are more useless than ever.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,173 (2.78/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
Depends on the task. If most of the application's instructions are simple integer math operations on data and addresses, an ARM CPU will do pretty well, because both x86 and ARM will execute instructions like these in a single cycle. The difference shows up once you consider the more complex instructions offered by CISC instruction sets like x86. Extensions like SSEx were introduced to take work that would normally need several clock cycles and reduce it to only a handful, if not a single cycle. However, that comes at a cost: it takes more circuitry and transistors to provide the extra logic that executes these more complex instructions quickly. The result is higher manufacturing cost and higher power consumption, but on the other hand you can get significantly improved performance, depending on the application.
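
To put a rough number on the SSE point: a 128-bit SSE register holds four 32-bit floats, so one packed add does the work of four scalar adds. The sketch below only counts add instructions under that assumption. `instructions_needed` is a hypothetical helper, and the count ignores loads, stores and loop overhead, so it is an upper bound on the benefit rather than a measurement.

```python
import math

def instructions_needed(n_elements, lanes):
    """Number of add instructions needed when each one processes `lanes` values."""
    return math.ceil(n_elements / lanes)

n = 1000
scalar = instructions_needed(n, lanes=1)  # one float per add instruction
packed = instructions_needed(n, lanes=4)  # four 32-bit floats per 128-bit SSE add

print(f"scalar adds: {scalar}, packed adds: {packed}")
```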

So I won't say that ARM is crap compared to x86, because it depends on what you're doing. If you're using a browser, reading/writing email, or playing a simple game like Angry Birds, an ARM CPU is more than enough. However, if you're doing video encoding, physics processing, or really any floating-point math application, you're better off with something that can do a little more in a little less time, but that only helps you if you have the power to spare.

I just thought that a more balanced perspective on the matter was required, because neither architecture is bad; they were just designed with different things in mind, under different philosophies.
 

Titus Joseph

New Member
Joined
Oct 23, 2015
Messages
2 (0.00/day)
Location
Cochin, Kerala, India
To add to that, x86 is designed for larger platforms like servers, desktops and laptops.

ARM is mobile-platform specific as of now, so the requirements, and therefore the capacity, vary. Intel's x86 mobile processors, which are used in the ASUS ZenFone series (which I'm using), are great performers, but at the cost of higher power consumption. I had to implement tonnes of tweaks to get a steady 10 hours of continuous usage out of it, from 100% to 0%.

That in itself says a lot.
 
Joined
Jul 31, 2014
Messages
484 (0.13/day)
System Name Diablo | Baal | Mephisto
Processor Ryzen 7700 | 2x Xeon E5-2697v4 | i7-13900H
Motherboard ASRockRack B650D4U-2L2T/BCM | Supermicro X10DRH-iT | Lenovo Thinkpad P1 Gen 6
Cooling Custom loop | SC846 Chassis cooled| dual-fanned heatpipes with LM
Memory 64GiB DDR5-5600 ECC | 256GiB DDR4-3200 ECC RDIMM | 64GiB DDR5-5600
Video Card(s) RTX 3090 Ti Founder's Edition | Embedded ASPEED2400 | RTX 5000 Mobile (80W)
Storage many, many SSDs and HDDs....
Display(s) Dell U3014 + Dell U3011 | SMCI IPMI KVMoIP | 3840×2400 Samsung OLED
Case Caselabs TH10A | Supermicro SC846 | Lenovo Thinkpad P1 Gen 6
Audio Device(s) Creative SoundBlaster X4 | None | On-board + Moondriver2 Ti + Bluetooth
Power Supply Corsair AX1600 | 1200W PSU (Delta) | Lenovo 230W or 300W
Mouse Logitech G604
Keyboard 1985 IBM Model F 122-key, Lenovo integrated
Software FAAAR too much to list
Not exactly... Comparing ARM to x86 is hard because of how easy it is to build a custom ARM SoC versus an x86 SoC, so really you need to compare them within each segment they're in.

Server:

In serverland, you're mostly limited by whether or not you can scale horizontally (as far as CPUs go). If you can, ARM chips are competitive with x86 in terms of total cost, because what you give up in per-CPU performance you regain by being able to fit more machines in the same space (a high-end ARM SoC fits in about the same sort of space a RasPi takes, while Atom, for all its low-power-ness, still needs about twice as much space).

The result is that some companies, like Linode, are using ARM chips for their low-power use cases, and others, like Google and Facebook, are considering ARM alongside POWER.

Desktop:

ARM hasn't had a desktop chip since the original Acorn RISC machines. Still, as far as basic browser/productivity/media use goes, something like the Shield microconsole or a RasPi 2 running a better OS than Android does OK. Not amazing (mostly because of limited RAM), but OK.

Laptop:

ARM Chromebooks do about as well as low-end x86 Chromebooks, but very few use high-end SoCs like Tegra K1/X1, so they lose out to their x86 brethren. Linux users also have fun with it, but in the end, nothing really beats a proper high-end Windows laptop (like an XPS13, for example) running Linux.

Phones & Tablets:

In the phone and tablet arena, x86 has mostly been hampered by overall platform power and complexity, not performance. If you look at the landscape, one company stands above all the others: Qualcomm. Qualcomm has that position because of their ability to pack the CPU, GPU, DSPs, ISP, modem, wifi, Bluetooth and GPS all into the same die, then strap the RAM on top of the same package. This makes the board design really, really simple, since all you have to do is wire up the sensors (camera, accelerometer, gyro, barometer, temperature), radio frontends, PMIC, screen and digitizer, and you're done. On x86, as of right now, you have to put on the CPU/RAM package, then wire up the various wireless interfaces (cellular, GPS, and wifi/BT) to the SoC. With three extra, fairly hefty chips to put on the board, things get expensive, and idle battery usage goes up a fair chunk. This is why the ZenFone 2 is the only phone using x86, and it shows compared to Qcomm. Other companies (MediaTek) also have more integrated stuff, similar to Qcomm.

As for the CISC/RISC argument... that ship sailed a loooong time ago, around when IBM (POWER), Intel (Pentium Pro/II) and ARM (ARM8/ARMv4 I think, because of their use of speculative processing; arguably even ARM7T/ARMv4T because of its pipelining) went out-of-order/speculative/superscalar, because they all decode instructions into micro-ops rather than running them directly. The myth kinda lived on for a while though, because Intel was, well, kinda crap at platform design compared to IBM: most of their CPUs, up until the P4 and Nehalem jumps really, were memory-starved by the FSB and had higher power consumption than the POWER cores, which are RISC cores. That all changed when Apple replaced the hot-running PowerPC 970/G5 (POWER4) with cool-running Core chips, so much so that nobody cares about RISC vs CISC anymore outside of people writing raw assembly, and even then only for ease-of-use arguments, not performance.

EDIT: On the subject of pipelining: longer pipelines (i.e. more stages; iirc SNB-SKL is 11-14 depending on instruction, Bulldozer is 31) let you achieve higher clock speeds, but the longer the pipeline, the worse the penalty of a pipeline stall (from a branch mispredict causing a flush, for example). The problem with a long pipeline is that the stall penalty goes up somewhat exponentially, as Intel learnt with NetBurst and AMD with Bulldozer (though pipeline length isn't as dominant an issue there as it was with NetBurst...).
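
A back-of-the-envelope version of the pipeline trade-off, as a sketch only. It assumes every mispredicted branch costs a refill roughly equal to the pipeline depth, and the branch fraction (20%) and mispredict rate (5%) are made-up illustrative values, not measurements of any real chip. The depths 14 and 31 are the ones mentioned above. This first-order model is linear in depth, so it only shows the direction of the effect.

```python
def effective_cpi(pipeline_depth, branch_fraction=0.2, mispredict_rate=0.05, base_cpi=1.0):
    """Average cycles per instruction when each mispredict costs a full pipeline refill."""
    flush_penalty = pipeline_depth  # assumption: penalty is roughly the pipeline depth
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty

for depth in (14, 31):
    print(f"{depth}-stage pipeline: {effective_cpi(depth):.2f} cycles per instruction")
```

The deeper pipeline has to make up for its higher average cycles per instruction with clock speed, which is the bet described above.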
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,173 (2.78/day)
Location
Concord, NH, USA
You misunderstand what I'm saying. RISC CPUs tend to want instructions that all execute quickly, and to use those instructions to do everything, which means you won't have instructions that are bulky or slow relative to the others. The expectation is that the faster instructions will be combined to do the same kind of operation. The catch is that things like SSE exist to accelerate exactly these kinds of workloads, where a more heavyweight instruction that does multiple things at once can save clock cycles.

What I can tell you is that while RISC designs can have complex instruction sets, that's less true for ARM-based CPUs than it may be for others like SPARC. ARM was intended to be low power and cheap, not fast and performant. That's why you don't often see ARM CPUs in clusters as you do SPARCs. So while your argument about RISC in general is true, it doesn't hold for ARM as a RISC. Not all RISCs are created equal, and I can tell you that a modern ARM core is much simpler than a modern x86 core.

Edit: Also, a side note: load/store architectures tend to require more instructions to do the same thing. So, performance aside, this results in larger applications, since all memory operations must be done explicitly, separately from the non-memory instructions, whereas even on the 68K you could operate on variables in memory without explicitly loading them into CPU registers first. That is more indicative of the RISC/CISC debate than of ARM vs. x86, which is quite a bit different.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,263 (4.42/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
CISC is much faster at specialized workloads (like decryption, encoding, and decoding) than RISC; however, RISC can reasonably compensate for that shortcoming through cheap parallelism, though doing so greatly increases the complexity of compilers and code. At the end of the day, RISC lands somewhere between CISC and the heavily parallel workloads modern GPUs are champions of, which raises the question: why not make ARM add-in cards for x86 machines? Let the CISC x86 handle specialized workloads, hand off simple but heavy logic workloads to ARM, and hand off limited-logic workloads to GPUs.
 
Joined
Jun 15, 2015
Messages
24 (0.01/day)
System Name ?
Processor AMD FX-9370
Motherboard ASUS Crosshair V Formula Z
Cooling Phanteks PH-TC12DX Dual 120mm PWM CPU Cooler
Memory 16GB (4x4GB) AMD 2400 MHz DDR3
Video Card(s) AMD R9 290
Storage 250 GB Crucial SSD BX100, 3 TB Toshiba HDD 7200rpm, 4 TB WD HDD 5900rpm
Case Fractal Design Define R4 Black No Window
Audio Device(s) (USB) Mayflower Objective O2
Power Supply Fractal Design Newton R3 1000W Plantium Efficiency
Mouse Steelseries Sensei RAW
Keyboard Corsair K30
Software Windows 10, Ubuntu 15.10
So basically, take the big.LITTLE design of ARM, but instead, bigCISC.LITTLEarm + GPU?

2-4 low power ARM
4-8 high power x86
???? GPU clusters

So I'm assuming that it would run similar to how I have my OS on an SSD and games on a large HDD. The ARM processors would handle the basic system functions (filesystem, networking, sound?), and the x86 processor would handle processes that need grunt and have no GPU acceleration, and would handle the draw calls? Or would that be the ARM?

I'm way in over my head. I feel like I was on the right track and then flew off it.
 
Joined
Mar 23, 2005
Messages
4,100 (0.57/day)
Location
Ancient Greece, Acropolis (Time Lord)
System Name RiseZEN Gaming PC
Processor AMD Ryzen 7 5800X @ Auto
Motherboard Asus ROG Strix X570-E Gaming ATX Motherboard
Cooling Corsair H115i Elite Capellix AIO, 280mm Radiator, Dual RGB 140mm ML Series PWM Fans
Memory G.Skill TridentZ 64GB (4 x 16GB) DDR4 3200
Video Card(s) ASUS DUAL RX 6700 XT DUAL-RX6700XT-12G
Storage Corsair Force MP500 480GB M.2 & MP510 480GB M.2 - 2 x WD_BLACK 1TB SN850X NVMe 1TB
Display(s) ASUS ROG Strix 34” XG349C 144Hz 1440p + Asus ROG 27" MG278Q 144Hz WQHD 1440p
Case Corsair Obsidian Series 450D Gaming Case
Audio Device(s) SteelSeries 5Hv2 w/ Sound Blaster Z SE
Power Supply Corsair RM750x Power Supply
Mouse Razer Death-Adder + Viper 8K HZ Ambidextrous Gaming Mouse - Ergonomic Left Hand Edition
Keyboard Logitech G910 Orion Spectrum RGB Gaming Keyboard
Software Windows 11 Pro - 64-Bit Edition
Benchmark Scores I'm the Doctor, Doctor Who. The Definition of Gaming is PC Gaming...
I am not against AMD charging a bit more for the ZEN desktop CPUs if and when they match and outperform the competition. But looking at AMD's past pricing, the only processors they actually overcharged for were the Quad FX-compatible CPUs. They were a complete ripoff.

I trust AMD will price ZEN in a fair manner.
 
Joined
Jul 31, 2014
Messages
484 (0.13/day)
The difference between CISC and RISC is largely academic in these days of superscalar processing with internal microcode and big, expensive instruction decode stages: each instruction is turned into internal micro-ops rather than run directly, which makes the actual execution style identical for both types. For high-performance chips, at least.

As for straight performance, it's largely gated by power and die size these days, and in those arenas, Intel holds the crown across the board. In fact, current x86 performance is so good that in terms of performance/W, Intel is ahead of all the various ARM cores, but at the cost of having higher total power consumption and heat. It is for that reason that you don't see ARM in most scale-out server deployments.

As a result, there's simply no point in having a heterogeneous architecture that mixes x86 and ARM, and AMD knows that, as do Intel, Samsung, Qualcomm etc.
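
The micro-op idea mentioned above can be illustrated with a toy decoder. This is only a sketch: the instruction tuples, the bracket syntax for memory operands, and the `tmp_src`/`tmp_dst` temporaries are all invented for the example, and real decoders are vastly more involved. It just shows how an instruction with a memory operand can be broken into simple load, compute and store steps internally.

```python
def decode(instr):
    """Split one architectural instruction into a list of simple micro-ops."""
    op, dst, src = instr
    uops = []
    # A bracketed operand such as "[x]" denotes a memory location in this toy syntax.
    if src.startswith("["):
        uops.append(("load", "tmp_src", src[1:-1]))
        src = "tmp_src"
    if dst.startswith("["):
        addr = dst[1:-1]
        uops.append(("load", "tmp_dst", addr))
        uops.append((op, "tmp_dst", src))
        uops.append(("store", addr, "tmp_dst"))
    else:
        uops.append((op, dst, src))
    return uops

print(decode(("add", "r1", "r2")))   # register-register: stays a single micro-op
print(decode(("add", "[x]", "r1")))  # memory destination: load, add, store
```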
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,173 (2.78/day)
Location
Concord, NH, USA
Actually, one of the differences that still exists to this day is that most RISC CPUs don't combine regular instructions with memory operations (load/store). For example, in x86 you may have an instruction that takes two operands (say, ADD), where one of the operands can be either a register or a memory location. This basically means the instruction can read its input from, or write its result to, memory directly. RISC CPUs aren't like this; you have to explicitly say 'load this memory location into register n' or 'store register n into this memory location'. There are advantages to both methods. Separate load/store instructions allow for a simpler pipeline, because no instruction ever has to perform a memory operation as part of some other operation. RISC CPUs also tend to have a lot of general-purpose registers, which makes this even more feasible. Depending on the application, keeping variables in registers until a full computation is done means less pressure on the memory controller and cache, as well as faster turnaround, since CPU registers are the fastest storage in a CPU.

I wanted to point this out because, while a lot of the differences between CISC and RISC CPUs have evaporated, some things, like the load/store bit, still tend to hold true. IIRC, RISC CPUs also tend to be much more rigid in the number of operands any given instruction can take, whereas some x86 instructions are essentially multi-arity. These small things tend to result in a smaller pipeline on ARM and other RISC CPUs compared to their x86 counterparts.
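
The load/store distinction described above can be sketched with a toy interpreter. The instruction names (`add_mem`, `load`, `store`, `add`) and the two tiny programs are invented for illustration and are not real x86 or ARM encodings. Both programs add register `r1` to the value at memory location `x`: the register-memory style needs one instruction, the load/store style needs three.

```python
def run(program, memory, registers):
    """Execute a list of toy instructions and return how many were executed."""
    for op, a, b in program:
        if op == "load":        # register a <- memory[b]
            registers[a] = memory[b]
        elif op == "store":     # memory[a] <- register b
            memory[a] = registers[b]
        elif op == "add":       # register a += register b
            registers[a] += registers[b]
        elif op == "add_mem":   # memory[a] += register b (register-memory form)
            memory[a] += registers[b]
        else:
            raise ValueError(f"unknown op: {op}")
    return len(program)

# Register-memory style: one instruction updates memory directly.
cisc_style = [("add_mem", "x", "r1")]
# Load/store style: memory is only touched by explicit load and store.
risc_style = [("load", "r2", "x"), ("add", "r2", "r1"), ("store", "x", "r2")]

for name, prog in (("register-memory", cisc_style), ("load/store", risc_style)):
    mem, regs = {"x": 10}, {"r1": 5, "r2": 0}
    count = run(prog, mem, regs)
    print(f"{name}: {count} instruction(s), x = {mem['x']}")
```

The result in memory is the same either way; what differs is the instruction count, and that the load/store version leaves the value in `r2` for further work without another trip to memory.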
 
Joined
Jul 31, 2014
Messages
484 (0.13/day)
As I understand it, those are more architectural preferences of the designers than actual traits of RISC vs CISC, which is why I tend to ignore them in favour of directly comparing the number of instructions available on each and the internal implementations. I mean, sure, a tiny ARM Cortex-M4 is essentially a direct implementation of the ISA, but a high-performance POWER8 design is much closer to a modern OoO CISC design like x86, and it shows when you compare dies and power consumption to performance...
 