It's not like ARM doesn't come with its own share of extensions, but yeah...
I chose RISC-V as an example because:
1. While the "Advanced RISC Machine" has RISC in the acronym, the "RISC-V" CPU has the RISC acronym in its acronym. So it is "more obvious" that RISC-V is trying to be a RISC system.
2. ARM has plenty of extensions, but not as many as RISC-V, nor are they as convoluted. Western Digital probably has a whole bunch of special extensions in its RISC-V cores for hard-drive math and doesn't need to tell anybody about it. Most ARM chips use the standard core and add I/O peripherals around it (e.g., Apple's Neural Engine, Qualcomm's DSP engine, microcontrollers and their ADC or GPIO pins, etc.).
So you're right. It's just that I feel like "RISC-V" is a better example for #1 and #2. But ARM is also a perfectly fine example of how nonsensical "RISC vs CISC" debates have gotten recently.
If memory serves me right, just a few years ago people were talking the same crap about ARM, and look where we are today: the #1 on the Top500 is ARM-based, the new MacBooks are ARM-based, tons of HPC enterprise solutions are popping up everywhere, and smartphones are becoming faster than our laptops and desktops. Don't forget that this whole movement started with the RISC-V Foundation, which is only 5 years old, and they are making strides a lot faster (in both performance and adoption rate) than ARM did.
Ehhhh... really just M1 and A64FX.
The thing about CPUs is that execution matters more than ISA or philosophy (like RISC). Five years ago, there weren't any 8-wide ARM decoders with ~300-entry register files and a ~600-entry out-of-order window sitting on 128 kB of L1 cache. Today there are.
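To make that concrete, here's a minimal C sketch (illustrative only, nothing M1-specific; both functions and the names are made up for this comment) of why a deep out-of-order window is such a big deal:

```c
#include <stddef.h>

/* Pointer-chase: each load's address depends on the previous
   load's result. The dependency chain is serial, so even a
   600-entry reorder buffer can't overlap much of the work. */
long sum_chain(const long *next, size_t start, size_t n) {
    long sum = 0;
    size_t i = start;
    while (n--) {
        sum += next[i];
        i = (size_t)next[i];  /* next address waits on this load */
    }
    return sum;
}

/* Independent loads: a wide decoder plus a deep out-of-order
   window can keep hundreds of these in flight at once, hiding
   cache-miss latency. */
long sum_flat(const long *data, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}
```

The ISA looks identical in both cases; it's the size of the machinery behind the decoder that decides how fast the second loop runs.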
It's not the 'ARM' or the 'RISC' that matters. It's the freakishly huge cache, freakishly huge decoder, freakishly huge register file, and freakishly huge out-of-order window that matter. At some point the ISA has a degree of influence (ARM is probably easier to decode than x86; see the sketch below), but the ISA itself isn't really the part of the chip that's "important" for performance.
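On the decode point, here's a toy C sketch of why fixed-length encodings make wide decode easier. The "first byte is the length" rule is invented purely for illustration (real x86 length decoding is far messier), and the function names are mine:

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed 4-byte encodings (A64-style): all 8 boundaries in a fetch
   group are known immediately, so 8 decoders can start in parallel. */
void boundaries_fixed(size_t start, size_t out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = start + 4 * (size_t)i;  /* independent: no serial step */
}

/* Variable-length encodings (x86-style): each boundary depends on
   decoding the previous instruction's length, a serial chain that
   makes very wide decode harder. */
void boundaries_variable(const uint8_t *code, size_t start, size_t out[8]) {
    size_t off = start;
    for (int i = 0; i < 8; i++) {
        out[i] = off;
        off += code[off];  /* toy rule: first byte encodes length */
    }
}
```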
And mind you: despite Apple's M1 design clearly taking the single-core and IPC crown, I'm not entirely sure I agree with all the tradeoffs yet. Apple's M1 cores are literally twice as large as other cores: AMD's Renoir fits 8x large cores + a Vega iGPU in roughly 10 billion transistors, while the M1 gets 4x large cores (plus 4x small ones) + iGPU + Neural Engine in 16 billion.

Apple's strategy is "fewer cores with more IPC" to an incredible degree, the likes of which we haven't seen before. It's a bold move. It might work, but I'm not 100% convinced it's the best idea yet.
All of this RISC vs CISC stuff is 40 years out of date and ignores the bold design decisions that actually rocketed Apple to the #1 IPC / single-core performance king. Bold because it's contrary to the general story and general understanding of computers: any other CPU manufacturer would rather split such a wide core with hyperthreads at least, or run 2x cores instead of 1x double-sized core.
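For what it's worth, the back-of-envelope arithmetic behind that instinct looks like this. It leans on Pollack's rule (single-thread performance scales roughly with the square root of core area) as a stand-in for real design data; every number here is an assumption, not a measurement of M1 or Renoir:

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    /* Spend 2x the core area one of two ways. Pollack's rule is a
       rough historical trend, used here purely as an assumption. */
    double area = 2.0;

    double big_single = sqrt(area); /* ~1.41x single-thread speed  */
    double big_total  = big_single; /* still runs only one thread  */
    double two_single = 1.0;        /* each core is baseline speed */
    double two_total  = 2.0;        /* ideal parallel throughput   */

    printf("1x double-size core: %.2fx single-thread, %.2fx throughput\n",
           big_single, big_total);
    printf("2x baseline cores:   %.2fx single-thread, %.2fx throughput\n",
           two_single, two_total);
    return 0;
}
```

Under those assumptions, 2x cores wins on throughput and the double-sized core wins on single-thread latency. Apple is betting hard on the latter.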