Wednesday, December 2nd 2020
RISC-V Processor Achieves 5 GHz Frequency at Just 1 Watt of Power
Researchers at the University of California, Berkeley started an interesting project in 2010: the goal was to develop a new RISC-style Instruction Set Architecture that is simple and efficient while being open-source and royalty-free. Born out of that research was the RISC-V ISA, the fifth iteration of Berkeley's Reduced Instruction Set Computing (RISC) designs. Over the years, the RISC-V ISA has become more common, and today many companies use it to design their processors, with new designs released every day. One of those companies is Micro Magic Inc., a provider of silicon design tools, IP, and design services. The company has developed a RISC-V processor that is rather interesting.
Apart from using the RISC-V ISA, the processor has an interesting feature: it runs at a whopping 5 GHz, a clock speed unseen on RISC-V chips before, while consuming a mere one (yes, that is 1) Watt. The chip ran at just 1.1 Volts, which means that only a modest current needs to be supplied for it to reach the 5 GHz mark. If you are wondering about performance, the numbers show that at 5 GHz the CPU produces a score of 13,000 CoreMarks. However, that is not the company's highest-performance RISC-V core. In yesterday's PR, Micro Magic stated that its top-end design can achieve 110,000 CoreMarks/Watt, so we are waiting to hear more details about it.
Source:
EE Times
65 Comments on RISC-V Processor Achieves 5 GHz Frequency at Just 1 Watt of Power
1. "Traditional" CPUs: Branch-predicted, out-of-order, pipelined, superscalar cores -- ARM, POWER9 / POWER10, RISC-V, x86.
2. SIMD -- NVidia Ampere, AMD NAVI / GCN
3. VLIW -- Apple Neural Engine, Qualcomm Hexagon, Xilinx AI-engine
4. Systolic Engines -- NVidia "Tensor Cores", Google TPUs, "FPGAs"
I expect that most computers today fall into one of these 4 categories, or maybe a combination of two or even three of them. (Intel Skylake is traditional + SIMD. NVidia Ampere is SIMD + systolic. Xilinx AI Engine is VLIW + SIMD + systolic.)
Apple M1 is just a really big traditional (branch-predicted / out-of-order / pipelined / superscalar) core. It's a non-standard configuration, but the performance benefits are pretty well known and decently studied at this point.
Baby steps, but in the right direction.
Let's say ARM or RISC-V achieves similar market share across servers/desktops. That will take at least 10-15 years, which is enough time for enterprise/professional software to start relying on its own "legacy garbage instructions", leaving any new ISA in the same place where x86 is today.
RISC was great for specialized environments in the 90s, but a "generic user" on a RISC CPU today will need dedicated fixed-function accelerators for video, audio, AI, compute, encryption, compression, graphics, and maybe more in a few years.
All that fixed-function hardware WILL NOT work for anything outside its purpose, and when the CPU has no specialized instructions either, you are forced to upgrade or ditch old hardware.
Can you run an old iOS/MacOS/Android on a new smartphone?
That's a great opportunity to sell different hardware for different needs in a world where needs continue to increase and differ every 2 years.
A world brought to you by Apple, and every other company's wet dream: one where old software does not work on newer hardware.
B-B-BUT EMULATORS!!!
Meanwhile on x86 you can run anything you like - natively - because of that "legacy garbage".
And after Intel stops supporting CSM and AMD follows a year or two later, even more of that "you can run anything" will disappear. We are already at that point, and in most cases it's not hardware but software that's the limiting factor (artificial, mind you). Just look at our current situation with Windows and Linux: wanna run old Linux software? An AppImage, a container, or a VM is your best friend (unless you wanna break something else with old dependencies). Wanna run an old game? Use DOSBox or borrow your grandpa's PC. Wanna use ancient CAD software? Make a VM and install XP on it. Etc., etc. Especially in the govt. segment it's the norm to maintain old hardware just to be able to run old software, until the point of no return.
Also, radical hardware changes don't happen that often, so, let's say, by the time RISC-VI rolls out, it'll probably be powerful enough to emulate RISC-V in software.
The "problem" of "legacy garbage instructions" is yet another myth. Modern x86 microarchitectures use their own micro-operations which are optimized for the most current relevant features, and legacy features are translated into current ones, so they are not really suffering from this legacy support, only paying a tiny overhead in the CPU front-end to translate it.
One example: modern desktop CPUs from Intel and AMD don't have separate scalar FPUs, they only have vector units. So they convert scalar floating-point instructions, MMX, and SSE into AVX operations and run everything through the AVX units. Yeah, these application-specific instructions are a mess: they require low-level code to be written for each ISA variant, and then they quickly become obsolete. They may be a necessity for low-power appliances, but a desktop computer should rather have much more generic performance, performance you can leverage for future software, codecs, etc.
"Pure" RISC designs will ultimately fail when it comes to performance. The claimed advantage was that a smaller RISC design could compete with a larger CISC design by running at a higher clock speed, with the lower die size offering lower costs. But performance today is dominated by cache misses, a single one costing ~400-500 clocks on Skylake/Zen. Even if you could push your RISC design far beyond 5 GHz, you will eventually get to a point where you can no longer offset the performance lost to extra cache misses by just boosting the clock speed. In order to close the gap, ARM needs to add comparable CISC-style features. Current ARM designs rely heavily on application-specific instructions to be "competitive", so don't trust benchmarks like Geekbench to show generic performance. RISC-V will not get anywhere close; it lacks all kinds of "complex" instructions. Let's take one example: instructions like CMOV may look insignificant, but they eliminate a lot of branching, which in turn means fewer branch mispredictions, fewer cache misses, and improved cache efficiency. We've had this one since Pentium, and many such instructions are essential to achieve the performance and responsiveness we expect from a modern desktop computer. A pure RISC design lacks such features, and no amount of clock speed can compensate for features which reduce cache misses.
Additionally, most x86 software is actually not compiled for an up-to-date ISA (while ARM software often is fairly current). A lot of the software on your computer, including your OS, drivers, and many of your applications, is compiled for x86-64 and SSE2, so roughly 17 years "behind". This is at least about to change for GCC and LLVM, for the purpose of compiling Linux with much greater performance. Hopefully MS will soon follow and unlock this "free" potential.
I used to emulate Mac OS X on x86 back in the PowerPC days. It was doggone slow. That only works in an open-source context where you can expect recompiles. AVX2 has been supported in the Windows core since Windows 7 SP1 IIRC, but apps have not followed suit.
Then, I will try to find all my cassette tapes and 100K floppy disks of 6502 assembly code and run them again!
But AVX2 is still avoided by some application programmers because 2012-era CPUs are still kinda common (the venerable i7-2600K is still popular and kicking around in many people's builds). AVX (the first one) was first supported by Sandy Bridge (i7-2600K) in 2011. Probably safe to use today, but some people do run 10+ year old computers without that feature.
Worse still: the i3 and "Pentium" and "Celeron" chips, as well as "Atoms", never supported AVX. So someone with an Atom from 2015 won't be able to run AVX or AVX2 code. "Open Source" code that tries to run on multiple platforms often only goes up to 128-bit vectors (ARM NEON is only 128 bits wide), and therefore SSE (also a 128-bit instruction set) is the ideal for compatibility. Even the Apple M1 is still 128 bits wide in its SIMD units.
Meaning you can target AVX and, if it is not present at runtime, fall back to legacy code.
Such CPUs can be found all over offices, schools etc.