Monday, June 13th 2022

Researchers Use SiFive's RISC-V SoC to Build a Supercomputer

Researchers from Università di Bologna and CINECA, the largest supercomputing center in Italy, have been playing with the concept of developing a RISC-V supercomputer. The team has laid the groundwork with a first-of-its-kind implementation that demonstrates the capability of the relatively novel ISA to run high-performance computing workloads. A supercomputer is assembled from Lego-like hardware building blocks called nodes, each combining a motherboard, processor, memory, and storage, which are then wired together into a cluster. The Italian researchers decided to try something other than the usual Intel/AMD solutions and based their nodes on a processor implementing the RISC-V ISA. Using SiFive's Freedom U740 SoC as the base, the researchers named their RISC-V cluster "Monte Cimone."

Monte Cimone features four dual-board servers, each in a 1U form factor. Every board carries a SiFive Freedom U740 SoC with four U74 cores running at up to 1.4 GHz plus one S7 management core, so the eight nodes combine for a total of 32 RISC-V application cores. Each node pairs the SoC with 16 GB of 64-bit DDR4 memory operating at 1866 MT/s, a PCIe Gen 3 x8 bus running at 7.8 GB/s, one Gigabit Ethernet port, and USB 3.2 Gen 1 interfaces. The system is powered by two 250 W PSUs to support future expansion and the addition of accelerator cards.
The team in Italy benchmarked the system with HPL and STREAM to determine the machine's floating-point throughput and memory bandwidth. While the results are not very impressive, they are a starting point for RISC-V. Each node produced a sustained 1.86 GFLOPS in HPL, which adds up to 14.88 GFLOPS for the cluster under perfect linear scaling. In practice, the whole cluster reached 85% efficiency, or about 12.65 GFLOPS of sustained compute. Likewise, each node has a theoretical peak memory bandwidth of 14.928 GB/s, yet the measured STREAM result was 7,760 MB/s.
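Those figures are internally consistent: 85 percent of the ideal 14.88 GFLOPS works out to roughly 12.65 GFLOPS, and a 64-bit DDR4-1866 interface tops out at 1866 MT/s × 8 bytes ≈ 14.93 GB/s. A quick back-of-the-envelope sketch of that arithmetic in Python, using only the numbers quoted above:

# Sanity check of the Monte Cimone figures reported in the article.
nodes = 8
hpl_per_node_gflops = 1.86                      # sustained HPL per node

ideal_gflops = nodes * hpl_per_node_gflops      # 14.88 GFLOPS with perfect scaling
measured_gflops = 12.65                         # reported full-cluster HPL result
efficiency = measured_gflops / ideal_gflops     # ~0.85

# DDR4-1866 on a 64-bit (8-byte) bus: transfers per second * bytes per transfer
peak_bw_gbs = 1866e6 * 8 / 1e9                  # ~14.93 GB/s theoretical
measured_bw_gbs = 7.76                          # STREAM result, ~7,760 MB/s

print(f"Ideal cluster HPL: {ideal_gflops:.2f} GFLOPS")
print(f"Scaling efficiency: {efficiency:.0%}")
print(f"Memory bandwidth: {peak_bw_gbs:.2f} GB/s peak vs {measured_bw_gbs:.2f} GB/s measured")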
These results show two things. First, the RISC-V HPC software stack is already mature, but it needs further optimization, and faster silicon, to achieve anything monumental like weather simulation. Second, scaling in the HPC world is tricky and requires careful tuning to get the hardware and software working together efficiently. To approach the scaling and performance of supercomputers like Frontier, RISC-V needs a lot more work. Researchers and engineers are working hard to bring that idea to life, and it is only a matter of time before more robust designs appear.
Sources: The Next Platform, via Tom's Hardware

14 Comments on Researchers Use SiFive's RISC-V SoC to Build a Supercomputer

#1
Denver
RISC-V doesn't even exist. Period.

PS: It's a joke.
#2
P4-630
Then I call my own desktop system with an i7-12700K and 32 GB of DDR5 a "Supercomputer"...
It actually is :D:D
#3
Avlin
Each node produced a sustained 1.86 GFLOPS

Level 1999 computing
#4
defaultluser
You see, this is exactly why you will never see any actual supercomputers from SiFive - by the time you spend the money building up all that missing coherent interconnect and all those missing high-bandwidth management engines, and then fix the under-powered CPUs themselves, you could have just bought a compute GPU from NVIDIA.

When you already have Arm's N2 with multi-socket systems, it becomes even harder to make anyone care about "yet another pointless RISC"!

RISC-V is doomed to be a fully customized microcontroller platform, NOT A SUPERCOMPUTER.
#5
eidairaman1
The Exiled Airman
P4-630Then I call my own desktop system with an i7-12700K and 32 GB of DDR5 a "Supercomputer"...
It actually is :D:D
Same with my FX-8350 with 2400 RAM
#6
Fourstaff
Long journey ahead for RISC-V. We are not going to see high performance coming out of this architecture anytime soon, but they might work their way into cheaper electronics, e.g. routers or other "smart" devices.
#7
Count von Schwalbe
Read a bit somewhere about x86 potentially reaching a performance ceiling. While those predictions were inaccurate, I am not sure how much more we can shrink our process nodes. If, and only if, there comes a point where new nodes are impractical (or far too expensive), we may see more RISC designs in pursuit of efficiency. As the only two real options are ARM and RISC-V, I can see this becoming a serious thing.

I realize that this is not particularly likely, and certainly not imminent, but the future may lie here. If so, these researchers will be at the forefront of this movement.
#8
blitz120
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
#9
Wirko
blitz120Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
Simulation of anything in the physical world requires FP. Machine learning requires FP. Sure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now. Gene sequencing maybe?
#10
silentbogo
blitz120I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
I'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. Back in the day, floats were slow and expensive, so most of the old libs avoid using them. Some companies have their own portfolio of code, which might be outdated and total crap, but they still force everyone to use it. My cousin works at a big company which had a contract with another big subcontractor, which worked for a huge and famous car manufacturer I shall not name, which had a stupid requirement to use only their broken "proprietary" implementations of std* libraries, all to avoid the GPL; or not being allowed to use the fastest and easiest option for the device's UI, just because it's open source.
Nowadays almost everything hangs on FP, from ML/AI to physics simulations. The entire HPC industry basically runs at full throttle on parallelizing more FP16/FP32 for ever more massive sims; most AI/ML code relies on FP, and the same goes for CV.
AleksandarKResearchers from Università di Bologna and CINECA, the largest supercomputing center in Italy, have been playing with the concept of developing a RISC-V supercomputer.
That's where the entire stock of Unmatched boards went... I believe it's a bit stupid to build a "supercomputer" out of dev boards based on an early chip architecture, with a core IP that the developers themselves market as "ideal" for network appliances and DVRs (not servers). Now all these boards are gonna rot in some university's basement rather than being used by devs to port and adapt software to this platform. Once again, short-term financial gains beat long-term benefits.
Phoronix did an early review, and this thing is at best half as fast as a Pi 400, hence a whole rack unit with two boards is barely enough to catch up with a credit-card-sized SBC.
FourstaffLong journey ahead for RISC-V. We are not going to see high performance coming out of this architecture anytime soon, but they might work their way into cheaper electronics, e.g. routers or other "smart" devices.
They are an ideal candidate to bump MIPS off its spot in the network appliance market. Too bad SiFive decided to sell the bulk of them just to tease "the next big thing" for devs, while several generations of boards have already missed the mark. I don't think I've ever seen any SiFive boards in real life, nor was I able to buy a RISC-V MCU dev board. I was hoping to get my hands on at least an Allwinner D1, but that's gonna be a real bummer at least until the war is over (most sellers on Alibaba/AliExpress don't ship to Ukraine, at least not the stuff that's interesting or useful to me).
#11
First Strike
blitz120Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
Because "supercomputer" is a term designated to scientific computing. Back when computers were first invented, they were used to calculate artillery trajectory, molecular dynamics, etc. That's what a "compute"r really means. So supercomputer computes super heavy scientific problems.

Back then there were no Facebook nor Google, integer performance can only do as much good as an email server.

Also the scientific workloads basically have an unbounded need for FP performance. The finer the simulation grid, the better. In the contrast Facebook and Google's server do have a finite client size and ROI. So, if someone want to push their scaling technique to the limit, they should build a computer for scientific workloads.
#12
dragontamer5788
blitz120Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
Because supercomputer workloads are largely composed of double-precision floating point calculations.

* FEA (Finite Element Analysis), aka simulated car crashes, bridge modeling, etc. etc.
* Weather simulations
* Protein Folding
* Atoms / Molecule simulations
* etc. etc.

All of these are double-precision floating point problems, the type that very big government organizations are willing to spend $300,000,000 to calculate slightly better than other government organizations.
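A toy sketch of why those extra mantissa bits matter (illustrative Python, not tied to any particular HPC code): in FP32 an increment of 1.0 simply vanishes next to 1e8, while FP64 keeps it.

import numpy as np

# FP32 has a 24-bit mantissa, so at 1e8 the rounding step is 8 and +1.0 is lost.
# FP64 has a 53-bit mantissa and retains the increment exactly.
print(np.float32(1e8) + np.float32(1.0) - np.float32(1e8))   # 0.0
print(np.float64(1e8) + np.float64(1.0) - np.float64(1e8))   # 1.0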
WirkoSure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now.
The supercomputer-level integer stuff is for CPU synthesis: proving that multipliers, RTL (register-transfer level) designs, and so on are correct.

I'm fairly certain they can, theoretically, be run on a GPU. (GPGPU-accelerated binary decision diagrams are an active area of research right now; they seem possible even if not all the details are figured out.) But I bet most programs (which are 30+ years old) are based on CPU compute, largely because the GPU stuff is still in the research-project phase.

It is said that the 3D V-Cache AMD made is really for AMD's own use (!!!!), because it greatly accelerates this kind of simulation. So AMD is likely using the EPYCs with the biggest stacked / L3 caches for designing CPUs and other digital-synthesis problems.
#13
eidairaman1
The Exiled Airman
blitz120Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
silentbogoI'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. ...
At this rate we should go back to Fortran, Basic and use Unix.
#14
blitz120
WirkoSimulation of anything in the physical world requires FP. Machine learning requires FP. Sure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now. Gene sequencing maybe?
Most arithmetic operations can be bounded, so scaled integers are more than sufficient, and they avoid the rounding errors and non-uniform distribution of floating-point numbers. Machine learning certainly uses floating point, but it certainly doesn't need it. On the other hand, there are many algorithms that require extensive graph manipulation, which relies heavily on pointer and integer arithmetic.
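As a minimal sketch of the scaled-integer idea (hypothetical values, purely illustrative): carry a price as an integer count of cents and only format it at the edges.

# Fixed-point (scaled-integer) arithmetic: no binary floats, no 0.1 + 0.2 surprises.
price_cents = 19_99          # $19.99 held exactly as an integer
tax_rate_bp = 825            # 8.25% expressed in basis points (1/10000)

# Integer math with explicit round-to-nearest; the scale factor is 10,000.
tax_cents = (price_cents * tax_rate_bp + 5_000) // 10_000
total_cents = price_cents + tax_cents
print(f"total = ${total_cents // 100}.{total_cents % 100:02d}")   # total = $21.64

# The float version accumulates representation error:
print(0.1 + 0.2)             # 0.30000000000000004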
silentbogoI'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. ...
I spent most of my career working for large and old telecom companies, and generally didn't deal much with legacy systems. The work was in a variety of areas, from manufacturing to billing, to pattern recognition and matching, to implementing database and transaction-processing systems, to OSes, to language interpreters and compilers, to framework development. None of these required any significant amount of floating-point work. The single largest use of floating point was building floating-point types into a database system -- not because anyone really wanted to use them, but because they were required to meet standards and kept us "check-box compliant".