
Researchers Use SiFive's RISC-V SoC to Build a Supercomputer

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,732 (1.01/day)
Researchers from Università di Bologna and CINECA, the largest supercomputing center in Italy, have been playing with the concept of developing a RISC-V supercomputer. The team has laid the groundwork for the first-ever implementation that demonstrates the capability of the relatively novel ISA to run high-performance computing workloads. To create a supercomputer, you need pieces of hardware that work like Lego building blocks. Those blocks are nodes, each made from a motherboard, processor, memory, and storage, and they are grouped into clusters. The Italian researchers decided to try something other than the usual Intel/AMD solution and use a processor based on the RISC-V ISA. Using SiFive's Freedom U740 SoC as the base, the researchers named their RISC-V cluster "Monte Cimone."

Monte Cimone features four dual-board servers, each in a 1U form factor. Each board has a SiFive Freedom U740 SoC with four U74 cores running at up to 1.4 GHz and one S7 management core. In total, eight nodes combine for 32 RISC-V cores. Each node pairs the SoC with 16 GB of 64-bit DDR4 memory operating at 1866 MT/s, a PCIe Gen 3 x8 bus running at 7.8 GB/s, one gigabit Ethernet port, and USB 3.2 Gen 1 interfaces. The system is powered by two 250 Watt PSUs to support future expansion and the addition of accelerator cards.
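As a rough sanity check of the figures above, the core count and the PCIe number can be reproduced with a few lines of Python. This is only a sketch: the 8 GT/s per-lane rate and 128b/130b encoding are the standard PCIe Gen 3 parameters rather than anything stated in the article, and the variable names are illustrative.

```python
# Cluster layout as described in the article.
SERVERS = 4
BOARDS_PER_SERVER = 2
U74_CORES_PER_SOC = 4            # the S7 management core is not counted

# Standard PCIe Gen 3 link parameters (assumption, not from the article).
GEN3_GT_PER_S = 8.0              # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding
LANES = 8

nodes = SERVERS * BOARDS_PER_SERVER
total_cores = nodes * U74_CORES_PER_SOC

# One transfer carries one bit per lane; divide by 8 to get bytes.
pcie_gb_per_s = GEN3_GT_PER_S * ENCODING_EFFICIENCY * LANES / 8

print(f"nodes: {nodes}, application cores: {total_cores}")
print(f"PCIe Gen 3 x{LANES} payload rate: {pcie_gb_per_s:.2f} GB/s")
```

The result comes out slightly under 7.9 GB/s, which is in line with the roughly 7.8 GB/s quoted for the x8 link.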



The team over in Italy benchmarked the system using HPL and Stream to determine the machine's floating-point computation capability and memory bandwidth. While the results are not very impressive, they are a beginning for RISC-V. Each node produced a sustained 1.86 GFLOPS in HPL, which would add up to 14.88 GFLOPS across the eight nodes under perfect linear scaling. However, the efficiency of the entire cluster was 85%, resulting in 12.65 GFLOPS of measured compute performance. Each node should achieve 14.928 GB/s of memory bandwidth; however, the actual Stream result was 7760 MB/s, roughly half of that peak.
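The arithmetic behind these numbers is easy to reproduce. The sketch below uses only the figures quoted in the article; the variable names are made up for illustration, and the peak bandwidth assumes one 64-bit (8-byte) transfer per memory clock tick, which is how the 14.928 GB/s figure appears to be derived.

```python
# Figures quoted in the article.
NODES = 8
HPL_PER_NODE_GFLOPS = 1.86
CLUSTER_HPL_GFLOPS = 12.65   # measured across all eight nodes
DDR4_MT_PER_S = 1866         # mega-transfers per second
BUS_WIDTH_BYTES = 8          # 64-bit memory interface
STREAM_MB_PER_S = 7760       # measured Stream bandwidth per node

# Perfect linear scaling would simply multiply the per-node result.
ideal_cluster_gflops = NODES * HPL_PER_NODE_GFLOPS
scaling_efficiency = CLUSTER_HPL_GFLOPS / ideal_cluster_gflops

# Theoretical peak: transfers per second times bytes per transfer.
peak_mb_per_s = DDR4_MT_PER_S * BUS_WIDTH_BYTES
bandwidth_efficiency = STREAM_MB_PER_S / peak_mb_per_s

print(f"ideal cluster HPL:    {ideal_cluster_gflops:.2f} GFLOPS")
print(f"scaling efficiency:   {scaling_efficiency:.1%}")
print(f"peak memory bandwidth: {peak_mb_per_s} MB/s")
print(f"Stream efficiency:    {bandwidth_efficiency:.1%}")
```

So the cluster loses about 15% to scaling overhead, while each node reaches only about 52% of its theoretical memory bandwidth.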

These results show two things. First, the RISC-V HPC software stack is mature but needs further optimization and faster silicon to achieve anything monumental like weather simulation. Second, scaling in the HPC world is quite tricky and requires careful tuning of hardware and software together before everything scales well. So to approach the scaling and performance of supercomputers like Frontier, RISC-V needs a lot more tuning. Researchers and engineers are working hard to bring that idea to life, and it is a matter of time before we see more robust designs appear.

 
Joined
Jan 5, 2006
Messages
18,584 (2.67/day)
System Name AlderLake
Processor Intel i7 12700K P-Cores @ 5Ghz
Motherboard Gigabyte Z690 Aorus Master
Cooling Noctua NH-U12A 2 fans + Thermal Grizzly Kryonaut Extreme + 5 case fans
Memory 32GB DDR5 Corsair Dominator Platinum RGB 6000MT/s CL36
Video Card(s) MSI RTX 2070 Super Gaming X Trio
Storage Samsung 980 Pro 1TB + 970 Evo 500GB + 850 Pro 512GB + 860 Evo 1TB x2
Display(s) 23.8" Dell S2417DG 165Hz G-Sync 1440p
Case Be quiet! Silent Base 600 - Window
Audio Device(s) Panasonic SA-PMX94 / Realtek onboard + B&O speaker system / Harman Kardon Go + Play / Logitech G533
Power Supply Seasonic Focus Plus Gold 750W
Mouse Logitech MX Anywhere 2 Laser wireless
Keyboard RAPOO E9270P Black 5GHz wireless
Software Windows 11
Benchmark Scores Cinebench R23 (Single Core) 1936 @ stock Cinebench R23 (Multi Core) 23006 @ stock
Then I call my own desktop system with an i7 12700K and 32GB DDR5 a "Supercomputer"...
It actually is :D:D
 
Joined
Aug 6, 2020
Messages
729 (0.45/day)
You see, this is exactly why you will never see any actual supercomputers from SiFive: by the time you spend money building up all that missing coherent interconnect and all those missing high-bandwidth management engines, and then fixing the underpowered CPUs themselves, you could have just bought a compute GPU from NVIDIA.

When you already have N2 with multi-socket systems, it becomes even harder to make anyone care about "Yet-another pointless RISC"!

RISC-V is doomed to be a fully-customized micro-controller platform, NOT A SUPERCOMPUTER.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
43,224 (6.74/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
Then I call my own desktop system with an i7 12700K and 32GB DDR5 a "Supercomputer"...
It actually is :D:D
Same with my FX 8350 with 2400 RAM.
 

Fourstaff

Moderator
Staff member
Joined
Nov 29, 2009
Messages
10,082 (1.82/day)
Location
Home
System Name Orange! // ItchyHands
Processor 3570K // 10400F
Motherboard ASRock z77 Extreme4 // TUF Gaming B460M-Plus
Cooling Stock // Stock
Memory 2x4Gb 1600Mhz CL9 Corsair XMS3 // 2x8Gb 3200 Mhz XPG D41
Video Card(s) Sapphire Nitro+ RX 570 // Asus TUF RTX 2070
Storage Samsung 840 250Gb // SX8200 480GB
Display(s) LG 22EA53VQ // Philips 275M QHD
Case NZXT Phantom 410 Black/Orange // Tecware Forge M
Power Supply Corsair CXM500w // CM MWE 600w
Long journey ahead for RISC-V. We are not going to see high performance coming out of this architecture anytime soon, but it might work its way into cheaper electronics, e.g. routers or other "smart" devices.
 

Count von Schwalbe

Nocturnus Moderatus
Staff member
Joined
Nov 15, 2021
Messages
3,243 (2.78/day)
Location
Knoxville, TN, USA
System Name Work Computer | Unfinished Computer
Processor Core i7-6700 | Ryzen 5 5600X
Motherboard Dell Q170 | Gigabyte Aorus Elite Wi-Fi
Cooling A fan? | Truly Custom Loop
Memory 4x4GB Crucial 2133 C17 | 4x8GB Corsair Vengeance RGB 3600 C26
Video Card(s) Dell Radeon R7 450 | RTX 2080 Ti FE
Storage Crucial BX500 2TB | TBD
Display(s) 3x LG QHD 32" GSM5B96 | TBD
Case Dell | Heavily Modified Phanteks P400
Power Supply Dell TFX Non-standard | EVGA BQ 650W
Mouse Monster No-Name $7 Gaming Mouse| TBD
Read a bit somewhere about x86 potentially reaching a ceiling of performance. While their predictions were inaccurate, I am not sure how much more we can shrink our nodes. If, and only if, there comes a point where new nodes are impractical (or far too expensive), we may see more RISC designs in pursuit of efficiency. As the only two real options are ARM and RISC-V, I can see this being a serious thing.

I realize that this is not particularly likely, and certainly not imminent, but the future may lie here. If so, these researchers will be at the forefront of this movement.
 

blitz120

New Member
Joined
Jun 14, 2022
Messages
2 (0.00/day)
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
 
Joined
Jan 3, 2021
Messages
3,726 (2.52/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
Simulation of anything in the physical world requires FP. Machine learning requires FP. Sure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now. Gene sequencing maybe?
 

silentbogo

Moderator
Staff member
Joined
Nov 20, 2013
Messages
5,594 (1.37/day)
Location
Kyiv, Ukraine
System Name WS#1337
Processor Ryzen 7 5700X3D
Motherboard ASUS X570-PLUS TUF Gaming
Cooling Xigmatek Scylla 240mm AIO
Memory 64GB DDR4-3600(4x16)
Video Card(s) MSI RTX 3070 Gaming X Trio
Storage ADATA Legend 2TB
Display(s) Samsung Viewfinity Ultra S6 (34" UW)
Case ghetto CM Cosmos RC-1000
Audio Device(s) ALC1220
Power Supply SeaSonic SSR-550FX (80+ GOLD)
Mouse Logitech G603
Keyboard Modecom Volcano Blade (Kailh choc LP)
VR HMD Google dreamview headset(aka fancy cardboard)
Software Windows 11, Ubuntu 24.04 LTS
I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
I'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. Back in the day floats were slow and expensive, so most of the libs avoid using them. Some companies had their own portfolio of code, which might be outdated and total crap, but they still force everyone to use it. My cousin works at a big company which had a contract with another big subcontractor, which worked for a huge and famous car manufacturer which I shall not name, which had a stupid requirement to use only their broken "proprietary" implementations of std* libraries, all to avoid GPL, or not being able to use the fastest and easiest option for a device's UI just because it's open source.
Nowadays almost everything hangs on FP, from ML/AI to physics simulations. The entire HPC industry basically throttles on parallelizing more FP16/FP32 for even more massive sims, most AI/ML code relies on FP, same goes for CV.

Researchers from Università di Bologna and CINECA, the largest supercomputing center in Italy, have been playing with the concept of developing a RISC-V supercomputer.
That's where the entire stock of Unmatched boards went... I believe it's a bit stupid to build a "supercomputer" out of dev boards based on an early chip architecture with a core IP that the developers themselves market as being "ideal" for network appliances and DVRs (not servers). Now all these boards are going to rot in some university's basement rather than being used by devs to port and adapt software for this platform. Once again, short-term financial gains beat long-term benefits.
Phoronix did an early review, and this thing is at best twice as slow as the Pi 400 (in the best-case scenario), hence a whole rack with two boards is barely enough to catch up with a credit-card-sized SBC.
Long journey ahead for RISC-V. We are not going to see high performance coming out of this architecture anytime soon, but it might work its way into cheaper electronics, e.g. routers or other "smart" devices.
They are an ideal candidate to bump MIPS off its spot in the network appliance market. Too bad SiFive decided to sell the bulk just to tease "the next big thing" for devs, while several generations of boards have already missed the mark. I don't think I've ever seen any SiFive boards in real life, nor was I able to buy a RISC-V MCU dev board. I was hoping to get my hands on at least an Allwinner D1, but that's going to be a real bummer at least until the war is over (most sellers on Alibaba/AliExpress don't ship to Ukraine, at least not the stuff that's interesting or useful to me).
 
Joined
Jun 12, 2017
Messages
136 (0.05/day)
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.
Because "supercomputer" is a term reserved for scientific computing. Back when computers were first invented, they were used to calculate artillery trajectories, molecular dynamics, etc. That's what a "compute"r really means. So a supercomputer computes super heavy scientific problems.

Back then there was no Facebook or Google; integer performance could only do as much good as an email server.

Also, scientific workloads basically have an unbounded need for FP performance. The finer the simulation grid, the better. In contrast, Facebook's and Google's servers have a finite client base and ROI. So, if someone wants to push their scaling technique to the limit, they should build a computer for scientific workloads.
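To put a rough number on "the finer the grid, the better": the sketch below estimates how the amount of work grows when a grid is refined. It assumes a regular grid and, for the time-stepping case, that the time step has to shrink in proportion to the grid spacing (as is common for explicit solvers). Both are simplifying assumptions for illustration, and the function names are hypothetical.

```python
def grid_cells(domain_length, spacing, dims=3):
    """Number of cells in a regular grid covering a cubic domain."""
    cells_per_axis = round(domain_length / spacing)
    return cells_per_axis ** dims

def relative_work(refinement, dims=3, time_step_shrinks=True):
    """Work multiplier when the grid spacing is divided by `refinement`.

    Cell count grows with refinement**dims; if the time step must shrink
    with the spacing, the number of steps grows by another factor.
    """
    exponent = dims + (1 if time_step_shrinks else 0)
    return refinement ** exponent

# A 1000 m cube at 10 m resolution.
print(grid_cells(1000.0, 10.0))
# Halving the spacing in 3D with a matching smaller time step.
print(relative_work(2))
# Going ten times finer.
print(relative_work(10))
```

Under these assumptions, every halving of the grid spacing costs about 16 times more floating-point work, which is why there is always demand for a bigger machine.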
 
Joined
Apr 24, 2020
Messages
2,792 (1.61/day)
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.

Because supercomputer workloads are largely composed of double-precision floating point calculations.

* FEA (Finite Element Analysis), aka simulated car crashes, bridge modeling, etc. etc.
* Weather simulations
* Protein Folding
* Atoms / Molecule simulations
* etc. etc.

All of these are double-precision floating point problems, the type that very big government organizations are willing to spend $300,000,000 to calculate slightly better than other government organizations.
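A small illustration of why precision matters for long-running numerical work: the sketch below adds 0.1 a million times, once in double precision and once with the running sum rounded to single precision after every step. The single-precision path is emulated with the standard `struct` module, so this is a toy demonstration rather than a benchmark of real FP32 hardware.

```python
import struct

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times."""
    if single_precision:
        step = to_f32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_f32(total)  # emulate storing the sum in binary32
    return total

COUNT = 1_000_000
EXACT = COUNT / 10  # the mathematically exact answer

err64 = abs(accumulate(0.1, COUNT, False) - EXACT)
err32 = abs(accumulate(0.1, COUNT, True) - EXACT)

print(f"double-precision error: {err64:.3e}")
print(f"single-precision error: {err32:.3e}")
```

The double-precision error stays tiny, while the single-precision sum drifts visibly, and a simulation that runs billions of such updates amplifies the difference further.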

Sure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now.

The supercomputer-level integer stuff is for CPU synthesis: proving that multipliers and RTL (register-transfer level) designs are correct, and so on.
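For a flavor of what "proving a multiplier" means, here is a toy version in Python. It models a shift-and-add multiplier at the bit level and checks it against ordinary integer multiplication for every possible input pair. Real verification tools do not enumerate inputs like this (the input space of a 64-bit multiplier is far too large, hence BDDs and similar techniques); the function names and the 4-bit width are illustrative assumptions.

```python
from itertools import product

def shift_add_multiply(a, b, width):
    """Bit-level shift-and-add multiplier for unsigned `width`-bit inputs."""
    result = 0
    for bit in range(width):
        if (b >> bit) & 1:        # if this bit of b is set...
            result += a << bit    # ...add a shifted copy of a
    # The product of two width-bit numbers fits in 2 * width bits.
    return result & ((1 << (2 * width)) - 1)

def verify_multiplier(width):
    """Return every (a, b) pair where the model disagrees with a * b."""
    limit = 1 << width
    return [
        (a, b)
        for a, b in product(range(limit), repeat=2)
        if shift_add_multiply(a, b, width) != a * b
    ]

counterexamples = verify_multiplier(4)
print("multiplier verified" if not counterexamples else counterexamples)
```

Note that the whole check uses nothing but integer adds, shifts, and comparisons, which is the point being made about this class of workload.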

I'm fairly certain they can, theoretically, be run on a GPU. (GPGPU accelerated binary decision diagrams are an active area of research right now. They seem possible even if not all the details are figured out) But I bet most programs (which are 30+ years old) are based on CPU compute (largely because the GPU stuff is still research-project phase).

It is said that the 3d-cache that AMD made is for AMD to use (!!!!), because it really accelerates this kind of simulation. So AMD is likely using AMD EPYCs with the best 3D-cache / big L3 caches for designing CPUs and other digital-synthesis problems.
 

eidairaman1

The Exiled Airman
Why do supercomputer benchmarks always quote floating point performance? Most interesting software makes very little use of floating point, and integer and pointer arithmetic are far more common. I spent almost 40 years as a software engineer, and used floating point perhaps a dozen times, and my colleagues had similar experiences.

I'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. Back in the day floats were slow and expensive, so most of the libs avoid using them. Some companies had their own portfolio of code, which might be outdated and total crap, but they still force everyone to use it. My cousin works at a big company which had a contract with another big subcontractor, which worked for a huge and famous car manufacturer which I shall not name, which had a stupid requirement to use only their broken "proprietary" implementations of std* libraries, all to avoid GPL, or not being able to use the fastest and easiest option for a device's UI just because it's open source.
Nowadays almost everything hangs on FP, from ML/AI to physics simulations. The entire HPC industry basically throttles on parallelizing more FP16/FP32 for even more massive sims, most AI/ML code relies on FP, same goes for CV.


That's where the entire stock of Unmatched boards went... I believe it's a bit stupid to build a "supercomputer" out of dev boards based on an early chip architecture with a core IP that the developers themselves market as being "ideal" for network appliances and DVRs (not servers). Now all these boards are going to rot in some university's basement rather than being used by devs to port and adapt software for this platform. Once again, short-term financial gains beat long-term benefits.
Phoronix did an early review, and this thing is at best twice as slow as the Pi 400 (in the best-case scenario), hence a whole rack with two boards is barely enough to catch up with a credit-card-sized SBC.

They are an ideal candidate to bump MIPS off its spot in the network appliance market. Too bad SiFive decided to sell the bulk just to tease "the next big thing" for devs, while several generations of boards have already missed the mark. I don't think I've ever seen any SiFive boards in real life, nor was I able to buy a RISC-V MCU dev board. I was hoping to get my hands on at least an Allwinner D1, but that's going to be a real bummer at least until the war is over (most sellers on Alibaba/AliExpress don't ship to Ukraine, at least not the stuff that's interesting or useful to me).

At this rate we should go back to Fortran, Basic and use Unix.
 

blitz120

New Member
Joined
Jun 14, 2022
Messages
2 (0.00/day)
Simulation of anything in the physical world requires FP. Machine learning requires FP. Sure there are some types of computing workload that require mostly integer arithmetic but I can't think of any right now. Gene sequencing maybe?
Most arithmetic operations can be bounded, and thus scaled integers are more than sufficient, and avoid the rounding errors and nonuniform distribution of floating point numbers. Machine learning certainly uses floating point, but it certainly doesn't need it. On the other hand, there are many algorithms which require extensive graph manipulation, which rely heavily on pointer and integer arithmetic.
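As a minimal sketch of the scaled-integer idea: the example below stores values as integer hundredths (a hypothetical `SCALE` of 100) and compares a running sum against the same sum done in binary floating point. The helper `to_fixed` is made up for illustration and only handles non-negative inputs with a digit before the decimal point.

```python
SCALE = 100  # hypothetical fixed-point scale: two decimal digits

def to_fixed(text):
    """Parse a non-negative decimal string such as '0.10' into hundredths."""
    whole, _, frac = text.partition(".")
    return int(whole) * SCALE + int((frac + "00")[:2])

def to_text(value):
    """Format a scaled integer back into a decimal string."""
    return f"{value // SCALE}.{value % SCALE:02d}"

# Ten additions of 0.1 in binary floating point.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

# The same sum with scaled integers.
fixed_total = sum(to_fixed("0.10") for _ in range(10))

print(float_total == 1.0)                  # False: 0.1 is not exact in binary
print(fixed_total == to_fixed("1.00"))     # True: integer addition is exact
print(to_text(fixed_total))
```

The trade-off is that the range and resolution have to be known and bounded up front, which works well for things like billing and is much harder for simulations whose values span many orders of magnitude.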

I'm not a career programmer, but I've dipped my toes in on more than one occasion. If you work for a relatively big or relatively old company, you constantly have to deal with tons of legacy stuff and legacy approaches to coding. Back in the day floats were slow and expensive, so most of the libs avoid using them. Some companies had their own portfolio of code, which might be outdated and total crap, but they still force everyone to use it. My cousin works at a big company which had a contract with another big subcontractor, which worked for a huge and famous car manufacturer which I shall not name, which had a stupid requirement to use only their broken "proprietary" implementations of std* libraries, all to avoid GPL, or not being able to use the fastest and easiest option for a device's UI just because it's open source.
Nowadays almost everything hangs on FP, from ML/AI to physics simulations. The entire HPC industry basically throttles on parallelizing more FP16/FP32 for even more massive sims, most AI/ML code relies on FP, same goes for CV.

I spent most of my career working for large and old telecom companies, and generally didn't deal much with legacy systems. The work was on a variety of areas, from manufacturing to billing, to pattern recognition and matching, to implementing database and transaction processing systems, to OS, to language interpreters and compilers, to framework development. None of these required any significant amount of floating point work. The single largest use of floating point was building floating point types into a database system -- not because anyone really wanted to use it, but it was required to meet standards and kept us "check box compliant".
 