
TOP500 Update: Frontier Remains No.1 With Aurora Coming in at No. 2

AleksandarK

News Editor
Staff member
The 62nd edition of the TOP500 reveals that the Frontier system retains its top spot and is still the only exascale machine on the list. However, five new or upgraded systems have shaken up the Top 10.

Housed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, Frontier leads the pack with an HPL score of 1.194 EFlop/s - unchanged from the June 2023 list. Frontier utilizes AMD EPYC 64C 2GHz processors and is based on the latest HPE Cray EX235a architecture. The system has a total of 8,699,904 combined CPU and GPU cores. Additionally, Frontier has an impressive power efficiency rating of 52.59 GFlops/watt and relies on HPE's Slingshot 11 network for data transfer.
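The HPL score and efficiency rating quoted above together imply the system's total power draw; a quick back-of-the-envelope check in Python (all values taken from the list entry above, no external data assumed):

```python
# Frontier's implied power draw, from the figures quoted above:
# HPL score of 1.194 EFlop/s and efficiency of 52.59 GFlops/watt.
rmax_gflops = 1.194e9          # 1.194 EFlop/s expressed in GFlop/s
efficiency_gflops_per_w = 52.59

power_w = rmax_gflops / efficiency_gflops_per_w
power_mw = power_w / 1e6       # watts -> megawatts

print(f"Implied power draw: {power_mw:.1f} MW")  # about 22.7 MW
```

This is consistent with the roughly 22.7 MW power figure reported for Frontier on the list.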




The new Aurora system at the Argonne Leadership Computing Facility in Illinois, USA, entered the list at the No. 2 spot - previously held by Fugaku - with an HPL score of 585.34 PFlop/s. Notably, Aurora's numbers were submitted based on a measurement of half of the planned final system. Aurora is currently being commissioned and will reportedly exceed Frontier with a peak performance of 2 EFlop/s when finished.

Aurora is built by Intel and is based on the HPE Cray EX - Intel Exascale Compute Blade, which uses Intel Xeon CPU Max Series processors and Intel Data Center GPU Max Series accelerators. These communicate through HPE's Slingshot-11 network interconnect.

Across the entire list, 20 new systems use Intel Sapphire Rapids CPUs, bringing the total number of systems with this CPU to 25 and making it the most common processor among new entries. However, of the 45 new systems on the list, only four use the corresponding Intel GPU, with Aurora being the largest by far.

Another new system, named Eagle and installed in the Microsoft Azure cloud in the USA, has taken the No. 3 spot - the highest rank a cloud system has ever achieved on the TOP500. In fact, it was only two years ago that a previous Azure system became the first cloud system ever to enter the Top 10, at spot No. 10. This Microsoft NDv5 system has an HPL score of 561.2 PFlop/s and is based on Intel Xeon Platinum 8480C processors and NVIDIA H100 accelerators.

Fugaku has moved to its current ranking of No. 4 after achieving No. 2 in the June 2023 list and holding the No. 1 spot from June 2020 until November 2021. This system is based in Kobe, Japan, and has an HPL score of 442.01 PFlop/s. It remains the highest ranked system outside the USA.

The LUMI system, based at EuroHPC/CSC in Kajaani, Finland, achieved the No. 5 spot with an HPL score of 379.70 PFlop/s. This system is the largest in Europe and has seen multiple upgrades that keep it near the top of the list, this time improving from an HPL score of 309.10 PFlop/s on the previous list.

Here is a summary of the systems in the Top 10:

  • Frontier remains the No. 1 system in the TOP500. This HPE Cray EX system is the first US system with a performance exceeding one Exaflop/s. It is installed at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, where it is operated for the Department of Energy (DOE). It currently has achieved 1.194 Exaflop/s using 8,699,904 cores. The HPE Cray EX architecture combines 3rd Gen AMD EPYC CPUs optimized for HPC and AI with AMD Instinct MI250X accelerators and a Slingshot-11 interconnect.
  • Aurora achieved the No. 2 spot by submitting an HPL score of 585 Pflop/s measured on half of the full system. It is installed at the Argonne Leadership Computing Facility, Illinois, USA, where it is also operated for the Department of Energy (DOE). This new Intel system is based on HPE Cray EX - Intel Exascale Compute Blades. It uses Intel Xeon CPU Max Series processors, Intel Data Center GPU Max Series accelerators, and a Slingshot-11 interconnect.
  • Eagle, the new No. 3 system, is installed by Microsoft in its Azure cloud. This Microsoft NDv5 system is based on Xeon Platinum 8480C processors and NVIDIA H100 accelerators and achieved an HPL score of 561 Pflop/s.
  • Fugaku, the No. 4 system, is installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan. It has 7,630,848 cores, which allowed it to achieve an HPL benchmark score of 442 Pflop/s.
  • The (again) upgraded LUMI system, another HPE Cray EX system installed at the EuroHPC center at CSC in Finland, is now No. 5 with a performance of 380 Pflop/s. The European High-Performance Computing Joint Undertaking (EuroHPC JU) is pooling European resources to develop top-of-the-range Exascale supercomputers for processing big data. One of the pan-European pre-Exascale supercomputers, LUMI, is located in CSC's data center in Kajaani, Finland.
  • The No. 6 system, Leonardo, is installed at a different EuroHPC site, in CINECA, Italy. It is an Atos BullSequana XH2000 system with Xeon Platinum 8358 32C 2.6GHz as main processors, NVIDIA A100 SXM4 40 GB as accelerators, and quad-rail NVIDIA HDR100 InfiniBand as interconnect. It achieved a Linpack performance of 238.7 Pflop/s.
  • Summit, an IBM-built system at the Oak Ridge National Laboratory (ORNL) in Tennessee, USA, is now listed at the No. 7 spot worldwide with a performance of 148.8 Pflop/s on the HPL benchmark, which is used to rank the TOP500 list. Summit has 4,356 nodes, each one housing two POWER9 CPUs with 22 cores each and six NVIDIA Tesla V100 GPUs each with 80 streaming multiprocessors (SM). The nodes are linked together with a Mellanox dual-rail EDR InfiniBand network.
  • The MareNostrum 5 ACC system is new at No. 8 and installed at the EuroHPC/Barcelona Supercomputing Center in Spain. This BullSequana XH3000 system uses Xeon Platinum 8460Y processors with NVIDIA H100 accelerators and InfiniBand NDR200. It achieved 138.2 Pflop/s HPL performance.
  • The new Eos system, listed at No. 9, is an NVIDIA DGX SuperPOD-based system at NVIDIA, USA. It is based on the NVIDIA DGX H100 with Xeon Platinum 8480C processors, NVIDIA H100 accelerators, and InfiniBand NDR400, and it achieves 121.4 Pflop/s.
  • Sierra, a system at the Lawrence Livermore National Laboratory, CA, USA, is at No. 10. Its architecture is very similar to that of the No. 7 system, Summit. It is built with 4,320 nodes with two POWER9 CPUs and four NVIDIA Tesla V100 GPUs. Sierra achieved 94.6 Pflop/s.

Other TOP500 Highlights
The 62nd edition of the TOP500 shows that Intel, AMD, and IBM processors are the preferred choices for HPC systems. Of the Top 10, five systems use Intel Xeon processors (Aurora, Eagle, Leonardo, MareNostrum 5 ACC, and Eos), two use AMD processors (Frontier and LUMI), and two use IBM processors (Summit and Sierra).

As on many previous lists, China and the United States account for most of the entries on the TOP500. The United States increased its lead from 150 machines on the previous list to 161 on this one, while China once again dropped, from 134 machines to 104.

In terms of entire continents, North America improved from 160 machines on the previous list to 171 on this one, Asia decreased from 192 machines to 169, and Europe increased from 133 systems to 143.

GREEN500 Results
The No. 1 spot on the GREEN500 is again held by Henri, at the Flatiron Institute in New York, USA. The system achieved an energy efficiency rating of 65.40 GFlops/Watt while producing an HPL score of 2.88 PFlop/s. Henri is a Lenovo ThinkSystem SR670 with Intel Xeon Platinum processors and NVIDIA H100 accelerators; it has 8,288 total cores and ranks No. 293 on the TOP500 list.

The Frontier Test & Development (TDS) system at ORNL in Tennessee, USA, claims the No. 2 spot with an energy efficiency rating of 62.68 GFlops/Watt and an HPL score of 19.2 PFlop/s. The TDS is essentially a single rack identical to those in the actual Frontier system and utilizes 120,832 total cores.

The No. 3 spot was taken by the Adastra system, which is housed at GENCI-CINES in France. The system achieved an energy efficiency rating of 58.02 GFlops/Watt and an impressive HPL score of 46.1 PFlop/s. Adastra has 319,072 total cores.

Additionally, just like on the last list, the actual Frontier system at No. 1 on the TOP500 deserves an honorable mention here for its achievements in energy efficiency. Despite more than doubling the HPL score of the Aurora system at the No. 2 spot on the TOP500, Frontier took the No. 8 spot on the GREEN500 with an energy efficiency of 52.59 GFlops/Watt.

Considering this system was the first machine to achieve exascale, Frontier is proof that power does not need to be sacrificed to achieve an impressive energy efficiency rating.

Finally, being green in supercomputing has truly become a global endeavor. The top 10 spots of the GREEN500 are occupied by systems from eight different countries: the United States (three times), France, Australia, Sweden, Spain, Finland, Germany, and South Korea.

HPCG Results
The TOP500 list has incorporated the High-Performance Conjugate Gradient (HPCG) benchmark results, which provide an alternative metric for assessing supercomputer performance. This score is meant to complement the HPL measurement to give a fuller understanding of the machine.

As on the June 2023 list, Fugaku remains the leader with an HPCG benchmark score of 16 PFlop/s. Frontier claimed the No. 2 spot with 14.05 HPCG-PFlop/s, and No. 3 was taken by LUMI with 4.59 HPCG-PFlop/s.
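Unlike HPL's dense factorization, HPCG stresses memory bandwidth and irregular access with a sparse iterative solver; the kernel at its heart is the classical conjugate gradient method. The sketch below (NumPy, a tiny dense system for illustration only - the actual benchmark uses a sparse 27-point stencil operator with MPI and a multigrid preconditioner) shows the algorithm's structure:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x              # initial residual
    p = r.copy()               # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)   # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:   # converged
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Tiny SPD system standing in for HPCG's sparse grid operator
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b))  # True
```

The sparse matrix-vector products and dot products that dominate this loop are exactly why HPCG scores land orders of magnitude below HPL scores on the same machines.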

HPL-MxP Results (Formerly HPL-AI)
The HPL-MxP benchmark seeks to highlight the use of mixed-precision computation. Traditional HPC uses 64-bit floating-point arithmetic. Today, we see hardware with various levels of floating-point precision - 32-bit, 16-bit, and even 8-bit. The HPL-MxP benchmark demonstrates that much higher performance is possible by mixing precisions during computation. Using mathematical techniques such as iterative refinement, a mixed-precision solve can deliver the same accuracy as straight 64-bit precision.
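The idea can be sketched in a few lines of NumPy. This is an illustration of the refinement principle, not the benchmark itself: float32 stands in for the FP16/FP8 tensor-core arithmetic real HPL-MxP runs exploit, and the matrix here is an arbitrary well-conditioned example.

```python
import numpy as np

# Do the expensive solve in low precision, then recover full float64
# accuracy by iteratively correcting the residual.
rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned system
b = rng.standard_normal(n)

# "Fast" low-precision solve (where the bulk of the flops go)
x = np.linalg.solve(A.astype(np.float32), b.astype(np.float32)).astype(np.float64)

# Refinement loop: residual computed in float64, correction in low precision
for _ in range(5):
    r = b - A @ x
    dx = np.linalg.solve(A.astype(np.float32), r.astype(np.float32))
    x += dx.astype(np.float64)

rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(rel_residual < 1e-10)  # True: double-precision accuracy recovered
```

Because the low-precision solve does the heavy lifting and the float64 work is only a handful of matrix-vector products, hardware that is much faster at reduced precision finishes far sooner - which is why MxP scores run several times higher than HPL scores.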

Frontier was the clear winner here with an HPL-MxP score of 9.95 EFlop/s. The No. 2 spot was claimed by LUMI with a score of 2.35 EFlop/s and the No. 3 spot was taken by Fugaku with a score of 2.0 EFlop/s.

About the TOP500 List
The first version of what became today's TOP500 list started as an exercise for a small conference in Germany in June 1993. A second version was compiled in November 1993 for the SC93 conference. Comparing the two editions to see how things had changed, the authors realized how valuable the information was and continued to compile statistics about the market for HPC systems. The TOP500 is now a much-anticipated, much-watched, and much-debated twice-yearly event.

View at TechPowerUp Main Site
 
If the GPUs were upgraded to the MI300, it would provide sufficient computational power to surpass the combined computing capabilities of all supercomputers on the list. :rolleyes:
 
All I can say is that Aurora is an absolute fail. Half the performance at the same power as Frontier at half scale (double the power at the same performance at full scale). If this is what I get for my taxes after 13 years I want my money back.
 
More like it can't run for hours without failures. :roll:

"Oak Ridge National Laboratory’s Frontier supercomputer is by far not the only system around to use HPE’s Cray EX architecture with Slingshot interconnects, AMD’s EPYC CPUs and AMD’s Instinct compute GPUs. For example, Finland’s Lumi supercomputer (Cray EX, EPYC Milan, Instinct MI250X compute GPUs) delivers 550 PetaFLOPS peak performance and is officially ranked as the world’s third most powerful supercomputer. Perhaps, the problem is valid with the scale of the machine that uses 60 million parts in total."

Did you dig up something from last year to discredit AMD? Other supercomputers use the same hardware and have not had such problems, the important thing is that they have solved it. Now working as it should.

 
"Oak Ridge National Laboratory’s Frontier supercomputer is by far not the only system around to use HPE’s Cray EX architecture with Slingshot interconnects, AMD’s EPYC CPUs and AMD’s Instinct compute GPUs. For example, Finland’s Lumi supercomputer (Cray EX, EPYC Milan, Instinct MI250X compute GPUs) delivers 550 PetaFLOPS peak performance and is officially ranked as the world’s third most powerful supercomputer. Perhaps, the problem is valid with the scale of the machine that uses 60 million parts in total."

Did you dig up something from last year to discredit AMD? Other supercomputers use the same hardware and have not had such problems, the important thing is that they have solved it. Now working as it should.

Maybe it's true that individual nodes fail more often than normal or anticipated in this industry, I don't know. But this doesn't keep Frontier from operating at full steam. It's just that technicians with forklifts suffer a bit more than they thought they would, having to swap out 20 nodes weekly instead of 10, for example.
 
What I don't understand is why somebody would use Intel over AMD or an Arm solution... quite literally Intel's hardware is the slowest and least efficient, so the only thing I can come up with is that Intel is practically giving this stuff away. I have to believe that, for example, the hardware for Aurora was provided at a serious discount, like cost or even lower, and Intel looks at it as a PR expense.
 

Fourstaff

Moderator
Staff member
#3 is Microsoft, that is interesting. Seems like they have a lot of money to throw at AI.
 

Leiesoldat

lazy gamer & woodworker
Supporter
More like it can't run for hours without failures. :roll:


All of those issues have been fixed. HPE had a team of engineers working around the clock for months fixing the runtime issues. Most of the scientific runs are now completing at full scale, i.e. 8,000+ nodes, with blistering speed.

What I don't understand is why somebody would use Intel over AMD or an Arm solution... quite literally Intel's hardware is the slowest and least efficient, so the only thing I can come up with is that Intel is practically giving this stuff away. I have to believe that, for example, the hardware for Aurora was provided at a serious discount, like cost or even lower, and Intel looks at it as a PR expense.

You have to remember that the Aurora and Frontier projects were bid and accepted back in 2014/2015 (Aurora probably earlier than that since that machine was supposed to be a bridge between Summit and Frontier, but Intel kept delaying and increasing the power of the machine). These aren't just grab the latest and greatest and stick a bunch of them together. Frontier was 600M USD, and Aurora was probably a similar cost because the US Department of Energy wanted a diversity of providers instead of a sole source.
 
#3 is Microsoft, that is interesting. Seems like they have a lot of money to throw at AI.

64-bit compute for the top500 computers is useless for AI (aka: 16-bit FP) comparisons.
You have to remember that the Aurora and Frontier projects were bid and accepted back in 2014/2015 (Aurora probably earlier than that since that machine was supposed to be a bridge between Summit and Frontier, but Intel kept delaying and increasing the power of the machine). These aren't just grab the latest and greatest and stick a bunch of them together. Frontier was 600M USD, and Aurora was probably a similar cost because the US Department of Energy wanted a diversity of providers instead of a sole source.

Aurora was supposed to launch in 2018. It's launching 5 years late, and not an exaflop to be seen yet.

But yeah, Intel has burned a lot of trust by delivering this late.
 

Fourstaff

Moderator
Staff member
64-bit compute for the top500 computers is useless for AI (aka: 16-bit FP) comparisons.
You can run another list comparing 16bit FP, I don't think Eagle is going to fall far behind with their H100 innards.
 
You can run another list comparing 16bit FP, I don't think Eagle is going to fall far behind with their H100 innards.

Well, I'm not actually interested in AI. I'm actually more into the classical 64-bit stuff.

But these lists are made with a purpose, and that purpose is 64-bit matrix multiplication (aka physics simulators). Given the history of TOP500 and what it represents, remember that it's a physics benchmark at its core. That's plenty useful, and #3 is perfectly fine, just don't call it AI.
 

Fourstaff

Moderator
Staff member
Well, I'm not actually interested in AI. I'm actually more into the classical 64-bit stuff.

But these lists are made with a purpose, and that purpose is 64-bit matrix multiplication (aka physics simulators). Given the history of TOP500 and what it represents, remember that it's a physics benchmark at its core. That's plenty useful, and #3 is perfectly fine, just don't call it AI.
A machine can be a monster at 16-bit FP and still be pretty decent at 64-bit; they are not exclusive. You made an observation that it's very good at 64-bit Linpack workloads; I made an observation that the machine is likely to be used for AI workloads given the H100 innards. I think both of us can be right at the same time.
 
A machine can be a monster at 16-bit FP and still be pretty decent at 64-bit; they are not exclusive. You made an observation that it's very good at 64-bit Linpack workloads; I made an observation that the machine is likely to be used for AI workloads given the H100 innards. I think both of us can be right at the same time.

Hmm. Rereading your first post with the context that you're specifically talking about the Microsoft NDv5, I think that's fair.

You're right, the H100 is good (but not the best) at both. Microsoft's NDv5 could be an AI supercomputer, or it might be a classic physics supercomputer.

I misread earlier and thought you were talking about TOP500 list in general. I didn't notice you were specifically talking about the NDv5.
 