AMD Collaborates with US DOE to Deliver the Frontier Supercomputer

Joined
Aug 20, 2007
Messages
21,443 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
It doesn't matter if it's open or closed.

And if the driver is a closed binary unavailable for your platform?

Least resistance. It is arguably easier to port a driver than pay for a new closed one to integrate with a moving target (OSS kernel).

I do advocate open source, but only when it's actually helpful. Their build design choices suggest it must be, or they'd probably have gone with a CUDA-based system for the versatility of the end users running code.
 
Joined
Jun 28, 2016
Messages
3,595 (1.17/day)
And if the driver is a closed binary unavailable for your platform?

Least resistance. It is arguably easier to port a driver than pay for a new closed one to integrate with a moving target (OSS kernel).
I'm not sure why porting a driver would be easier than paying for a new closed one. Porting takes time and costs. And you'll likely outsource it either way.

Also, we're talking about an HPC system. Cray delivers the whole package: configured and ready to run.
 
Joined
Jun 28, 2016
Messages
3,595 (1.17/day)
Because you'll need to pay for a port update every time the open source kernel you are integrating with updates. Open source integrates more easily with open source and is inherently cheaper to maintain.
You buy a server with service. Cray (or any other OEM) provides support - including drivers.
It's way more cost effective as well.
 
Joined
Jul 9, 2015
Messages
3,413 (1.00/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
Perhaps that's why Threadripper 2 has disappeared.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,164 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
Well, the reality is that Nvidia cluster can run CUDA and this can't.
The reality is that nVidia owns the copyrights for CUDA like all of their other IP and everyone else doesn't. If it's not nVidia, it's not going to do CUDA. If your software uses CUDA, that's called vendor lock-in.
 

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
19,531 (2.85/day)
Location
Piteå
System Name White DJ in Detroit
Processor Ryzen 5 5600
Motherboard Asrock B450M-HDV
Cooling Be Quiet! Pure Rock 2
Memory 2 x 16GB Kingston Fury 3400mhz
Video Card(s) XFX 6950XT Speedster MERC 319
Storage Kingston A400 240GB | WD Black SN750 2TB |WD Blue 1TB x 2 | Toshiba P300 2TB | Seagate Expansion 8TB
Display(s) Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case Fractal Design Define R4
Audio Device(s) Line6 UX1 + Sony MDR-10RC, Nektar SE61 keyboard
Power Supply Corsair RM850x v3
Mouse Logitech G602
Keyboard Cherry MX Board 1.0 TKL Brown
Software Windows 10 Pro
Benchmark Scores Rimworld 4K ready!
I live in a time when there are three exaflop supercomputers in the works. Crazy.
 
Joined
Oct 27, 2009
Messages
1,180 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
AMD made ROCm and HIP for a reason, HIP can convert cuda code... and often it is faster on nvidia cards after the conversion than before...
ROCm has been lagging behind 2-3 versions in support but they have been catching up quite well to CUDA's feature set.

If you want to do workloads that rely heavily on tensor ops, then you go with V100s; if you need simple double, single or half precision, then AMD is a solid option.
If you just need inferencing... shiiiit, options are wide open.

Edit: Shit, they have caught up on version support... https://rocm.github.io/dl.html
 
Joined
Aug 20, 2007
Messages
21,443 (3.40/day)
You buy a server with service. Cray (or any other OEM) provides support - including drivers.
It's way more cost effective as well.

And Cray has an interest in reducing the price of the service they provide?

As I said, I can't be certain. But I think their parts choices point down that road...

You're naive if you think you have an argument.

And you're naive if you think calling me naive constitutes an argument.

AMD sold the idea to DoE in 2011. Nvidia & Intel followed.

And the Wright brothers invented the airplane. Who cares who follows who in relation to market share? Furthermore, call me when their system is even in the TOP100 charts.
 

Deleted member 158293

Guest
Starting with open solutions is the quickest way to program.
AMD made ROCm and HIP for a reason, HIP can convert cuda code... and often it is faster on nvidia cards after the conversion than before...
ROCm has been lagging behind 2-3 versions in support but they have been catching up quite well to CUDA's feature set.

If you want to do workloads that rely heavily on tensor ops, then you go with V100s; if you need simple double, single or half precision, then AMD is a solid option.
If you just need inferencing... shiiiit, options are wide open.

Edit: Shit, they have caught up on version support... https://rocm.github.io/dl.html

Interesting to see the need for it, and always good to have an exit ramp from vendor lock-in.
 
Joined
Feb 25, 2016
Messages
292 (0.09/day)
It's written in the article that the ROCm software stack will be used. It's open source, and with version 2.4 it's quite competitive with CUDA feature-wise. Though I wouldn't be surprised if the next supercomputer win belongs to the IBM+Nvidia combo. Let's wait and watch.
 
Joined
Oct 27, 2009
Messages
1,180 (0.21/day)
Furthermore, call me when their system is even in the TOP100 charts.

Titan is still #9 and AMD based... There is a Zen-based Chinese supercomputer with a small node count that is #38.
This will be #1 and 7x faster than the current #1 and 50% faster than Intel's system that may or may not get finished first.
I think this is the first time I have seen AMD cpu+gpu be in the top 10, everything before has been AMD CPU/Nvidia GPU.
 
Joined
Apr 30, 2011
Messages
2,702 (0.55/day)
Location
Greece
Processor AMD Ryzen 5 5600@80W
Motherboard MSI B550 Tomahawk
Cooling ZALMAN CNPS9X OPTIMA
Memory 2*8GB PATRIOT PVS416G400C9K@3733MT_C16
Video Card(s) Sapphire Radeon RX 6750 XT Pulse 12GB
Storage Sandisk SSD 128GB, Kingston A2000 NVMe 1TB, Samsung F1 1TB, WD Black 10TB
Display(s) AOC 27G2U/BK IPS 144Hz
Case SHARKOON M25-W 7.1 BLACK
Audio Device(s) Realtek 7.1 onboard
Power Supply Seasonic Core GC 500W
Mouse Sharkoon SHARK Force Black
Keyboard Trust GXT280
Software Win 7 Ultimate 64bit/Win 10 pro 64bit/Manjaro Linux
Interesting you accuse me of not reading, because I already addressed this.

Frontier is based on Zen, which is a complete ground up redesign since Bulldozer based fusion. They are about as related as an Apple and a Potato, or Netburst and Sandy Bridge.


Pretending you are smarter than everyone is not making you look great here.

or maybe I am confused... what is your argument here, exactly?

If you are just doing a generic "AMD IS DAH BEST," you at least aren't exclusively wrong. Consider the following:

Zen is better suited for this than Intel for certain. It's their GPU choice I find intriguing, and only because they could've likely been more power efficient (important in supercomputer clusters) with Nvidia's line.
For servers, GPUs aren't rated by their gaming performance but in TFLOPS, i.e. raw compute power. And in that, AMD has efficient GPUs, as the Vega arch is mainly a compute-targeted one for data centers. And on 7nm, AMD GPUs are much more efficient. Radeon VII is ~30% faster than Vega 64 using 10% less power.
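To put a number on that last claim, here is a quick sketch of the implied performance-per-watt gain. It uses only the two relative figures quoted in the post ("~30% faster", "10% less power"), with Vega 64 as the 1.0 baseline; the function name is just made up for illustration, and real efficiency will vary by workload.

```python
# Relative figures taken from the post; Vega 64 is the 1.0 baseline.
relative_performance = 1.30  # "~30% faster"
relative_power = 0.90        # "10% less power"


def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Return performance per watt relative to the baseline card."""
    return perf_ratio / power_ratio


gain = perf_per_watt_gain(relative_performance, relative_power)
print(f"Performance per watt: {gain:.2f}x the baseline")
```

So if both quoted figures hold, that works out to roughly 44% better performance per watt, which is a bigger jump than either number suggests on its own.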
 
Joined
Oct 22, 2014
Messages
14,078 (3.82/day)
Location
Sunshine Coast
System Name H7 Flow 2024
Processor AMD 5800X3D
Motherboard Asus X570 Tough Gaming
Cooling Custom liquid
Memory 32 GB DDR4
Video Card(s) Intel ARC A750
Storage Crucial P5 Plus 2TB.
Display(s) AOC 24" Freesync 1m.s. 75Hz
Mouse Lenovo
Keyboard Eweadn Mechanical
Software W11 Pro 64 bit
And the wright brothers invented the airplane.
And here I was thinking they were the first to actually do a documented flight, and they invented planes too....
 
Joined
Sep 15, 2011
Messages
6,715 (1.39/day)
Processor Intel® Core™ i7-13700K
Motherboard Gigabyte Z790 Aorus Elite AX
Cooling Noctua NH-D15
Memory 32GB(2x16) DDR5@6600MHz G-Skill Trident Z5
Video Card(s) ZOTAC GAMING GeForce RTX 3080 AMP Holo
Storage 2TB SK Platinum P41 SSD + 4TB SanDisk Ultra SSD + 500GB Samsung 840 EVO SSD
Display(s) Acer Predator X34 3440x1440@100Hz G-Sync
Case NZXT PHANTOM410-BK
Audio Device(s) Creative X-Fi Titanium PCIe
Power Supply Corsair 850W
Mouse Logitech Hero G502 SE
Software Windows 11 Pro - 64bit
Benchmark Scores 30FPS in NFS:Rivals
31MW of power. Talking about "global warming"? :laugh:
We are in 2019. 4 wind turbines like those can provide more than adequate power. Also they can install additional solar panels on the roof and can have 100% ECO energy. ;)
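A rough way to sanity-check the turbine count is sketched below. Only the 31 MW load comes from the thread; the 8 MW nameplate rating and the 0.35 capacity factor are hypothetical placeholder values, not figures for any specific turbine or site.

```python
import math

LOAD_MW = 31.0  # facility power draw quoted in the thread


def turbines_needed(load_mw: float, nameplate_mw: float, capacity_factor: float) -> int:
    """Turbines required so that average output covers the load."""
    average_output_mw = nameplate_mw * capacity_factor
    return math.ceil(load_mw / average_output_mw)


# Hypothetical inputs: 8 MW nameplate, 35% capacity factor (both assumptions).
ideal = turbines_needed(LOAD_MW, nameplate_mw=8.0, capacity_factor=1.0)
realistic = turbines_needed(LOAD_MW, nameplate_mw=8.0, capacity_factor=0.35)
print(f"At full nameplate output: {ideal} turbines")
print(f"At the assumed capacity factor: {realistic} turbines")
```

Under these assumptions four turbines only cover the load when they all run at full rated output; averaged over time, the count depends heavily on the capacity factor you plug in.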
 
Joined
Jun 3, 2010
Messages
2,540 (0.48/day)
We are in 2019. 4 wind turbines like those can provide more than adequate power. Also they can install additional solar panels on the roof and can have 100% ECO energy. ;)
31 MW is still 7.4 megacalories per second, or 7.4°C/s per ton of water...
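For anyone who wants to check that conversion, here is a minimal sketch. It assumes all 31 MW ends up as heat in the water, uses 4184 J per kilocalorie, and takes a ton as 1000 kg of liquid water at roughly 1 kcal/(kg·°C); those constants are my assumptions, not something stated in the thread.

```python
POWER_W = 31e6                 # 31 MW, from the thread
J_PER_KCAL = 4184.0            # joules per kilocalorie (assumed definition)
WATER_MASS_KG = 1000.0         # one metric ton of water
SPECIFIC_HEAT_KCAL = 1.0       # kcal per kg per degree C, approximate for liquid water

# Heat flow in kilocalories per second, then in megacalories per second.
kcal_per_second = POWER_W / J_PER_KCAL
mcal_per_second = kcal_per_second / 1000.0

# Temperature rise per second if that heat goes into one ton of water.
deg_c_per_second = kcal_per_second / (WATER_MASS_KG * SPECIFIC_HEAT_KCAL)

print(f"{mcal_per_second:.1f} Mcal/s, {deg_c_per_second:.1f} degC/s per ton of water")
```

Both figures come out at about 7.4, since 1000 kcal warms a ton of water by about one degree.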
 
Joined
Sep 27, 2014
Messages
550 (0.15/day)
We are in 2019. 4 wind turbines like those can provide more than adequate power. Also they can install additional solar panels on the roof and can have 100% ECO energy
And DOE can make calculations only when the wind blows. Good investment.
Seriously, every single one of those wind turbines has to be paired up with a quick-reacting gas turbine, ready to pick up the load. Spinning reserve. Those are expensive to buy and to maintain, and they are not fuel efficient compared to slow-reacting coal or nuclear plants (which cannot be used as spinning reserve because they cannot be turned on and off that fast).

So by using the "free" wind, you just increased the price of electricity...

Did you miss the part where I worked here just a year ago?
What happened, did they finally fire you? :D











Just joking, sorry, could not help it, it was so easy...
 

bogmali

In Orbe Terrum Non Visi
Joined
Mar 16, 2008
Messages
9,536 (1.57/day)
Location
Pacific Northwest
System Name Daily Driver/Part Time
Processor Core i7-13700K/Ryzen R5-7600
Motherboard ASUS ROG MAXIMUS Z790 APEX/Asrock B650 Pro RS Wi-Fi
Cooling Corsair H150i RGB PRO XT AIO/Deep Cool LS-520 White
Memory G-Skill Trident Z5 Silver 2x24GB DDR5-8200/XPG Lancer Blade 2X16GB DDR-5-6000
Video Card(s) MSI Ventus 3X OC RTX-4080 Super/Sapphire Radeon RX-7800XT
Storage Samsung 980 Pro M.2 NVMe 2TB/KingSpec XG 7000 4TB M.2 NVMe/Crucial P5 Plus 2TB M.2 NVMe
Display(s) Alienware AW3423DW
Case Corsair 5000d AirFlow/Asus AP201 White
Audio Device(s) AudioEngine D1 DAC/Onboard
Power Supply Seasonic Prime Ultra 1K Watt/Seagotep 750W
Mouse Corsair M65 RGB Elite
Keyboard Adata XPG Summoner
Software Win11 Pro 64
Benchmark Scores Xbox Live Gamertag=jondonken
This thread is not for chest beating purposes, you can continue those offline or via PMs. Thread cleansed and reply bans issued.
 
Joined
Sep 15, 2011
Messages
6,715 (1.39/day)
Seriously, every single one of those wind turbines has to be paired up with a quick-reacting gas turbine, ready to pick up the load. Spinning reserve.
No need for those. The datacenter will be connected to the main grid anyway, which is ready to pick up the missing load in case there is no wind/sun from Mother Nature.

But I am also curious about the future upgradeability of those server farms. Can the CPUs/GPUs be easily upgraded in the future on the fly? And I mean without changing motherboards and such?
 
Joined
Sep 27, 2014
Messages
550 (0.15/day)
The Datacenter will be connected to the main grid anyway ready to pick up the missing load in case no wind/sun from the Mother Nature.
You don't get it. That fabled "main grid" ready to jump to help... it's composed of many individual steam or gas turbines like I said above.
In every microsecond of the day, 24/7, 365 days/year, the amount of electricity produced has to be equal to the electricity consumed. If a power generator drops quickly, another one somewhere in the system, preferably close by, has to pick up that exact deficit of power just as quickly.
Steam turbines (coal fired) have a spare capacity of a few minutes of steam, and by that time, the gas turbines have to have started up already to pick up the slack.
You can't ramp a nuclear power plant up and down that way. Or... you can, but you might end up with Chernobyl.
 
Joined
Oct 22, 2014
Messages
14,078 (3.82/day)
You don't get it. That fabled "main grid" ready to jump to help... it's composed of many individual steam or gas turbines like I said above.
In every microsecond of the day, 24/7, 365 days/year, the amount of electricity produced has to be equal to the electricity consumed.
WUT! :kookoo:
Wrong and so off topic, I'm leaving it there.
 
Joined
Feb 19, 2019
Messages
324 (0.15/day)
After watching this about Milan and the possibility that it will include 15 chiplets and maybe SMT4, I think [my imagination] I have an idea where AMD is going with its future (maybe custom design?) HPC EPYC design on 7nm+:
1) Each CPU chiplet will be 6C/24T to save space/power while giving similar or better than 8C/16T performance.
2) Adding 4 custom Instinct GPU chiplets.
3) Adding 2 custom AI accelerator [ASIC] chiplets.
4) 1 I/O chiplet with HBM memory stack.

So the final EPYC Milan(?) can be an HPC beast with:
  • 48C/192T Zen CPU cores.
  • 4 custom Instinct GPUs.
  • 2 AI accelerator ASICs.
  • 1 I/O chiplet with HBM 3D stacking.


EDIT: I see that there was already a great article on such an HPC APU design:
https://www.overclock.net/forum/225-...lops-200w.html
You can see the EPYC PCB design in "Figure 2. Exascale Heterogeneous Processor (EHP) ".

So after reading some of it I changed my illustration:
No CPU chiplet on top of I/O: [Took Vega Pro 20CU image, placed the HBM on top and shrank it to 7nm+ level]
IMO the GPUs could take ~150W + 75W~100W for the rest (CPU + I/O) = up to around 250W TDP.


Or CPU chiplets on top of I/O: it can still be 14nm or 7nm, but it's going to stay a large chiplet anyway so the CPU chiplets can be placed on top.



And 8 Milans could be installed in Cray’s Shasta 1U with Direct Liquid Cooling:

https://www.anandtech.com/show/13616...liquid-cooling
 