Tuesday, May 7th 2019
AMD Collaborates with US DOE to Deliver the Frontier Supercomputer
The U.S. Department of Energy today announced a contract with Cray Inc. to build the Frontier supercomputer at Oak Ridge National Laboratory, which is anticipated to debut in 2021 as the world's most powerful computer with a performance of greater than 1.5 exaflops.
Scheduled for delivery in 2021, Frontier will accelerate innovation in science and technology and maintain U.S. leadership in high-performance computing and artificial intelligence. The total contract award is valued at more than $600 million for the system and technology development. The system will be based on Cray's new Shasta architecture and Slingshot interconnect and will feature high-performance AMD EPYC CPU and AMD Radeon Instinct GPU technology.
By solving calculations up to 50 times faster than today's top supercomputers, exceeding a quintillion (10^18) calculations per second, Frontier will enable researchers to deliver breakthroughs in scientific discovery, energy assurance, economic competitiveness, and national security. As a second-generation AI system, following the world-leading Summit system deployed at ORNL in 2018, Frontier will provide new capabilities for deep learning, machine learning and data analytics for applications ranging from manufacturing to human health.
"Frontier's record-breaking performance will ensure our country's ability to lead the world in science that improves the lives and economic prosperity of all Americans and the entire world," said U.S. Secretary of Energy Rick Perry. "Frontier will accelerate innovation in AI by giving American researchers world-class data and computing resources to ensure the next great inventions are made in the United States."
Since 2005, Oak Ridge National Laboratory has deployed Jaguar, Titan, and Summit, each the world's fastest computer in its time. The combination of traditional processors with graphics processing units to accelerate the performance of leadership-class scientific supercomputers is an approach pioneered by ORNL and its partners and successfully demonstrated through ORNL's No.1 ranked Titan and Summit supercomputers.
"ORNL's vision is to sustain the nation's preeminence in science and technology by developing and deploying leadership computing for research and innovation at an unprecedented scale," said ORNL Director Thomas Zacharia. "Frontier follows the well-established computing path charted by ORNL and its partners that will provide the research community with an exascale system ready for science on day one."
Researchers with DOE's Exascale Computing Project are developing exascale scientific applications today on ORNL's 200-petaflop Summit system and will seamlessly transition their scientific applications to Frontier in 2021. In addition, the lab's Center for Accelerated Application Readiness is now accepting proposals from scientists to prepare their codes to run on Frontier.
Researchers will harness Frontier's powerful architecture to advance science in such applications as systems biology, materials science, energy production, additive manufacturing and health data science. Visit the Frontier website to learn more about what researchers plan to accomplish in these and other scientific fields.
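For a sense of scale, the two headline figures quoted in this announcement (Frontier at more than 1.5 exaflops, Summit at 200 petaflops) can be put into the same unit. The Python sketch below is only illustrative arithmetic on those quoted numbers; the variable names are ours, and it does not attempt to reproduce the "up to 50 times faster" claim, which the announcement does not break down.

```python
# Illustrative arithmetic only, using figures quoted in the announcement.
# Variable names are hypothetical and chosen for this sketch.

PFLOPS_PER_EXAFLOP = 1000  # 1 exaflop = 10^18 FLOPS = 1000 petaflops (10^15 FLOPS)

# "greater than 1.5 exaflops" is treated here as a lower bound of 1500 PFLOPS.
frontier_pflops = 1500
summit_pflops = 200  # "ORNL's 200-petaflop Summit system"

# Ratio of the two quoted headline figures.
ratio = frontier_pflops / summit_pflops

print(f"Frontier (quoted lower bound): {frontier_pflops / PFLOPS_PER_EXAFLOP} exaflops")
print(f"Frontier / Summit, quoted figures: at least {ratio:.1f}x")
```

This compares only the quoted headline numbers; measured performance on real applications can differ substantially from such ratios.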
Frontier will offer best-in-class traditional scientific modeling and simulation capabilities while also leading the world in artificial intelligence and data analytics. Closely integrating artificial intelligence with data analytics and modeling and simulation will drastically reduce the time to discovery by automatically recognizing patterns in data and guiding simulations beyond the limits of traditional approaches.
"We are honored to be part of this historic moment as we embark on supporting extreme-scale scientific endeavors to deliver the next U.S. exascale supercomputer to the Department of Energy and ORNL," said Peter Ungaro, president and CEO of Cray. "Frontier will incorporate foundational new technologies from Cray and AMD that will enable the new exascale era, characterized by data-intensive workloads and the convergence of modeling, simulation, analytics, and AI for scientific discovery, engineering and digital transformation."
Frontier will incorporate several novel technologies co-designed specifically to deliver a balanced scientific capability for the user community. The system will be composed of more than 100 Cray Shasta cabinets with high-density compute blades powered by HPC- and AI-optimized AMD EPYC processors and Radeon Instinct GPU accelerators purpose-built for the needs of exascale computing. The new accelerator-centric compute blades will support a 4:1 GPU-to-CPU ratio with high-speed AMD Infinity Fabric links and coherent memory between them within the node. Each node will have one Cray Slingshot interconnect network port for every GPU, with streamlined communication between the GPUs and network to enable optimal performance for high-performance computing and AI workloads at exascale.
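The node layout described above (a 4:1 GPU-to-CPU ratio and one Slingshot network port per GPU) can be sketched as a small data model. This is a minimal illustration, not a description of the actual hardware: the class name, field names, and the default of one CPU per node are assumptions made for the sketch, since the announcement gives only the ratios.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSketch:
    """Hypothetical model of one compute node, built only from the stated ratios."""

    cpus: int = 1               # ASSUMPTION: CPUs per node is not stated in the announcement
    gpus_per_cpu: int = 4       # from the announcement: 4:1 GPU-to-CPU ratio
    nic_ports_per_gpu: int = 1  # from the announcement: one Slingshot port per GPU

    @property
    def gpus(self) -> int:
        # Total GPUs follow directly from the GPU-to-CPU ratio.
        return self.cpus * self.gpus_per_cpu

    @property
    def nic_ports(self) -> int:
        # One network port for every GPU in the node.
        return self.gpus * self.nic_ports_per_gpu


node = NodeSketch()
print(f"CPUs: {node.cpus}, GPUs: {node.gpus}, Slingshot ports: {node.nic_ports}")
```

Under the one-CPU assumption this gives four GPUs and four network ports per node; changing `cpus` scales both counts while keeping the stated ratios fixed.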
To make this performance easy for developers to use, Cray and AMD are co-designing and developing enhanced GPU programming tools optimized for performance, productivity and portability. This will include new capabilities in the Cray Programming Environment and AMD's ROCm open compute platform, which will be integrated into the Cray Shasta software stack for Frontier.
"AMD is proud to be working with Cray, Oak Ridge National Laboratory and the Department of Energy to push the boundaries of high performance computing with Frontier," said Lisa Su, AMD president and CEO. "Today's announcement represents the power of collaboration between private industry and public research institutions to deliver groundbreaking innovations that scientists can use to solve some of the world's biggest problems."
Frontier leverages a decade of exascale technology investments by DOE. The contract award includes technology development funding, a center of excellence, several early-delivery systems, the main Frontier system, and multi-year systems support. The Frontier system is expected to be delivered in 2021, and acceptance is anticipated in 2022.
Frontier will be part of the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility. ORNL is managed by UT-Battelle for DOE's Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE's Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit DOE's website.
47 Comments on AMD Collaborates with US DOE to Deliver the Frontier Supercomputer
Grats to AMD for the contract !
It was about dam' time to get them into the driver seat... even if only in some areas of the market.
www.anandtech.com/show/14302/us-dept-of-energy-announces-frontier-supercomputer-cray-and-amd-1-5-exaflops
Summit (2018): IBM/Nvidia, 200 PFLOPS
Aurora (2021): Intel only (both CPU & GPU), 1000 PFLOPS
Frontier (2021): AMD only (both CPU & GPU), 1500 PFLOPS
AMD, nVidia, Intel , PowerPC... my tax money at work.
31MW of power. Talking about "global warming"? :laugh:
Also AMD is only saying that the GPUs are “based on the Radeon Instinct family” and have “yet to be announced."
Far be it from me to SUPPORT Trump, but I think him boosting funding to projects like this is something he accidentally got right.
As an EE, I would love to be on the design team for the support building and utilities - power, HVAC, water... We are used to seeing something like max 30kW per cabinet :)
And I would rather have the tax money spent here in the US than in rebuilding failed countries.
Well played.
Maybe Cray will help AMD write a proper API. Or port CUDA...
"And as the principal processor provider, AMD will also be taking on a lot of the responsibility for developing the software stack as well, with the company working with Cray to develop an enhanced version of their ROCm environment to best extract performance from the massive cluster of CPUs and GPUs."
These supercomputers are used by researchers. You have a project, you apply for access and they decide whether you're worthy or not. ;-)
It seems like going for Nvidia GPUs would be more flexible. Also, the majority of their clusters use Nvidia GPUs already.
Suddenly ORNL ordered 2 supercomputers with GPUs made by Intel and AMD. It's slightly surprising - that's all.
Whether this is cheaper or not - I have no idea. Maybe they simply wanted a customized GPU, in which case AMD is an easier partner. Once again: this is an all-round cluster, not built for a particular task. So the "additional hardware" is a plus. Especially when it's made for machine learning (it's quite popular, really :-P).
Anyway, both V100 and P100 are still offered by Nvidia. I'm not sure about K80 - maybe it's limited to existing clients.
AMD, hats off to you sir/maam :)
So effectively everyone. There is also the other versatility, you know, the one of being able to operate on more platforms and with more software (open vs closed drivers). I don't disagree. That's why I suspect a strange software stack that needs an open source driver for some part of the solution. I don't see another justification.
It is all there in fine print.
*Walks away confused*
If you're moving to a platform with a different API, you have to rewrite everything. It doesn't matter if it's open or closed.
Whoever already has access to existing Nvidia clusters will likely stay there (especially for AI-related computing). New users will be moved to Frontier.
People have been using CUDA for a decade. It's the de facto standard.
Sure, I'd rather have something market wide in case Thanos snaps fingers and we're unlucky enough to lose the whole Nvidia team. But this standard should be CUDA. It's excellent. And everyone already uses it.
AMD and Intel should simply pay Nvidia and port it instead of wasting money on developing alternatives.
Anyway, we're going to see one more exa cluster announcement in the USA (for LLNL). One went to Intel, one to AMD. Maybe that was the idea: provide 3 different architectures.
Least resistance. It is arguably easier to port a driver than pay for a new closed one to integrate with a moving target (OSS kernel).
I do advocate open source, but only when it's actually helpful. Their build design choices suggest it must be, or they'd probably have gone with a CUDA-based system for the versatility of the end users running code.