AMD has announced a major win with the US Department of Energy (DOE) and supercomputer specialist Cray, which will see the latter build a 1.5-exaflop supercomputer out of the former's Epyc CPU and Radeon Instinct GPU products.

High-performance computing (HPC) is a lucrative market, with governments and well-heeled corporations the world over spending billions to fight their way to the top of the TOP500 list of the world's most powerful supercomputers. As of the last list, November 2018, the US Department of Energy held the top two places with a pair of systems built around IBM's POWER9 processors and Nvidia Volta GV100 GPGPU accelerators - but its new partnership will see its next design switching to AMD for both CPU and GPU compute.

Dubbed Frontier, the supercomputer designed by Cray is to use customised AMD Epyc CPUs and Radeon Instinct GPU accelerators to provide compute performance claimed to be in excess of 1.5 exaflops - roughly equivalent, if that target is reached, to the combined performance of the top 160 supercomputers in the world.

'AMD is proud to partner with Cray and ORNL [Oak Ridge National Laboratory, where the system is to be installed] to deliver what is expected to be the world's most powerful supercomputer,' crows Forrest Norrod, senior vice president and general manager for AMD's Datacentre and Embedded Systems Group. 'Frontier will feature custom CPU and GPU technology from AMD and represents the latest achievement on a long list of technology innovations AMD has contributed to the Department of Energy exascale programs.'

'We are excited to work with the team at AMD to deliver the Frontier system to Oak Ridge National Laboratory,' adds Steve Scott, senior vice president and chief technology officer at Cray. 'Cray's Shasta supercomputers are designed to support leading edge processor technologies and high-performance storage, all tightly interconnected by Cray's new Slingshot network. The combination of Cray and AMD technology in the Frontier system will dramatically enhance performance at scale for AI, analytics, and simulation, enabling DOE to further push the boundaries of scientific discovery.'

It's not the first time AMD has partnered with the DOE - its components were found in 2005's Jaguar and 2012's Titan supercomputers - but it has been losing ground of late. Titan, which uses AMD's Opteron 6274 processors alongside Nvidia's K20X accelerators, sits in ninth place in the November 2018 TOP500 list with 17.6 petaflops of sustained compute. Above it are systems largely powered by rival Intel's Xeon chips and Nvidia accelerators, though the third-place Sunway TaihuLight uses Sunway SW26010 260-core processors, while the DOE's Sierra and Summit take second and first place respectively, their IBM POWER9 chips and Nvidia Volta accelerators offering 94.6 and 143.5 petaflops.
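To put those figures in perspective, the short Python sketch below divides Frontier's claimed 1.5 exaflops by the sustained TOP500 results quoted above. It is a back-of-the-envelope comparison only: the Frontier number is a vendor claim rather than a measured result, so the two are not strictly like for like, and the names `FRONTIER_CLAIMED_PFLOPS`, `SUSTAINED_PFLOPS`, and `speedup_over` are made up for illustration.

```python
# Figures as quoted in the article. Frontier's is a claimed target; the others
# are sustained results from the November 2018 TOP500 list.
FRONTIER_CLAIMED_PFLOPS = 1.5 * 1000  # 1.5 exaflops expressed in petaflops

SUSTAINED_PFLOPS = {
    "Summit": 143.5,
    "Sierra": 94.6,
    "Titan": 17.6,
}


def speedup_over(name):
    """Return how many times faster Frontier's claimed figure is than `name`."""
    return FRONTIER_CLAIMED_PFLOPS / SUSTAINED_PFLOPS[name]


if __name__ == "__main__":
    for name in SUSTAINED_PFLOPS:
        print(f"Frontier (claimed) vs {name}: about {speedup_over(name):.1f}x")
```

On these numbers, Frontier's claimed performance works out to roughly ten times that of Summit, the current list leader.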

As well as the customised Epyc and Radeon Instinct processors, claimed to be optimised for general HPC and artificial intelligence (AI) workloads, AMD is also providing a custom high-bandwidth, low-latency variant of its Infinity Fabric interconnect to link four Radeon Instinct GPUs to each Epyc CPU, along with an enhanced version of the ROCm programming environment developed in partnership with Cray.

The system is expected to go live in 2021.
