This section is shorter than we would normally write for a major architectural launch, primarily because ill-timed holidays meant we missed the key in-person technical presentations that are vital for deepening our understanding beyond presentation slides. Nevertheless, we wanted to highlight the key features of RDNA as a grounding for understanding where it’s different and why.
The Compute Unit remains the key functional unit in RDNA (the closest Nvidia equivalent is its Streaming Multiprocessor), but it has been redesigned significantly compared to GCN. By doubling the scheduler count from one to two, widening the SIMD (single-instruction, multiple-data) units, and allowing Compute Units to be pooled together in pairs to form a Workgroup Processor, AMD is able to distribute more varied workloads in a more efficient manner (i.e. using more of the execution units at once).
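To give a rough sense of why wider SIMD units help, the sketch below counts how many cycles it takes to issue one instruction for a whole wavefront. The figures are assumptions drawn from AMD's public RDNA material rather than from the slides discussed here: GCN running 64-wide wavefronts on 16-lane SIMDs, and RDNA running 32-wide wavefronts on 32-lane SIMDs. The function name `issue_cycles` is purely illustrative.

```python
import math


def issue_cycles(wavefront_size: int, simd_width: int) -> int:
    """Cycles needed to push one instruction through every lane of a wavefront."""
    return math.ceil(wavefront_size / simd_width)


# Assumed figures (not from this article): GCN = wave64 on SIMD16,
# RDNA = wave32 on SIMD32.
gcn_cycles = issue_cycles(wavefront_size=64, simd_width=16)
rdna_cycles = issue_cycles(wavefront_size=32, simd_width=32)

print(f"GCN:  {gcn_cycles} cycles per wavefront instruction")
print(f"RDNA: {rdna_cycles} cycle per wavefront instruction")
```

Under those assumptions, a dependent chain of instructions gets through an RDNA SIMD in fewer cycles, which is one plausible reading of AMD's claim about using more of the execution units at once.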
The second key change is that AMD has also implemented a new multilevel cache hierarchy for Navi, reducing latency and power consumption across the die and also reducing the strain on the L2 cache. It has also increased the number of areas throughout the chip that can read and write colour-compressed data, and the compression algorithm itself has been improved, further optimising efficiency.
AMD’s Asynchronous Compute capabilities have also improved, allowing faster execution of high-priority compute tasks.
Learning from its Zen CPU architecture, AMD has also been able to optimise Navi in terms of clock speeds and clock gating. This is one of the key pillars of RDNA’s extra performance per watt credentials alongside the gains made from switching to the 7nm process and the performance per clock gains achieved in part by the techniques outlined above.
While Navi has nothing equivalent to Nvidia’s RT Cores or Tensor Cores, a slide from AMD’s Next Horizon event does imply that selected real-time ray traced lighting effects will be supported in the next generation of RDNA products. The degree of support and how it’s achieved very much remain to be seen, but it’s clear that the gaming industry is moving towards more ray tracing.
Feature-wise, the Radeon RX 5700 XT includes support for PCIe 4.0, though it is entirely backwards-compatible with PCIe 3.0 motherboards. As of today, AMD X570 motherboards are the only ones to support the new, faster standard (PCIe 4.0 offers a doubling of bandwidth), although we are not expecting GPUs to face many, if any, bandwidth bottlenecks in games these days, as a 16-lane PCIe 3.0 slot provides plenty as it is.
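For a sense of scale, the doubling can be worked through with a quick calculation. This is a minimal sketch using the published per-lane transfer rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and the 128b/130b encoding both generations use; it gives theoretical one-way figures only and ignores protocol overhead, so real-world throughput will be lower. The function name is illustrative.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s: float, lanes: int = 16) -> float:
    """Approximate theoretical one-way bandwidth in GB/s for a PCIe 3.0/4.0 link."""
    encoding_efficiency = 128 / 130  # 128b/130b line encoding (PCIe 3.0 and 4.0)
    bits_per_second = transfer_rate_gt_s * 1e9 * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9  # bits -> bytes -> gigabytes


pcie3_x16 = pcie_bandwidth_gb_s(8.0)   # PCIe 3.0: 8 GT/s per lane
pcie4_x16 = pcie_bandwidth_gb_s(16.0)  # PCIe 4.0: 16 GT/s per lane

print(f"PCIe 3.0 x16: {pcie3_x16:.2f} GB/s")
print(f"PCIe 4.0 x16: {pcie4_x16:.2f} GB/s")
```

That works out to roughly 15.75GB/s each way for a PCIe 3.0 x16 slot and about 31.5GB/s for PCIe 4.0 x16.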
The Navi 10 GPU also benefits from the latest implementations of AMD’s Radeon Display Engine and the Radeon Multimedia Engine. The slides showing what these bring to the table are pretty self-explanatory, so we’ll let them do the talking above.
September 22 2021 | 09:03