GDDR5X and Fourth Gen Delta Colour Compression
While the Tesla P100 uses HBM2 memory instead of GDDR5, this is a very new technology meaning that volume, implementation and cost are all issues to deal with. For its first consumer Pascal part, Nvidia opted to stick with GDDR5, or more specifically a new generation of GDDR5 that it has co-developed with Micron called GDDR5X.
Instead of widening the GP104's interface with additional memory controllers that take up more transistors and die space, the focus was on making GDDR5X itself faster. In terms of architecture and logic, GDDR5X is unchanged, but it now runs with 10Gbps transfer rates compared to 7Gbps as a maximum previously – a 43 percent upshot. To ensure that the memory system could cope with such speeds, a new IO circuit architecture was developed. Nvidia also paid very close attention to the board channel design, i.e. the physical bridge between the GPU and the GDDR5X dies. Nvidia claims to have studied every single channel, in fact, to minimise issues like channel loss and crosstalk, and says that the lessons learnt can be applied to future GDDR5 products as well.
Click to enlarge
With a 256-bit memory interface operating at a 10Gbps transfer rate, Nvidia has achieved a 320GB/sec figure for available bandwidth. This is close to what the GTX 980 Ti and GTX Titan X achieved over a fatter 384-bit bus with 7Gbps memory (336GB/sec). With the higher clock speed, you might logically assume considerably higher power consumption, but GDDR5X has one final trick up its sleeve; it operates at 1.35V rather than 1.5V. As such, power consumption remains equal to what it was despite the significant speed bump.
Click to enlarge
To make even further use of the available memory bandwdith, the GTX 1080 introduces Nvidia's fourth generation delta colour compression. This is a lossless means of compressing pixel colour data, thus freeing up more memory capacity and bandwidth. It works by calculating the differences between pixels within a pixel block and then storing the block as a set of reference pixels and the delta values (the deviation from the reference). Improvements in the fourth generation of this algorithm include: enhancements to 2:1 delta compression, where the stored block is at least half the size of the original; a new 4:1 delta compression, where colour differences are small enough for the stored block to be just a quarter the original size; and a new 8:1 compression mode, which merges 4:1 constant colour compression of 2x2 pixel blocks (where all four pixels have the same colour) with a 2:1 compression of the deltas between those blocks.
Click to enlarge
The Project CARS screenshots here (supplied by Nvidia) show the new effectiveness of Pascal's colour compression – the bits in pink indicate the pixels that were successfully compressed, and the improvement with Pascal is clear. Nvidia says that, on average, memory bandwidth is improved over Maxwell by roughly 20 percent as a result of the new compression techniques. Coupled with the ~1.4x raw throughput increase in moving to 10Gbps, the firm says that Pascal's true effective bandwidth is about 1.7x higher now.
Want to comment? Please log in.