Intel Core i7 - Nehalem Architecture Dive

November 3, 2008 | 05:57

Tags: #45nm #analysis #architecture #core #cpu #detail #discussion #i7 #inside #lga1366 #mode #nehalem #qpi #quad #smt #turbo

Companies: #intel

DDR3 from here forward

Yes, Intel has gone DDR3 only with Core i7 - there will be no DDR2 support included in any product, so it's upgrades all round. Given the price of DDR2 has been "stock market" low for well over a year now, we expect the memory companies to keep DDR3 at a reasonable price premium as they anticipate new demand from Core i7.

Nehalem uses triple-channel DDR3 so expect to see at least three memory slots, but more likely six, on X58 motherboards from here on out. Populating all three channels isn't necessary - the memory controller can still work in single and dual channel modes, albeit obviously at reduced bandwidth.
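
To put rough numbers on that reduced bandwidth, here's a minimal sketch of the theoretical peak for one, two and three populated channels. It assumes the standard 64-bit (8-byte) DDR3 channel width and treats the "MHz" figures quoted here as effective transfer rates; the function name is our own, and real-world throughput will land below these peaks.

```python
# Theoretical peak DDR3 bandwidth for a given number of populated channels.
# Assumption: each DDR3 channel is 64 bits (8 bytes) wide, and the quoted
# "MHz" figure is the effective transfer rate in mega-transfers per second.

BYTES_PER_TRANSFER = 8  # 64-bit channel


def peak_bandwidth_gb_s(channels, mega_transfers_per_sec):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1000 MB)."""
    return channels * BYTES_PER_TRANSFER * mega_transfers_per_sec / 1000


if __name__ == "__main__":
    for channels in (1, 2, 3):
        bandwidth = peak_bandwidth_gb_s(channels, 1066)
        print(f"{channels} channel(s) at 1,066MHz: {bandwidth:.1f} GB/s")
```

Each extra channel adds the same slice of peak bandwidth, which is why dropping from three populated channels to two costs a third of the theoretical maximum.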

What's more, Intel allows any mixture of DIMMs to be used in the channels and the memory controller is smart enough to logically reorder itself into the most efficient operating parameters. The one condition is that slot 0 must be populated before slot 1 - these will usually be clearly identified by colour.

Intel rates its memory controller to work at just 800MHz or 1,066MHz - that's a little low by current standards, with some boards pushing the boundaries of 2,000MHz. Intel is almost "relaunching" DDR3 as it attempts to rein in the large overvoltages that have been used to achieve super high MHz up until now. By using lower voltages, it can control the power consumption of the system better, and it compensates for the MHz difference with far more efficient access - essentially halving the latency.


While we were fearful about the memory overclockability of Core i7 CPUs, we've found this not to be the case - we've hit 1,600MHz without even breaking a sweat on pre-production BIOSes, and companies are already launching triple channel memory kits that hit 2,000MHz at a record low 1.65V. You can push the memory voltage, but Intel performance guru François Piednoel was keen to stress that the memory voltage must be kept within 0.5V of the CPU voltage.

With CPU stock voltages at sub 1.2V, this only allows an upper safe maximum of ~1.65-1.7V. However, increase the CPU voltage to 1.4 or 1.5V (or more) with some extreme cooling (remember these CPUs are rated at a 130W TDP at ~1.2V) and you give yourself more breathing room, up to 1.9-2.0V on the memory.
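
The arithmetic behind those figures is simple enough to sketch. The snippet below just encodes the 0.5V guideline as quoted; the function names are our own, and it's a rule-of-thumb check rather than anything Intel ships.

```python
# Rule-of-thumb check for the 0.5V guideline described above.
# Assumption: the rule is simply "memory voltage minus CPU voltage must not
# exceed 0.5V". The names below are illustrative, not from any Intel tool.

MAX_DELTA_V = 0.5


def max_memory_voltage(cpu_voltage):
    """Highest memory voltage that stays within the guideline."""
    return round(cpu_voltage + MAX_DELTA_V, 2)


def within_guideline(cpu_voltage, memory_voltage):
    """True if the memory voltage is no more than 0.5V above the CPU voltage."""
    # Small tolerance so values like 1.7 - 1.2 are not rejected by float error.
    return memory_voltage - cpu_voltage <= MAX_DELTA_V + 1e-9


if __name__ == "__main__":
    for cpu_v in (1.2, 1.4, 1.5):
        print(f"CPU at {cpu_v:.2f}V -> memory up to {max_memory_voltage(cpu_v):.2f}V")
```

Feeding in 1.2V, 1.4V and 1.5V gives ceilings of 1.7V, 1.9V and 2.0V respectively, which lines up with the ranges quoted above.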

That's not to say that exceeding the 0.5V rule will instantly kill a CPU, but it's like the effect of excessive VTT or PLL voltages on current Penryn CPUs - don't be surprised if they die without warning much sooner than you expect. By that rationale, AMD's Athlon 64s suffered the same fate with DDR2, but the effect was often exacerbated because the initial voltage difference was that much greater to begin with, and greatly overvolting the memory to 2.3+V made it that much worse.

On the positive side though - if you consider that 1,066MHz triple channel DDR3 gives you 30 percent more bandwidth than 1,600MHz dual channel DDR3 on Penryn, imagine what kind of memory performance you get from very low latency, 1,600+MHz triple channel DDR3 on Nehalem? In a word, unparalleled.

Voltage Regulator Down 11.1 and Power Control

Intel continually updates its VRD specifications. VRD11.1 was introduced on a limited number of Core 2 CPUs with the latest E0 stepping, and a few existing LGA775 motherboards already support it. For all Core i7 processors and motherboards, it's now standard.

This enables more low power states like C3 and C6 in addition to the usual C1. These states are already available on mobile Merom CPUs, but for the first time are being fully transplanted to desktop CPUs. Here's a brief rundown of what each state does:
  • C0 is the active state in which everything is running at full capacity.

  • C1 has the core clock turned off, as well as a slightly reduced core voltage. However the motherboard power lines are kept alive and the data cache is kept intact. This means that the performance isn’t compromised and the wake-up time is extremely fast - the C1E state is what currently features on all Core 2 CPUs.

  • C3 has the same core voltage drop as the C1 state, but also turns off the PLLs and flushes the L1 and L2 caches before switching them off, so their contents are lost. The consequence of turning more off is a longer wake-up time, and Intel’s chart shows only a modest further drop in idle power compared with the step from C0 to C1.

  • C6, the final power down state, is an almost complete shutdown of the CPU. There is a significant drop in core voltage and everything is now switched off except the L3 cache. The obvious downside to this state is that the resume time will be greater, but the power saving is huge.
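
For a quick side-by-side, the four states can be written down as a small table in code. This is only a sketch of the descriptions above: the field names and the `deepest_state` helper are our own, and the voltage column is deliberately qualitative because no figures are quoted here.

```python
# A compact model of the C-states as described in the list above.
# Assumption: field names and the helper are illustrative only, and the
# entries simply restate the descriptions rather than any Intel datasheet.
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    core_clock_running: bool
    plls_running: bool
    core_caches_retained: bool  # are L1/L2 contents kept?
    core_voltage: str           # qualitative only


# Ordered from shallowest (fastest wake-up) to deepest (biggest saving).
C_STATES = (
    CState("C0", True, True, True, "full"),
    CState("C1", False, True, True, "slightly reduced"),
    CState("C3", False, False, False, "slightly reduced"),
    CState("C6", False, False, False, "significantly reduced"),
)


def deepest_state(require_warm_caches):
    """Return the deepest state, optionally limited to those keeping L1/L2 data."""
    candidates = [
        state for state in C_STATES
        if state.core_caches_retained or not require_warm_caches
    ]
    return candidates[-1]


if __name__ == "__main__":
    print(deepest_state(require_warm_caches=True).name)
    print(deepest_state(require_warm_caches=False).name)
```

The trade-off reads straight off the table: the deeper you go, the more is switched off and the longer the core takes to wake.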

While Intel includes a Clock Gate in all its current CPUs that shuts off switching power to idle logic, with Nehalem it's gone one further with a Power Gate. Together with a new, proprietary microcontroller (the Power Control Unit, or PCU), this shuts off both switching power and leakage power and allows core-independent power states, rather than the die-level power states that previously limited Intel CPUs.

Shifting the control from external hardware to embedded firmware allows uniform control no matter what motherboard is bought and more finesse in power control, and it's completely invisible to software so it doesn't use system resources. Having said that, some settings can be turned off in the motherboard BIOS (should the option be made available).

Tying real-time temperature and current/power monitoring to resource use enables the processor to manipulate its idle states according to whether a thread needs a resource or not. Intel claims that because the PCU also monitors the interrupt rate and recognises that some workloads with low CPU utilisation still need performance, the processor will use a lower (shallower) C-state to ensure that resources don't take too long to kick in when required.
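
As a purely illustrative sketch of that idea, the snippet below picks an idle state from the interrupt rate alone. The thresholds and the function are hypothetical - Intel hasn't published the PCU's actual policy, and the real firmware also weighs temperature and power data.

```python
# Toy version of the "busy interrupts mean shallower sleep" idea above.
# Assumption: both thresholds are made-up values for illustration only and
# do not reflect the PCU firmware's real policy or inputs.

HIGH_INTERRUPT_RATE = 5000.0  # interrupts per second (hypothetical)
LOW_INTERRUPT_RATE = 500.0    # interrupts per second (hypothetical)


def pick_idle_state(interrupts_per_sec):
    """Choose an idle C-state for a core that currently has no work."""
    if interrupts_per_sec >= HIGH_INTERRUPT_RATE:
        # Frequent wake-ups expected: stay shallow so resuming is fast.
        return "C1"
    if interrupts_per_sec >= LOW_INTERRUPT_RATE:
        # Occasional wake-ups: accept a longer resume for more saving.
        return "C3"
    # Rarely disturbed: take the deepest state and the biggest saving.
    return "C6"


if __name__ == "__main__":
    for rate in (10000.0, 1000.0, 50.0):
        print(f"{rate:>8.0f} interrupts/s -> {pick_idle_state(rate)}")
```

The point is only that utilisation on its own isn't the deciding factor: a lightly loaded but interrupt-heavy workload is kept in a state that wakes quickly.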