AnandTech: The Quest for More Processing Power, Part One: “Is the single core CPU doomed?”. _For the old computer system engineer in me, I just love AnandTech. They don’t pull any punches in explaining what is going on. Here is a great piece about why performance gains are slowing down so much._
In November 2002, Intel was well ahead of the competition with the introduction of a 3.06 GHz Pentium 4. Intel had doubled the clock speed of its latest x86 architecture within two years, which was quite an accomplishment.
Two and a half years later, Intel’s Pentium 4 is running at 3.8 GHz, which means that clock speed has increased by only about 25% and performance by even less.
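To put those two eras side by side, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (a doubling in two years, then 3.06 GHz to 3.8 GHz in two and a half years). The helper name `annual_growth` is just a label for this illustration, and the per-year rates are my own rough compounding estimate, not numbers from the AnandTech piece.

```python
def annual_growth(ratio, years):
    """Compound yearly growth rate implied by an overall speed ratio."""
    return ratio ** (1.0 / years) - 1.0

# Overall clock gain going from 3.06 GHz to 3.8 GHz
recent_ratio = 3.8 / 3.06

# Earlier era: clock speed doubled in two years
earlier_rate = annual_growth(2.0, 2.0)
# Recent era: 3.06 GHz -> 3.8 GHz in two and a half years
recent_rate = annual_growth(recent_ratio, 2.5)

print(f"overall gain: {recent_ratio - 1:.0%}")
print(f"earlier: {earlier_rate:.0%} per year, recent: {recent_rate:.0%} per year")
```

Run as written, this works out to roughly 41% a year in the earlier stretch versus roughly 9% a year in the recent one, which is the slowdown in a nutshell.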
Here are some reasons:
* Leakage Power. As chips have shrunk, there is more “tunnelling” as current leaks from one side of the transistor to the other. The charts are amazing, showing that at 250nm, leakage is just 1% while at 90nm, leakage is 40% of the power.
* Wire Delay. Basically, as transistors shrink and switch faster, the wires between them stay at the same speed. At some point, most of the time is spent just moving signals along the wires between nearly infinitely fast switches. Reminds me of the floating bridge delays here in Seattle. No matter how often the traffic lights switch, get on the 4-lane SR-520 and wait.
* Memory Wall. Although processors have gotten faster, memory is about the same speed. So no matter how fast you switch, you wait for memory.
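The memory wall point can be sketched with a toy model. Everything in it is an assumption made for illustration: the function name, the one cycle per instruction baseline, the miss rate of one per hundred instructions, and the 100 ns memory latency are made-up round numbers, not measurements from the article. The only thing it is meant to show is that the memory term does not shrink when the clock goes up.

```python
def time_per_instruction_ns(clock_ghz,
                            base_cpi=1.0,                 # assumed cycles per instruction with no misses
                            misses_per_instruction=0.01,  # assumed: 1 miss per 100 instructions
                            memory_latency_ns=100.0):     # assumed fixed memory latency
    """Average time per instruction: a clock-dependent part plus a fixed memory stall."""
    compute_ns = base_cpi / clock_ghz  # shrinks as the clock rises
    stall_ns = misses_per_instruction * memory_latency_ns  # does not depend on the clock
    return compute_ns + stall_ns

slow = time_per_instruction_ns(3.06)
fast = time_per_instruction_ns(3.8)
speedup = slow / fast

print(f"clock ratio: {3.8 / 3.06:.2f}x, modeled speedup: {speedup:.2f}x")
```

With these made-up inputs the clock rises by about 24% but the modeled speedup is only about 5%, because most of the time per instruction is spent waiting on memory.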
The net is that a single processor just doesn’t get that much faster, so folks are talking about multi-core so you get more than one processor. The big issue of course is that it is really hard to optimize applications to be multiprocessor oriented. In many cases, the applications themselves are intrinsically serialized.
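The usual way to put numbers on "intrinsically serialized" is Amdahl's law. The paragraph above does not name it, so treat this as my own gloss: if only a fraction of the work can be spread across cores, the serial remainder caps the speedup. The function name and the 50% figure below are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Assumed example: an application where half the work is inherently serial
for cores in (1, 2, 4, 8):
    print(f"{cores} cores: {amdahl_speedup(0.5, cores):.2f}x")
```

With half the work serial, two cores give about 1.33x, and no number of cores can ever get past 2x. A fully serial application (`parallel_fraction` of 0) gets nothing from the second core at all.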
There are a bunch of semiconductor technical solutions for this:
* Leakage Current. AMD moved to silicon-on-insulator with the Athlon 64. Intel, when it switches to 45nm, will use a high-K dielectric in its transistors, shrinking leakage by 100x. Strained silicon, which AMD will introduce with the “E0” stepping, is another example.
* Wire Delay. The big change is more layers of wires and moving from aluminum to copper interconnects.
The most interesting analysis is why the Prescott is “failing”: it was supposed to go to 5GHz and can just barely get to 3.8GHz. Basically, by going to a deep pipeline of 38 stages, they introduced so much more logic in branch prediction and fast adder units that leakage power increased, defeating the whole purpose.
The final conclusion is that this doesn’t mean dual core is going to be faster, and that in fact a better-implemented single core can still win the day. Interesting to debate over a beer. 🙂