"IBM POWER7 SPECfp_rate2006: Poor Scaling? Or Configuration Confusion?" is a blog post by John Henning that casts a skeptical eye on certain aspects of the SPEC benchmark results posted for the IBM POWER7. His overall conclusion? "Scaling POWER7 from 2 to 4 chips is not impressive." He writes that, "As of 23-Feb-2010, IBM's best published 2-chip result and best 4-chip result for SPECfp_rate2006 are, respectively, 586 and 851. The scaling from 2 chips to 4 chips is less than 1.5x (851/586=1.452)."
Henning presents a table showing that the 4-chip system, with twice as many cores, runs at a slightly lower clock speed (MHz) but otherwise appears to offer twice the capability of the 2-chip system.
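The scaling arithmetic Henning quotes is easy to reproduce. The sketch below uses only the two published scores cited above (586 and 851); the function names are illustrative, and the "efficiency" figure is simply the measured scaling divided by the ideal 2x that doubling the chip count would suggest.

```python
def scaling_factor(base_score, scaled_score):
    """Ratio of the larger system's score to the smaller system's score."""
    return scaled_score / base_score


def scaling_efficiency(base_score, scaled_score, base_chips, scaled_chips):
    """Measured scaling as a fraction of ideal (linear) scaling."""
    ideal = scaled_chips / base_chips
    return scaling_factor(base_score, scaled_score) / ideal


# Best published SPECfp_rate2006 results as of 23-Feb-2010, per the post.
two_chip = 586.0
four_chip = 851.0

factor = scaling_factor(two_chip, four_chip)
efficiency = scaling_efficiency(two_chip, four_chip, base_chips=2, scaled_chips=4)

print(f"scaling factor: {factor:.3f}")             # 1.452
print(f"efficiency vs. ideal 2x: {efficiency:.1%}")  # 72.6%
```

In other words, doubling the chip count delivers a little under three quarters of the gain that linear scaling would give.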
Continuing his examination of the CPU2006 floating point suite, Henning notes that the 17 individual benchmarks do not show uniform scaling, as " ... when twice as many copies are run on twice as many chips, some of the programs scale well, while others stall out."
He points out that seven of the tested programs scale relatively poorly. These are, according to SPEC, drawn from fluid dynamics, speech recognition, physics, linear programming, and electromagnetics applications.
Henning surmises that this failure to scale uniformly arises because the computations in these benchmarks exercise more than just the chip and its caches: they are memory intensive and will not scale well unless memory bandwidth is scaled as well.
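A toy model can make the explanation above concrete. This is only a sketch under loud assumptions: the function, the copy counts, and every number in it are hypothetical and are not taken from Henning's post or from any SPEC result. It treats throughput as capped by whichever runs out first, compute or memory bandwidth.

```python
def rate_throughput(copies, per_copy_rate, bytes_per_unit, bandwidth):
    """Throughput capped by the smaller of compute capacity and bandwidth capacity.

    All parameters are hypothetical illustration values, in arbitrary units.
    """
    compute_bound = copies * per_copy_rate
    bandwidth_bound = bandwidth / bytes_per_unit
    return min(compute_bound, bandwidth_bound)


# Hypothetical memory-hungry workload: each unit of work moves a lot of data.
hungry_small = rate_throughput(copies=16, per_copy_rate=1.0, bytes_per_unit=4.0, bandwidth=100.0)
hungry_same_bw = rate_throughput(copies=32, per_copy_rate=1.0, bytes_per_unit=4.0, bandwidth=100.0)
hungry_double_bw = rate_throughput(copies=32, per_copy_rate=1.0, bytes_per_unit=4.0, bandwidth=200.0)

# Hypothetical cache-friendly workload: little memory traffic per unit of work.
light_small = rate_throughput(copies=16, per_copy_rate=1.0, bytes_per_unit=0.5, bandwidth=100.0)
light_same_bw = rate_throughput(copies=32, per_copy_rate=1.0, bytes_per_unit=0.5, bandwidth=100.0)

print(hungry_same_bw / hungry_small)    # 1.5625: stalls when bandwidth is not scaled
print(hungry_double_bw / hungry_small)  # 2.0: scales once bandwidth doubles too
print(light_same_bw / light_small)      # 2.0: scales on cores alone
```

Under these made-up numbers, the memory-hungry workload gains only about 1.56x from twice the copies unless bandwidth doubles as well, while the cache-friendly workload scales with core count alone, which is the pattern Henning describes.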
Comparing the IBM 750 (which has the best 4-chip result) with the IBM 780 (which has the best 2-chip result), Henning notes that while the 750 uses one 4-RU box to hold up to 4 chips, the 780 places only 2 chips into each 4-RU enclosure. "Presumably, the extra space in the 780 is used to take better advantage of the POWER7 memory system, perhaps by using more of its memory controllers / channels."
Henning's conclusion, having factored in price, is that users who want a 4-chip system that scales well across all of the SPECfp_rate2006 benchmarks, relative to the 2-chip 780, should presumably build a 4-chip 780 rather than a 4-chip 750, even though the 780 will cost noticeably more.
Henning's final paragraph is a caveat to users: " ... if you want scaling, you have to pay attention to whether your application is hungry for memory bandwidth; and, if so, you need to pay careful attention to which model you are looking at. Try not to be confused by the different benchmarks that exercise different capabilities of the different configurations."
More Information
Sun's Niagara 3 and IBM POWER7
Performance Articles