October 17, 2007

Sony PS3 cluster supercomputers

A 68-page PDF on using Sony PS3s for scientific computing.

For a matrix of size 2Kx2K they achieved 11.05 Gflop/s, which is around 75% of the double precision peak. They have also implemented a single precision version of the code, which achieved 155 Gflop/s (again around 75% efficiency) for a matrix of size 4Kx4K. Unfortunately, a single precision algorithm does not legitimately implement the Linpack benchmark. Our initial implementation of the mixed-precision Linpack benchmark [21] placed the CELL processor on the Linpack Report [22] with performance close to 100 Gflop/s.
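As a rough check on the figures quoted above, dividing each measured rate by its stated efficiency gives the peak that it implies. This is only a sketch: the helper name `implied_peak` is my own, and the 75% figure is approximate in the source, so the results are approximate too.

```python
def implied_peak(achieved_gflops, efficiency):
    """Peak rate implied by a measured rate and an efficiency fraction."""
    return achieved_gflops / efficiency

# Numbers taken from the excerpt; 0.75 stands in for "around 75%".
dp_peak = implied_peak(11.05, 0.75)   # double precision, 2Kx2K matrix
sp_peak = implied_peak(155.0, 0.75)   # single precision, 4Kx4K matrix

print(f"implied double precision peak: {dp_peak:.1f} Gflop/s")
print(f"implied single precision peak: {sp_peak:.1f} Gflop/s")
print(f"single/double ratio: {sp_peak / dp_peak:.1f}x")
```

The ratio between the two implied peaks shows how far double precision lags behind single precision on the current chip.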

One way of looking at the CELL processor is to treat it as eight digital signal processors (DSP), augmented with a control processor, on a single chip.

One of the major shortcomings of the current CELL processor for numerical applications is the relatively slow speed of the double precision arithmetic. The next reincarnation of the CELL processor is going to include a fully-pipelined double precision unit, which will deliver the speed of 12.8 Gflop/s from a single SPE clocked at 3.2 GHz, and 102.4 Gflop/s from an 8-SPE system, which is going to make the chip a very strong competitor in the world of scientific and engineering computing. Given that, the current CELL processor employs a rather modest number of transistors of 234 million. It is not hard to envision a CELL processor with more than one PPE and many more SPEs, perhaps reaching the performance of a TeraFlop/s for a single chip.

The Cell2 is expected in 2008 and will initially be used in the Roadrunner supercomputer.
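The peak numbers quoted for the next CELL follow from simple arithmetic, sketched here. The value of 4 double precision flops per cycle per SPE is my inference from the quoted figures (12.8 Gflop/s divided by 3.2 GHz), not something stated in the excerpt, and the function name is hypothetical.

```python
import math

def peak_gflops(clock_ghz, flops_per_cycle, n_spes):
    """Theoretical peak: clock rate x flops per cycle x number of SPEs."""
    return clock_ghz * flops_per_cycle * n_spes

CLOCK_GHZ = 3.2
FLOPS_PER_CYCLE = 4  # assumption: inferred from 12.8 Gflop/s at 3.2 GHz

one_spe = peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE, 1)
eight_spes = peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE, 8)

# How many such SPEs a single chip would need to reach 1 Tflop/s (1000 Gflop/s).
spes_for_teraflop = math.ceil(1000 / one_spe)

print(f"1 SPE:  {one_spe:.1f} Gflop/s")
print(f"8 SPEs: {eight_spes:.1f} Gflop/s")
print(f"SPEs needed for 1 Tflop/s: {spes_for_teraflop}")
```

Under these assumptions, a TeraFlop/s chip in double precision would need roughly ten times as many SPEs as the 8-SPE design, unless the clock or the per-cycle throughput also goes up.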

A cluster of eight PS3s has been linked together for astrophysics calculations.

This PS3 cluster has been reviewed in Wired magazine.

The eight PS3s probably reach a combined 500-800 gigaflops of performance for $3200.
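To put that estimate in price/performance terms, here is a quick back-of-the-envelope calculation. It uses only the numbers above (eight consoles, $3200, 500-800 gigaflops combined); the variable names are mine, and the gigaflops range is itself a guess.

```python
N_CONSOLES = 8
TOTAL_COST_USD = 3200
GFLOPS_LOW, GFLOPS_HIGH = 500, 800  # estimated combined range from the post

cost_per_console = TOTAL_COST_USD / N_CONSOLES

# Dollars per gigaflop: best case uses the high estimate, worst case the low one.
best_usd_per_gflop = TOTAL_COST_USD / GFLOPS_HIGH
worst_usd_per_gflop = TOTAL_COST_USD / GFLOPS_LOW

# Implied throughput of a single console.
per_console_low = GFLOPS_LOW / N_CONSOLES
per_console_high = GFLOPS_HIGH / N_CONSOLES

print(f"cost per console: ${cost_per_console:.0f}")
print(f"cost per gigaflop: ${best_usd_per_gflop:.2f} to ${worst_usd_per_gflop:.2f}")
print(f"per-console rate: {per_console_low:.1f} to {per_console_high:.1f} gigaflops")
```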

1 comment:

notsomuch said...

The Cell (PS3) processor consists of 234M transistors, the Xenon (Xbox 360) processor consists of 165M transistors, and the Intel Core 2 Quad consists of 582M transistors. The Radeon HD 2900 and GeForce 8800 GTX have about 700M transistors. 1GB of RAM consists of more than 8000M transistors.