Monday, June 16th 2008
AMD FireStream 9250 Breaks the 1 Teraflop Barrier
At the International Supercomputing Conference, AMD today introduced its next-generation stream processor, the AMD FireStream 9250, specifically designed to accelerate critical algorithms in high-performance computing (HPC), mainstream and consumer applications. Leveraging the GPU design expertise of AMD's Graphics Product Group, AMD FireStream 9250 breaks the one teraflop barrier for single precision performance. It occupies a single PCI slot for unmatched density, and with power consumption of less than 150 watts, the AMD FireStream 9250 delivers unprecedented performance-per-watt efficiency of up to eight gigaflops per watt.
Customers can leverage AMD's latest FireStream offering to run critical workloads such as financial analysis or seismic processing dramatically faster than with CPU alone, helping them to address more complex problems and achieve faster results. For example, developers are reporting up to a 55x performance increase on financial analysis codes as compared to processing on the CPU alone, which supports their efforts to make better and faster decisions. Additionally, the use of flexible GPU technology rather than custom accelerators assists those creating application-specific systems to enhance and maintain their solutions easily.
The AMD FireStream 9250 stream processor includes a second-generation double-precision floating point hardware implementation delivering more than 200 gigaflops, building on the capabilities of the earlier AMD FireStream 9170, the industry's first GP-GPU with double-precision floating point support. The AMD FireStream 9250's compact size makes it ideal for small 1U servers as well as most desktop systems, workstations, and larger servers, and it features 1GB of GDDR3 memory, enabling developers to handle large, complex problems.
Driving broad consumer adoption with open systems
AMD enables development of the FireStream family of processors with its AMD Stream SDK, designed to help developers create accelerated applications for AMD FireStream, ATI FireGL and ATI Radeon GPUs. AMD takes an open-systems approach to its stream computing development environment to ensure that developers can access and build on the tools at any level. AMD offers published interfaces for its high-level language API, intermediate language, and instruction set architecture; and the AMD Stream SDK's Brook+ front-end is available as open source code.
In keeping with its open systems philosophy, AMD has also joined the Khronos Compute Working Group. This working group's goals include developing industry standards for data parallel programming and working with proposed specifications like OpenCL. The OpenCL specification can help provide developers with an easy path to development across multiple platforms.
"An open industry standard programming specification will help drive broad-based support for stream computing technology in mainstream applications," said Rick Bergman, senior vice president and general manager, Graphics Product Group, AMD. "We believe that OpenCL is a step in the right direction and we fully support this effort. AMD intends to ensure that the AMD Stream SDK rapidly evolves to comply with open industry standards as they emerge."
Accelerating industry adoption
The growth of the stream computing market has accelerated over the past few years with Fortune 1000 companies, leading software developers and academic institutions utilizing stream technology to achieve tremendous performance gains across a variety of applications.
"Stream computing is increasingly important for mainstream and consumer applications and is no longer limited to just the academic or engineering industries. Today we are truly seeing a fundamental shift in emerging system architectures," said Jon Peddie, president, Jon Peddie Research. "As the industry's only provider of both high-performance discrete GPUs and x86-compatible CPUs, AMD is uniquely well-suited to developing these architectures."
AMD customers, including ACCIT, Centre de Physique de Particules de Marseille, Neurala and Telanetix, are using the AMD Stream SDK and current AMD FireStream, ATI FireGL or ATI Radeon boards to achieve dramatic performance gains on critical algorithms in HPC, workstation and consumer applications. Currently, Neurala reports that it is achieving 10-200x speedups over the CPU alone on biologically inspired neural models, applicable to finance, image processing and other applications.
AMD is also working closely with world class application and solution providers to ensure customers can achieve optimum performance results. Stream computing application and solution providers include CAPS entreprise, Mercury Computer Systems, RapidMind, RogueWave and VizExperts. Mercury Computer Systems provides high-performance computing systems and software designed for complex image, sensor, and signal processing applications. Its algorithm team reports that it has achieved 174 GFLOPS performance for large 1D complex single-precision floating point FFTs on the AMD FireStream 9250.
Pricing and availability
AMD plans to deliver the FireStream 9250 and the supporting SDK in Q3 2008 at an MSRP of $999 USD. AMD FireStream 9170, the industry's first double-precision floating point stream processor, is currently available for purchase and is competitively priced at $1,999 USD. For more information about AMD FireStream 9250 or AMD FireStream 9170 or AMD's complete line of stream computing solutions, please visit http://www.amd.com/stream.
Source:
AMD
54 Comments on AMD FireStream 9250 Breaks the 1 Teraflop Barrier
Basically, the 55x speedup quoted by AMD is:
1>> A single-core Opteron running an open-source math library, COMPARED TO
2>> The FireStream running an optimized math library SPECIFICALLY designed for financial math by RapidMind.
REAL COMPARISON
1./ Single core CPU, running inefficient C++ math library
2./ Replace math library with RapidMind, = 2x speedup
3./ Replace "single core" Opteron with "single core" Intel Core 2, = 2x speedup
4./ Replace single core with quad core = 4x speedup
So, actually, the REAL COMPARISON should be 55/16 = about 3.4x speedup. At a price of $999.
OK, SO LET'S USE A DUAL XEON SYSTEM ALTERNATIVE
5./ Upgrade to dual socket mainboard, one extra Xeon, total $500, = 2x speedup
That would leave the FireStream with a net speedup of about 1.7x, but at a higher cost ($499 more), plus the development time associated with using the SDK for FireStream and then having code that could only run on the FireStream. (THERE ARE GOOD SECURITY REASONS TO DO THIS... ESPECIALLY FOR PROPRIETARY FINANCE SOFTWARE).
IMO, about 1.7x the speed of a dual Xeon workstation is not all that impressive.
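The arithmetic in the comment above can be checked with a short script. This is only a sketch: every factor is the commenter's own estimate (2x for the math library, 2x for the CPU swap, 4x for quad core, 2x for a second socket), not a measured result, and the names used here are made up for illustration.

```python
# Baseline-adjustment arithmetic from the comment above.
# All factors are the commenter's estimates, not benchmark results.
claimed_speedup = 55.0  # AMD-quoted speedup vs. a single-core CPU

cpu_side_gains = {
    "optimized math library": 2.0,
    "single Opteron core -> single Core 2 core": 2.0,
    "single core -> quad core": 4.0,
}


def adjusted_speedup(claimed, gains):
    """Divide the claimed speedup by the product of the CPU-side gains."""
    combined = 1.0
    for factor in gains.values():
        combined *= factor
    return claimed / combined


vs_quad_core = adjusted_speedup(claimed_speedup, cpu_side_gains)
# Assumption from the comment: a second Xeon doubles CPU throughput.
vs_dual_socket = vs_quad_core / 2.0

print(f"vs. tuned quad core:   {vs_quad_core:.2f}x")
print(f"vs. dual-socket Xeon:  {vs_dual_socket:.2f}x")
```

The exact values are 3.4375 and 1.71875, and they assume the individual gains multiply cleanly, which real workloads rarely do.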
******
From looking closer at the hardware of FireStream, it seems to be essentially a GPU card with the "Video" bits removed. You could probably get a regular gaming card to do exactly the same. But I'm sure AMD will "lock" features within the BIOS, just like they do with the FireGL GPUs. I agree, too expensive.
But it's not much of a breakthrough. It's a GPU in wolf's clothing, with an SDK not dissimilar in concept to CUDA.
Smoke and mirrors by AMD.
I'm curious, though, has anyone else noticed that AMD seems to have drastically changed their marketing strategies over the last 3-5 months? It seems to me that they've become a lot more aggressive in their marketing and claims, compared to how they used to be.
They're finally adopting the ruthless attitude of all the other financially successful and stable companies.
1./ Bullshit
2./ Lies
3./ Misrepresentation
And that one too, please:
A./ No integrity
B./ No ethics
C./ Short-term profit before brand reputation and customer loyalty, à la "fool the customer with 1, 2, 3"
Sure, recently they might be 'twisting' the truth and stretching it as far as they can, but we're still given some kind of base to look at as well; unlike other companies who spit out propaganda that looks like they waved their voodoo stick over a spreadsheet while swinging chickens.
If AMD used their GPUs for CPUs... Intel would be screwed.
Even if a GPU in the class of an HD 2600 XT (120 SPs) were embedded, theoretically it would mean an added 50 GFLOPS at least.
They (quietly) point out that GPGPUs are fantastic for massively parallel calculations. But for general-purpose mixed math they are awful. Why? Because the processing power and benchmarks we keep reading about are based on calculations that scale via parallelization, so that, e.g., ALL 320 stream processors are put to good use.
If you were using the GPGPU to "re-calculate an EXCEL table", then divide performance by 320, since you won't get parallelization there. In such situations a CPU's FPU will PWN the GPGPU.
The GPGPU comes into its own ONLY when using the math library and SDK designed for it... AND when doing things like vector or matrix math, of SIMPLE additions, subtractions and multiplications.
An FPU will PWN a GPGPU at trig math, for example.
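The "divide performance by 320" point can be illustrated with Amdahl's law, which the comment does not name but which describes the same effect: the serial part of a workload caps the speedup no matter how many processing units are available. A minimal sketch, assuming 320 identical stream processors and ignoring memory transfer and clock-speed differences (the function name is hypothetical):

```python
def amdahl_speedup(parallel_fraction, units):
    """Speedup over one unit when only part of the work can be spread out.

    parallel_fraction: share of the work that parallelizes (0.0 to 1.0)
    units: number of processing units available
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


STREAM_PROCESSORS = 320  # figure quoted in the comment above

for fraction in (0.0, 0.5, 0.9, 0.99, 1.0):
    speedup = amdahl_speedup(fraction, STREAM_PROCESSORS)
    print(f"{fraction:4.0%} parallel -> {speedup:6.1f}x over one stream processor")
```

With nothing parallelizable the result is 1x, i.e. a single stream processor's worth of work, which is the spreadsheet case described above; only a fully parallel workload approaches 320x.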
Let's hope Fusion will give Phenom the much-needed performance boost.
ClearSpeed's new math co-processor delivers 100 GFLOPS of DP math (compared to the FireStream's 200 GFLOPS DP) but with only 12 W (compared to the FireStream's 150 W).
The ClearSpeed CSX700 is the winner. It also has a better (faster) math library, due to the CSX700 being a much more capable FPU than a GPGPU (which is limited to simpler native operations such as plus, minus, multiply, etc.)
Downside, $3000
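The efficiency comparison in this comment can be put into numbers. The sketch below uses only the figures quoted on this page: 100 GFLOPS, 12 W and $3000 for the CSX700 from the comment, and 200 GFLOPS, 150 W and $999 for the FireStream 9250 from the article. It assumes the comment's "DP math" figures are double-precision GFLOPS; none of these are independently verified.

```python
# Double-precision throughput, power draw and price as quoted on this page.
# These are the page's figures, not independently verified specifications.
cards = {
    "ClearSpeed CSX700": {"dp_gflops": 100.0, "watts": 12.0, "usd": 3000.0},
    "AMD FireStream 9250": {"dp_gflops": 200.0, "watts": 150.0, "usd": 999.0},
}


def gflops_per_watt(card):
    """Energy efficiency: quoted DP GFLOPS divided by quoted power draw."""
    return card["dp_gflops"] / card["watts"]


def gflops_per_dollar(card):
    """Cost efficiency: quoted DP GFLOPS divided by quoted price."""
    return card["dp_gflops"] / card["usd"]


for name, card in cards.items():
    print(
        f"{name}: {gflops_per_watt(card):.2f} GFLOPS/W, "
        f"{gflops_per_dollar(card):.3f} GFLOPS/$"
    )
```

On these figures the CSX700 leads by roughly 6x per watt, while the FireStream leads by roughly 6x per dollar, which is the trade-off the "Downside, $3000" remark points at.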
I think the Cell would be useless because you'd only be able to run Linux, and what's the point of having a powerful CPU for Linux if all you can run is Doom 3 and Quake 4?
Either way, GPUs are a different architecture from CPUs; you'd have to totally redesign the GPU to include cache, memory controllers, etc.
I'm not sure why you'd want a math co-processor.
Co-processors are useless if you have multithreading on a CPU and the software is programmed to use it fully.
I'd like to see physics done on a CPU core, or have a full single graphics card for physics but be able to add a cheaper graphics card to take advantage.
ati.amd.com/technology/streamcomputing/faq.html#5
from here www.picocomputing.com/ this can be used with laptops!!