Sunday, July 6th 2008
Intel Larrabee Capable of 2 TFLOPs
German tech-journal Heise caught up with Intel's Pat Gelsinger for an article discussing the company's past and future as the silicon giant heads towards 40 years of service this 18th of July.
Among several topics, came up the most interesting one, visual computing and Intel's plans on it. 'Larrabee' strikes as a buzzword. It is the codename of Intel's upcoming graphics processor (GPU) architecture with which it plans to take on established players such as NVIDIA and AMD among others.
What's unique (so far) about Larrabee is that it's entirely made up of x86 processing cores. The Larrabee is likely to have 32 x86 processing cores. Here's a surprise: These processing cores are based on the design of Pentuim P54C, a 13+ year old x86 processor. This processor will be miniaturised to the 45nm fabrication process, they will be assisted by a 512-bit SIMD unit and these cores will support 64-bit address. Gelsinger says that 32 of these cores clocked at 2.00 GHz could belt out 2 TFLOPs of raw computational power. That's close to that of the upcoming AMD R700. Heise also reports that this GPU could have a TDP of as much as 300W (peak).With inputs from Heise
Among several topics, came up the most interesting one, visual computing and Intel's plans on it. 'Larrabee' strikes as a buzzword. It is the codename of Intel's upcoming graphics processor (GPU) architecture with which it plans to take on established players such as NVIDIA and AMD among others.
What's unique (so far) about Larrabee is that it's entirely made up of x86 processing cores. The Larrabee is likely to have 32 x86 processing cores. Here's a surprise: These processing cores are based on the design of Pentuim P54C, a 13+ year old x86 processor. This processor will be miniaturised to the 45nm fabrication process, they will be assisted by a 512-bit SIMD unit and these cores will support 64-bit address. Gelsinger says that 32 of these cores clocked at 2.00 GHz could belt out 2 TFLOPs of raw computational power. That's close to that of the upcoming AMD R700. Heise also reports that this GPU could have a TDP of as much as 300W (peak).With inputs from Heise
77 Comments on Intel Larrabee Capable of 2 TFLOPs
1./ Larrabee has a much more powerful ALU that a GPU, meaning that for some tasks, Larrabee can do in one instruction what might take a fat loop and lookuptables on a GPU
2./ Larrabee ALU is DP and FP. GPU SPE is SP. To mimick DP or FP using SP requires a lot of loop and overhead
3./ SIMD on Larrabee is 512bit or more. That's the same as 16x 32bit (SP) calculations at once. With 32 x86 cores in the Larrabee matrix, that is equivalent to 16x 32cores = 512 simultantous SP calculations. ie the same as 512 shader processor units.
The key and as yet unknown data is how many clock cycles to execute SIMD compared to a GPU's SPE.
I hope intel can sock it to the other 2,it will be good for us in the longrun,whether their first attempt is good or not.
Well, this Larrabee thing will put an end to Beowulf Class I clusters. And put a STOP to the interest in Cell blades.
Why? Much cheaper. And you wouldnt need to learn a new architecture model for programming, e.g. Cell. Just use your regular x86 IDE with Larrabee add-in. With Larrabee we are getting 2000Gflops / 300Watt = 6000Mflops / watt, ie 10-30 times as power efficient as the best supercomputers.
That has a HUGE implication to power and cooling needed to host a number crunching monster.
It also has a HUGE implication on the cost of installing an HPC given how cheap Larrabee is compared to scaling under regular Beowulf.
With Larrabee, anyone could have an HPC if they wanted to.
Hopefully they'll get smart and use Pentium Pro cores instead; 512 kb of L2 cache, MMX arch., and cooler name; what could go wrong?
Also, imagine if you got a bunch of mobos with 4 PCI-E x16 lanes (I'm pretty sure they exist), stuck these cards onto a whole bunch of them (along with a quad core something), and ran a beowulf cluster? Say you had 8 motherboards, thats 4 cards per mobo, which is 32 cards, which is 64 TFLOPS!! :eek:
@ TheGuruStud: I went from a Pentium 90 to a Celeron-400! :p
Then I swapped it for a 600 pentium III, but ran it at stock. Piece of crap CPU just magically died one day. Then I built a new rig :) AMD 1.4 Thunderbird! And I've never looked back (upgraded to xp 2100, then a long wait until athlon 64 3500, x2 4200 and opteron 170).
Damn, way off topic. Don't hurt me.
With a Larrabee, it is a cluster, but, strictly, it is not a beowulf cluster.
If you like home-made beowulfs, go here www.calvin.edu/~adams/research/microwulf/
en.wikipedia.org/wiki/Streaming_SIMD_Extensions
So, will Larrabe be adopting AVX?
If this becomes something big, it will suck for nVida and AMD... and for us computer tweakers.
bryan d
yes you can