Intel is taking a different path with its Gaudi 3 accelerator chips. It is staying away from the high-demand market for training large AI models, the market that has made NVIDIA so successful. Instead, Intel wants to serve businesses that need cheaper AI solutions to train and run smaller, task-specific models and open-source options. At a recent event, Intel touted Gaudi 3's "price performance advantage" over NVIDIA's H100 GPU for inference tasks. Intel says Gaudi 3 is faster and more cost-effective than the H100 when running Llama 3 and Llama 2 models of different sizes.
Intel also claims that Gaudi 3 is as power-efficient as the H100 for large language model (LLM) inference with small token outputs and does even better with larger outputs. The company even suggests Gaudi 3 beats NVIDIA's newer H200 in LLM inference throughput for large token outputs. However, Gaudi 3 trails the H100 in peak floating-point throughput for 16-bit and 8-bit formats. For bfloat16 (BF16) and 8-bit floating-point (FP8) matrix math, Gaudi 3 hits 1,835 TFLOPS in each format, while the H100 reaches 1,979 TFLOPS for BF16 and 3,958 TFLOPS for FP8.
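To put those peak figures side by side, here is a minimal Python sketch that works out Gaudi 3's throughput as a share of the H100's in each format. The dictionary layout and the `relative_throughput` helper are made up for illustration. The only inputs are the TFLOPS numbers quoted above, which are peak figures and say nothing about the inference results Intel is claiming.

```python
# Peak matrix-math throughput in TFLOPS, as quoted in the post.
PEAK_TFLOPS = {
    "Gaudi 3": {"BF16": 1835, "FP8": 1835},
    "H100": {"BF16": 1979, "FP8": 3958},
}


def relative_throughput(chip, baseline, fmt):
    """Return chip's peak throughput as a fraction of baseline's for one format."""
    return PEAK_TFLOPS[chip][fmt] / PEAK_TFLOPS[baseline][fmt]


for fmt in ("BF16", "FP8"):
    ratio = relative_throughput("Gaudi 3", "H100", fmt)
    print(f"{fmt}: Gaudi 3 reaches {ratio:.1%} of the H100's peak")
```

On these numbers, Gaudi 3 lands at roughly 93% of the H100's peak in BF16 and roughly 46% in FP8, so the gap is mostly an FP8 story.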
In an interview with CRN, Anil Nanduri, head of Intel's AI acceleration office, said that purchasing decisions for AI training infrastructure have so far focused primarily on performance rather than cost.
"And if you think in that context, there is an incumbent benefit, where all the frontier model research, all the capabilities are developed on the de facto platform where you're building it, you're researching it, and you're, in essence, subconsciously optimizing it as well. And then to make that port over [to a different platform] is work.
"The world we are starting to see is people are questioning the [return on investment], the cost, the power and everything else. This is where - I don't have a crystal ball - but the way we think about it is, do you want one giant model that knows it all?" Nanduri said.
Intel believes that for many businesses the answer is "no," and that they will instead opt for smaller, task-based models with lower performance demands. Nanduri said that while Gaudi 3 can't "catch up" to NVIDIA's latest GPUs from a head-to-head performance standpoint, the chips are well suited to systems that run task-based and open-source models.
On a different subject, Intel has announced major job cuts in several states as part of its wider plan to shrink its workforce. The company will eliminate 1,300 positions in Oregon, 385 in Arizona, 319 in California, and 251 in Texas. Intel has a workforce of over 23,000 in Oregon, 12,000 in Arizona, 13,500 in California, and 2,100 in Texas. The layoffs are set to take place over a 14-day period starting November 15.
View at TechPowerUp Main Site | Source