Today, MLCommons published results of its industry AI performance benchmark, MLPerf Training v4.0. Intel's results demonstrate the choice that Intel Gaudi 2 AI accelerators give enterprises and customers: community-based software simplifies generative AI (GenAI) development, and industry-standard Ethernet networking enables flexible scaling of AI systems. For the first time on the MLPerf benchmark, Intel submitted results for a large Gaudi 2 system (1,024 Gaudi 2 accelerators), trained in the Intel Tiber Developer Cloud, to demonstrate Gaudi 2 performance and scalability, and Intel's cloud capacity for training MLPerf's GPT-3 175B parameter benchmark model.
"The industry has a clear need: address the gaps in today's generative AI enterprise offerings with high-performance, high-efficiency compute options. The latest MLPerf results published by MLCommons illustrate the unique value Intel Gaudi brings to market as enterprises and customers seek more cost-efficient, scalable systems with standard networking and open software, making GenAI more accessible to more customers," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.
More customers want to benefit from GenAI but are unable to because of cost, scale and development requirements. With only 10% of enterprises successfully moving GenAI projects into production last year, Intel's AI offerings address the challenges businesses face in scaling AI initiatives. Intel Gaudi 2 is an accessible, scalable solution that has proven its ability to handily train large language models (LLMs) from 70 billion to 175 billion parameters. The soon-to-be-released Intel Gaudi 3 accelerator will bring a leap in performance, as well as openness and choice to enterprise GenAI.
The MLPerf results show Gaudi 2 continues to be the only MLPerf-benchmarked alternative for AI compute to the Nvidia H100. Trained on the Tiber Developer Cloud, Intel's GPT-3 results for time-to-train (TTT) of 66.9 minutes on an AI system of 1,024 Gaudi accelerators proves strong Gaudi 2 scaling performance on ultra-large LLMs within a developer cloud environment.
The benchmark suite featured a new measurement: fine-tuning the Llama 2 70B parameter model using low-rank adapters (LoRa). Fine-tuning LLMs is a common task for many customers and AI practitioners, making it a relevant benchmark for everyday applications. Intel's submission achieved time-to-train of 78.1 minutes on eight Gaudi 2 accelerators. Intel utilized open source software from Optimum Habana for the submission, leveraging Zero-3 from DeepSpeed for optimizing memory efficiency and scaling during large model training, as well as Flash-Attention-2 to accelerate attention mechanisms. The benchmark task force - led by the engineering teams from Intel's Habana Labs and Hugging Face - are responsible for the reference code and benchmark rules.
How Intel Gaudi Provides Customers with Value in AI: To date, high costs have priced too many enterprises out of the market. Gaudi is starting to change that. At Computex, Intel announced that a standard AI kit including eight Intel Gaudi 2 accelerators with a universal baseboard (UBB) offered to system providers at $65,000 is estimated to be one-third the cost of comparable competitive platforms. A kit including eight Intel Gaudi 3 accelerators with a UBB lists at $125,000, estimated to be two-thirds the cost of comparable competitive platforms.
The proof is in increased momentum. Customers use Gaudi for the value it brings with price-performance advantages and accessibility, including:
According to CIO.com, Seekr cited cost savings of 40% up to 400% from the Tiber Developer Cloud for select AI workloads compared to on-premise systems with another vendor's GPUs and with another cloud service provider, along with 20% faster AI training and 50% faster AI inference than on-premise.
What's Next: Intel will submit MLPerf results based on the Intel Gaudi 3 AI accelerator in the upcoming inference benchmark. Intel Gaudi 3 accelerators are projected to provide a leap in performance for AI training and inference on popular LLMs and multimodal models and will be generally available from original equipment manufacturers in fall of 2024.
View at TechPowerUp Main Site
"The industry has a clear need: address the gaps in today's generative AI enterprise offerings with high-performance, high-efficiency compute options. The latest MLPerf results published by MLCommons illustrate the unique value Intel Gaudi brings to market as enterprises and customers seek more cost-efficient, scalable systems with standard networking and open software, making GenAI more accessible to more customers," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.
More customers want to benefit from GenAI but are held back by cost, scale and development requirements. Only 10% of enterprises successfully moved GenAI projects into production last year, and Intel's AI offerings address the challenges businesses face in scaling such initiatives. Intel Gaudi 2 is an accessible, scalable solution with a proven ability to handily train large language models (LLMs) from 70 billion to 175 billion parameters. The soon-to-be-released Intel Gaudi 3 accelerator will bring a leap in performance, as well as openness and choice, to enterprise GenAI.
The MLPerf results show Gaudi 2 continues to be the only MLPerf-benchmarked alternative to the Nvidia H100 for AI compute. Intel's GPT-3 time-to-train (TTT) of 66.9 minutes on a 1,024-accelerator Gaudi 2 system, trained on the Tiber Developer Cloud, demonstrates strong Gaudi 2 scaling performance on ultra-large LLMs within a developer cloud environment.
The benchmark suite featured a new measurement: fine-tuning the Llama 2 70B parameter model using Low-Rank Adaptation (LoRA). Fine-tuning LLMs is a common task for many customers and AI practitioners, making it a relevant benchmark for everyday applications. Intel's submission achieved a time-to-train of 78.1 minutes on eight Gaudi 2 accelerators. Intel used open-source software from Optimum Habana for the submission, leveraging DeepSpeed ZeRO-3 to optimize memory efficiency and scaling during large-model training, as well as Flash-Attention-2 to accelerate the attention computation. The benchmark task force, led by engineering teams from Intel's Habana Labs and Hugging Face, is responsible for the reference code and benchmark rules.
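For readers curious what such a run looks like in code, here is a minimal sketch of LoRA fine-tuning with Optimum Habana and DeepSpeed ZeRO-3. This is not Intel's submission code: the toy dataset, hyperparameters, the `ds_zero3.json` config file name, and the `Habana/llama` Gaudi config are illustrative assumptions, and API details may differ across optimum-habana versions.

```python
# Illustrative sketch only: LoRA fine-tuning on Gaudi via Optimum Habana.
# NOT Intel's MLPerf submission; dataset, hyperparameters, and the
# "Habana/llama" Gaudi config are assumptions for the sake of the example.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_id = "meta-llama/Llama-2-70b-hf"  # the benchmark model (gated on the Hub)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA trains small low-rank adapter matrices on top of frozen base weights,
# so only a tiny fraction of the 70B parameters receives gradients.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))
model.print_trainable_parameters()

# Toy dataset so the sketch is self-contained; MLPerf uses a fixed corpus.
def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=128,
                    padding="max_length")
    enc["labels"] = enc["input_ids"].copy()
    return enc

train_ds = Dataset.from_dict(
    {"text": ["An example instruction and response."] * 64}
).map(tokenize, batched=True, remove_columns=["text"])

args = GaudiTrainingArguments(
    output_dir="llama2-70b-lora",
    use_habana=True,            # run on Gaudi (HPU) devices
    use_lazy_mode=True,         # lazy-mode graph execution on Gaudi
    per_device_train_batch_size=1,
    num_train_epochs=1,
    deepspeed="ds_zero3.json",  # ZeRO-3 shards params/grads/optimizer state
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=GaudiConfig.from_pretrained("Habana/llama"),
    args=args,
    train_dataset=train_ds,
)
trainer.train()
```

In practice a multi-card run like the eight-accelerator submission would be started through optimum-habana's distributed launcher rather than invoked directly, and `ds_zero3.json` would be a standard DeepSpeed ZeRO stage-3 configuration file.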
How Intel Gaudi Provides Customers with Value in AI: To date, high costs have priced too many enterprises out of the market. Gaudi is starting to change that. At Computex, Intel announced that a standard AI kit including eight Intel Gaudi 2 accelerators with a universal baseboard (UBB), offered to system providers at $65,000, is estimated to be one-third the cost of comparable competitive platforms. A kit including eight Intel Gaudi 3 accelerators with a UBB lists at $125,000, estimated to be two-thirds the cost of comparable competitive platforms.
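As a back-of-the-envelope check, those stated ratios imply a comparable competitive platform price of about $195,000 against the Gaudi 2 kit and about $187,500 against the Gaudi 3 kit; the snippet below simply restates that arithmetic.

```python
# Implied competitor pricing from the stated cost ratios (illustrative only).
gaudi2_kit = 65_000   # eight Gaudi 2 accelerators + UBB
gaudi3_kit = 125_000  # eight Gaudi 3 accelerators + UBB

competitor_vs_g2 = gaudi2_kit / (1 / 3)  # "one-third the cost" -> $195,000
competitor_vs_g3 = gaudi3_kit / (2 / 3)  # "two-thirds the cost" -> $187,500

print(f"Implied comparable platform (vs. Gaudi 2 kit): ${competitor_vs_g2:,.0f}")
print(f"Implied comparable platform (vs. Gaudi 3 kit): ${competitor_vs_g3:,.0f}")
```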
The proof is in growing momentum: customers adopt Gaudi for its price-performance advantages and accessibility. Examples include:
- Naver, a South Korean cloud service provider and leading search engine catering to more than 600 million users, is building a new AI ecosystem and lowering barriers to enable wide-scale LLM adoption by reducing development costs and project timelines for its customers.
- AI Sweden, an alliance between the Swedish government and private business, leverages Gaudi for fine-tuning with domain-specific municipal content to improve operational efficiencies and enhance public services for Sweden's constituents.
- Seekr, according to CIO.com, cited cost savings of 40% to as much as 400% from the Tiber Developer Cloud for select AI workloads, compared with on-premises systems using another vendor's GPUs and with another cloud service provider, along with 20% faster AI training and 50% faster AI inference than on-premises.
What's Next: Intel will submit MLPerf results based on the Intel Gaudi 3 AI accelerator in the upcoming inference benchmark. Intel Gaudi 3 accelerators are projected to provide a leap in performance for AI training and inference on popular LLMs and multimodal models, and will be generally available from original equipment manufacturers in fall 2024.
View at TechPowerUp Main Site