News Posts matching #Gaudi 3


GIGABYTE Showcases a Leading AI and Enterprise Portfolio at Supercomputing 2024

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, shows off at SC24 how the GIGABYTE enterprise portfolio provides solutions for all applications, from cloud computing to AI to enterprise IT, including energy-efficient liquid-cooling technologies. This portfolio is made more complete by long-term collaborations with leading technology companies and emerging industry leaders, which will be showcased at GIGABYTE booth #3123 at SC24 (Nov. 19-21) in Atlanta. The booth is sectioned to put the spotlight on strategic technology collaborations, as well as direct liquid cooling partners.

The GIGABYTE booth will showcase an array of NVIDIA platforms built to keep up with the diverse workloads and varying demands of AI and HPC applications. For a rack-scale AI solution using the NVIDIA GB200 NVL72 design, GIGABYTE displays how seventy-two GPUs can fit in one rack, with eighteen GIGABYTE servers each housing two NVIDIA Grace CPUs and four NVIDIA Blackwell GPUs. Another platform at the GIGABYTE booth is the NVIDIA HGX H200 platform, for which GIGABYTE exhibits both its liquid-cooled G4L3-SD1 server and an air-cooled version, the G593-SD1.
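The rack arithmetic above can be checked with a short sketch. The class and function names here are hypothetical and exist only for illustration; the per-server counts (two Grace CPUs, four Blackwell GPUs) and the eighteen-server rack come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeServer:
    """One server in the rack, with the per-server counts quoted in the text."""
    grace_cpus: int = 2
    blackwell_gpus: int = 4


def rack_totals(servers: int, server: ComputeServer = ComputeServer()) -> dict:
    """Multiply the per-server counts by the number of servers in the rack."""
    return {
        "cpus": servers * server.grace_cpus,
        "gpus": servers * server.blackwell_gpus,
    }


totals = rack_totals(18)
print(totals)  # {'cpus': 36, 'gpus': 72}
```

Eighteen servers with four GPUs each give the seventy-two GPUs that the NVL72 name refers to.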

Dell Shows Compute-Dense AI Servers at SC24

Dell Technologies (NYSE: DELL) continues to make enterprise AI adoption easier with the Dell AI Factory, expanding the world's broadest AI solutions portfolio. Powerful new infrastructure, solutions and services accelerate, simplify and streamline AI workloads and data management.

"Getting AI up and running across a company can be a real challenge," said Arthur Lewis, president, Infrastructure Solutions Group, Dell Technologies. "We're making it easier for our customers with new AI infrastructure, solutions and services that simplify AI deployments, paving the way for smarter, faster ways to work and a more adaptable future."

Intel Won't Compete Against NVIDIA's High-End AI Dominance Soon, Starts Laying Off Over 2,200 Workers Across US

Intel's taking a different path with its Gaudi 3 accelerator chips. It's staying away from the high-demand market for training big AI models, which has made NVIDIA so successful. Instead, Intel wants to help businesses that need cheaper AI solutions to train and run smaller specific models and open-source options. At a recent event, Intel talked up Gaudi 3's "price performance advantage" over NVIDIA's H100 GPU for inference tasks. Intel says Gaudi 3 is faster and more cost-effective than the H100 when running Llama 3 and Llama 2 models of different sizes.

Intel also claims that Gaudi 3 is as power-efficient as the H100 for large language model (LLM) inference with small token outputs and does even better with larger outputs. The company even suggests Gaudi 3 beats NVIDIA's newer H200 in LLM inference throughput for large token outputs. However, Gaudi 3 doesn't match up to the H100 in overall floating-point operation throughput for 16-bit and 8-bit formats. For bfloat16 and 8-bit floating-point precision matrix math, Gaudi 3 hits 1,835 TFLOPS in each format, while the H100 reaches 1,979 TFLOPS for BF16 and 3,958 TFLOPS for FP8.
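To put the peak-throughput gap in proportion, here is a minimal sketch that divides the figures quoted above. The dictionary and function names are hypothetical. The numbers are taken as stated in the text, which does not say whether they are dense or sparse peaks, so the ratios are only as comparable as the quoted figures themselves.

```python
# Peak matrix-math throughput in TFLOPS, as quoted in the paragraph above.
GAUDI3_TFLOPS = {"BF16": 1835, "FP8": 1835}
H100_TFLOPS = {"BF16": 1979, "FP8": 3958}


def peak_ratio(fmt: str) -> float:
    """Return Gaudi 3 peak throughput as a fraction of the H100 peak."""
    return GAUDI3_TFLOPS[fmt] / H100_TFLOPS[fmt]


for fmt in ("BF16", "FP8"):
    print(f"{fmt}: Gaudi 3 reaches {peak_ratio(fmt):.1%} of the quoted H100 peak")
```

By these quoted figures, Gaudi 3 is within roughly 7% of the H100 at BF16 but at less than half of its FP8 peak. Peak numbers like these do not by themselves contradict the inference throughput claims, which depend on the measured workload.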

Inflection AI and Intel Launch Enterprise AI System

Today, Inflection AI and Intel announced a collaboration to accelerate the adoption and impact of AI for enterprises as well as developers. Inflection AI is launching Inflection for Enterprise, an industry-first, enterprise-grade AI system powered by Intel Gaudi and Intel Tiber AI Cloud (AI Cloud), to deliver empathetic, conversational, employee-friendly AI capabilities and provide the control, customization and scalability required for complex, large-scale deployments. This system is available now through the AI Cloud and will be shipping to customers as an industry-first AI appliance powered by Gaudi 3 in Q1 2025.

"Through this strategic collaboration with Inflection AI, we are setting a new standard with AI solutions that deliver immediate, high-impact results. With support for open-source models, tools, and competitive performance per watt, Intel Gaudi 3 solutions make deploying GenAI accessible, affordable, and efficient for enterprises of any size." -Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group

ASUS Introduces All-New Intel Xeon 6 Processor Servers

ASUS today announced its all-new line-up of Intel Xeon 6 processor-powered servers, ready to satisfy the escalating demand for high-performance computing (HPC) solutions. The new servers include the multi-node ASUS RS920Q-E12, which supports Intel Xeon 6900 series processors for HPC applications, and the ASUS RS720Q-E12, RS720-E12 and RS700-E12 server models, which are built around Intel Xeon 6700 series processors with E-cores and will also support Intel Xeon 6700/6500 series processors with P-cores in Q1 2025, providing seamless integration and optimization for modern data centers and diverse IT environments.

These powerful new servers, built on the solid foundation of trusted and resilient ASUS server design, offer improved scalability, enabling clients to build customized data centers and scale up their infrastructure to achieve their highest computing potential - ready to deliver HPC success across diverse industries and use cases.

Supermicro Adds New Max-Performance Intel-Based X14 Servers

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, today adds new maximum performance GPU, multi-node, and rackmount systems to the X14 portfolio, which are based on the Intel Xeon 6900 Series Processors with P-Cores (formerly codenamed Granite Rapids-AP). The new industry-leading selection of workload-optimized servers addresses the needs of modern data centers, enterprises, and service providers. Joining the efficiency-optimized X14 servers leveraging the Xeon 6700 Series Processors with E-cores launched in June 2024, today's additions bring maximum compute density and power to the Supermicro X14 lineup to create the industry's broadest range of optimized servers supporting a wide variety of workloads from demanding AI, HPC, media, and virtualization to energy-efficient edge, scale-out cloud-native, and microservices applications.

"Supermicro X14 systems have been completely re-engineered to support the latest technologies including next-generation CPUs, GPUs, highest bandwidth and lowest latency with MRDIMMs, PCIe 5.0, and EDSFF E1.S and E3.S storage," said Charles Liang, president and CEO of Supermicro. "Not only can we now offer more than 15 families, but we can also use these designs to create customized solutions with complete rack integration services and our in-house developed liquid cooling solutions."

GIGABYTE Intros Performance Optimized Servers Using Intel Xeon 6900-series with P-core

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today announced its first wave of GIGABYTE servers for Intel Xeon 6 Processors with P-cores. This new Intel Xeon platform is engineered to optimize per-core performance for compute-intensive and AI-intensive workloads, as well as general-purpose applications. GIGABYTE servers for these workloads are built to achieve the best possible performance by fine-tuning the server design to the chip design and to specific workloads.

All new GIGABYTE servers support Intel Xeon 6900-series processors with P-cores that have up to 128 cores and up to 96 PCIe Gen 5 lanes. Additionally, for greater performance in memory-intensive workloads, the 6900-series expands to 12-channel memory and makes available up to 64 lanes of CXL 2.0. Overall, this modular SoC architecture has great potential, with the ability to leverage a shared platform for running both performance-optimized and efficiency-optimized architectures.

Intel Launches Gaudi 3 AI Accelerator and P-Core Xeon 6 CPU

As AI continues to revolutionize industries, enterprises are increasingly in need of infrastructure that is both cost-effective and available for rapid development and deployment. To meet this demand head-on, Intel today launched Xeon 6 with Performance-cores (P-cores) and Gaudi 3 AI accelerators, bolstering the company's commitment to deliver powerful AI systems with optimal performance per watt and lower total cost of ownership (TCO).

"Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software and developer tools," said Justin Hotard, Intel executive vice president and general manager of the Data Center and Artificial Intelligence Group. "With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency and security."

Intel Announces Deployment of Gaudi 3 Accelerators on IBM Cloud

IBM and Intel announced a global collaboration to deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. This offering, which is expected to be available in early 2025, aims to help more cost-effectively scale enterprise AI and drive innovation underpinned with security and resiliency. This collaboration will also enable support for Gaudi 3 within IBM's watsonx AI and data platform. IBM Cloud is the first cloud service provider (CSP) to adopt Gaudi 3, and the offering will be available for both hybrid and on-premise environments.

"Unlocking the full potential of AI requires an open and collaborative ecosystem that provides customers with choice and accessible solutions. By integrating Gaudi 3 AI accelerators and Xeon CPUs with IBM Cloud, we are creating new AI capabilities and meeting the demand for affordable, secure and innovative AI computing solutions," said Justin Hotard, Intel executive vice president and general manager of the Data Center and AI Group.

Supermicro Previews New Max Performance Intel-based X14 Servers

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, is previewing new, completely re-designed X14 server platforms which will leverage next-generation technologies to maximize performance for compute-intensive workloads and applications. Building on the success of Supermicro's efficiency-optimized X14 servers that launched in June 2024, the new systems feature significant upgrades across the board, supporting a never-before-seen 256 performance cores (P-cores) in a single node, memory support for MRDIMMs at up to 8800 MT/s, and compatibility with next-generation SXM, OAM, and PCIe GPUs. This combination can drastically accelerate AI and compute as well as significantly reduce the time and cost of large-scale AI training, high-performance computing, and complex data analytics tasks. Approved customers can secure early access to complete, full-production systems via Supermicro's Early Ship Program or for remote testing with Supermicro JumpStart.

"We continue to add to our already comprehensive Data Center Building Block solutions with these new platforms, which will offer unprecedented performance, and new advanced features," said Charles Liang, president and CEO of Supermicro. "Supermicro is ready to deliver these high-performance solutions at rack-scale with the industry's most comprehensive direct-to-chip liquid cooled, total rack integration services, and a global manufacturing capacity of up to 5,000 racks per month including 1,350 liquid cooled racks. With our worldwide manufacturing capabilities, we can deliver fully optimized solutions which accelerate our time-to-delivery like never before, while also reducing TCO."

Intel Dives Deep into Lunar Lake, Xeon 6, and Gaudi 3 at Hot Chips 2024

Demonstrating the depth and breadth of its technologies at Hot Chips 2024, Intel showcased advancements across AI use cases - from the data center, cloud and network to the edge and PC - while covering the industry's most advanced and first-ever fully integrated optical compute interconnect (OCI) chiplet for high-speed AI data processing. The company also unveiled new details about the Intel Xeon 6 SoC (code-named Granite Rapids-D), scheduled to launch during the first half of 2025.

"Across consumer and enterprise AI usages, Intel continuously delivers the platforms, systems and technologies necessary to redefine what's possible. As AI workloads intensify, Intel's broad industry experience enables us to understand what our customers need to drive innovation, creativity and ideal business outcomes. While more performant silicon and increased platform bandwidth are essential, Intel also knows that every workload has unique challenges: A system designed for the data center can no longer simply be repurposed for the edge. With proven expertise in systems architecture across the compute continuum, Intel is well-positioned to power the next generation of AI innovation." -Pere Monclus, chief technology officer, Network and Edge Group at Intel.

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution, designed to meet the evolving demands of AI-driven applications from edge, inference and generative AI to the new, unparalleled wave of AI supercomputing. ASUS has proven its expertise lies in striking the perfect balance between hardware and software, including infrastructure and cluster architecture design, server installation, testing, onboarding, remote management and cloud services - positioning the ASUS brand and AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and 5th Gen NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.

Intel Postpones Innovation 2024 Event to 2025, No Word on Arrow Lake Launch

Intel announced that it has postponed the 2024 edition of its Innovation event to 2025. Among other things, the first-party event showcases innovations from the company's various business units made in the preceding year, and includes a few key product launches and teasers for what's next. Innovation 2024 was poised to be particularly important for the company, as it was expected to launch its next-generation Core Ultra "Arrow Lake" processors not just for mobile, but also for the desktop platform. Other key showcase items would have included Xeon 6 server processors and the Gaudi 3 AI accelerator, besides updates from the company's foundry business, particularly the Intel 20A and Intel 18A nodes.

Intel's postponement of Innovation 2024 can be seen as a move to demonstrate sincerity that the company is working to meet its goal of cutting cost of revenue by $10 billion through FY 2024, something that will bear results by mid-2025. It would have probably felt inappropriate for the company to host a lavish product showcase event in light of this. That said, there's no word on how this affects the launch of products such as Core Ultra "Arrow Lake"; it's possible that the company may launch them in a low-key dedicated media presentation.

Intel Reports Q2-2024 Financial Results; Announces $10 Billion Cost Reduction Plan, Shares Fall 20%+

Intel Corporation today reported second-quarter 2024 financial results. "Our Q2 financial performance was disappointing, even as we hit key product and process technology milestones. Second-half trends are more challenging than we previously expected, and we are leveraging our new operating model to take decisive actions that will improve operating and capital efficiencies while accelerating our IDM 2.0 transformation," said Pat Gelsinger, Intel CEO. "These actions, combined with the launch of Intel 18A next year to regain process technology leadership, will strengthen our position in the market, improve our profitability and create shareholder value."

"Second-quarter results were impacted by gross margin headwinds from the accelerated ramp of our AI PC product, higher than typical charges related to non-core businesses and the impact from unused capacity," said David Zinsner, Intel CFO. "By implementing our spending reductions, we are taking proactive steps to improve our profits and strengthen our balance sheet. We expect these actions to meaningfully improve liquidity and reduce our debt balance while enabling us to make the right investments to drive long-term value for shareholders."

Intel Submits Gaudi 2 Results on MLCommons' Newest Benchmark

Today, MLCommons published results of its industry AI performance benchmark, MLPerf Training v4.0. Intel's results demonstrate the choice that Intel Gaudi 2 AI accelerators give enterprises and customers. Community-based software simplifies generative AI (GenAI) development and industry-standard Ethernet networking enables flexible scaling of AI systems. For the first time on the MLPerf benchmark, Intel submitted results on a large Gaudi 2 system (1,024 Gaudi 2 accelerators) trained in Intel Tiber Developer Cloud to demonstrate Gaudi 2 performance and scalability and Intel's cloud capacity for training MLPerf's GPT-3 175B parameter benchmark model.

"The industry has a clear need: address the gaps in today's generative AI enterprise offerings with high-performance, high-efficiency compute options. The latest MLPerf results published by MLCommons illustrate the unique value Intel Gaudi brings to market as enterprises and customers seek more cost-efficient, scalable systems with standard networking and open software, making GenAI more accessible to more customers," said Zane Ball, Intel corporate vice president and general manager, DCAI Product Management.

Intel Ponte Vecchio Waves Goodbye, Company Focuses on Falcon Shores for 2025 Release

According to ServeTheHome, Intel has decided to discontinue its high-performance computing (HPC) product line, Ponte Vecchio, and shift its focus towards developing its next-generation data center GPU, codenamed Falcon Shores. This decision comes as Intel aims to streamline its operations and concentrate its resources on the most promising and competitive offerings. The Ponte Vecchio GPU, released in January of 2023, was intended to be Intel's flagship product for the HPC market, competing against the likes of NVIDIA's H100 and AMD's Instinct MI series. However, despite its impressive specifications and features, Ponte Vecchio faced significant delays and challenges in its development and production cycle. Intel's decision to abandon Ponte Vecchio is pragmatic, recognizing the intense competition and rapidly evolving landscape of the data center GPU market.

By pivoting its attention to Falcon Shores, Intel aims to deliver a more competitive and cutting-edge solution that can effectively challenge the dominance of its rivals. Falcon Shores, slated for release in 2025, is expected to leverage Intel's latest process node and architectural innovations. Currently, Intel has Gaudi 2 and Gaudi 3 accelerators for AI. However, the HPC segment is left without a clear leader in the company's product offerings. Intel's Ponte Vecchio is powering the Aurora exascale supercomputer, which is the latest submission to the TOP500 supercomputer list. The decision also comes after the cancellation of Rialto Bridge, which was supposed to be an HPC-focused card. In the future, the company will focus only on the Falcon Shores accelerator, which will unify HPC and AI needs for high-precision FP64 and lower-precision FP16/INT8.

Intel Launches Gaudi 3 AI Accelerator: 70% Faster Training, 50% Faster Inference Compared to NVIDIA H100, Promises Better Efficiency Too

During the Vision 2024 event, Intel announced its latest Gaudi 3 AI accelerator, promising significant improvements over its predecessor. Intel claims the Gaudi 3 offers up to 70% better training performance, 50% better inference, and 40% better efficiency than NVIDIA's H100 processors. The new AI accelerator is presented as a PCIe Gen 5 dual-slot add-in card with a 600 W TDP or an OAM module with 900 W. The PCIe card has the same peak 1,835 TeraFLOPS of FP8 performance as the OAM module despite a 300 W lower TDP. The lower TDP will likely result in lower sustained performance, but it confirms that the same silicon is used, just tuned to a lower frequency. The PCIe version works as a group of four per system, while the OAM HL-325L modules can be run in an eight-accelerator configuration per server. Built on TSMC's 5 nm N5 node, the AI accelerator features 64 Tensor Cores, delivering double the FP8 and quadruple the FP16 performance of the previous-generation Gaudi 2.

The Gaudi 3 AI chip comes with 128 GB of HBM2E with 3.7 TB/s of bandwidth and twenty-four 200 Gbps Ethernet NICs, with dual 400 Gbps NICs used for scale-out. All of that is laid out on 10 tiles that make up the Gaudi 3 accelerator, which you can see pictured below. There is 96 MB of SRAM split between two compute tiles, which acts as a low-level cache that bridges data communication between Tensor Cores and HBM memory. Intel also announced support for the new performance-boosting standardized MXFP4 data format and is developing an AI NIC ASIC for Ultra Ethernet Consortium-compliant networking. The Gaudi 3 supports clusters of up to 8,192 cards, built from 1,024 nodes of eight accelerators each. It is on track for volume production in Q3, offering a cost-effective alternative to NVIDIA accelerators with the additional promise of a more open ecosystem. More information and a deeper dive can be found in the Gaudi 3 Whitepaper.
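The scale-out numbers in this paragraph follow from simple multiplication, sketched below. The function names are hypothetical. The aggregate Ethernet figure is only the sum of raw link rates across the twenty-four ports quoted above, not a measured or usable throughput.

```python
def cluster_cards(nodes: int, cards_per_node: int) -> int:
    """Total accelerators in a cluster of identical nodes."""
    return nodes * cards_per_node


def aggregate_ethernet_gbps(ports: int, gbps_per_port: int) -> int:
    """Sum of raw per-port link rates on one accelerator, in Gbps."""
    return ports * gbps_per_port


cards = cluster_cards(nodes=1024, cards_per_node=8)
per_card_gbps = aggregate_ethernet_gbps(ports=24, gbps_per_port=200)

print(cards)          # 8192
print(per_card_gbps)  # 4800
```

The 8,192-card maximum is therefore 1,024 eight-accelerator nodes, and the twenty-four 200 Gbps ports add up to 4,800 Gbps of raw Ethernet link rate per accelerator.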

Intel Unleashes Enterprise AI with Gaudi 3, AI Open Systems Strategy and New Customer Wins

At the Intel Vision 2024 customer and partner conference, Intel introduced the Intel Gaudi 3 accelerator to bring performance, openness and choice to enterprise generative AI (GenAI), and unveiled a suite of new open scalable systems, next-gen products and strategic collaborations to accelerate GenAI adoption. With only 10% of enterprises successfully moving GenAI projects into production last year, Intel's latest offerings address the challenges businesses face in scaling AI initiatives.

"Innovation is advancing at an unprecedented pace, all enabled by silicon - and every company is quickly becoming an AI company," said Intel CEO Pat Gelsinger. "Intel is bringing AI everywhere across the enterprise, from the PC to the data center to the edge. Our latest Gaudi, Xeon and Core Ultra platforms are delivering a cohesive set of flexible solutions tailored to meet the changing needs of our customers and partners and capitalize on the immense opportunities ahead."
Dec 21st, 2024 23:17 EST
