News Posts matching #accelerator


AMD Instinct GPUs are Ready to Take on Today's Most Demanding AI Models

Customers evaluating AI infrastructure today rely on a combination of industry-standard benchmarks and real-world model performance metrics—such as those from Llama 3.1 405B, DeepSeek-R1, and other leading open-source models—to guide their GPU purchase decisions. At AMD, we believe that delivering value across both dimensions is essential to driving broader AI adoption and real-world deployment at scale. That's why we take a holistic approach—optimizing performance for rigorous industry benchmarks like MLPerf while also enabling Day 0 support and rapid tuning for the models most widely used in production by our customers.

This strategy helps ensure AMD Instinct GPUs deliver not only strong, standardized performance, but also high-throughput, scalable AI inferencing across the latest generative and language models used by customers. We will explore how AMD's continued investment in benchmarking, open model enablement, software and ecosystem tools helps unlock greater value for customers—from MLPerf Inference 5.0 results to Llama 3.1 405B and DeepSeek-R1 performance, ROCm software advances, and beyond.

IBM & Intel Announce the Availability of Gaudi 3 AI Accelerators on IBM Cloud

Yesterday, at Intel Vision 2025, IBM announced the availability of Intel Gaudi 3 AI accelerators on IBM Cloud. This offering delivers Intel Gaudi 3 in a public cloud environment for production workloads. Through this collaboration, IBM Cloud aims to help clients more cost-effectively scale and deploy enterprise AI. Intel Gaudi 3 AI accelerators on IBM Cloud are currently available in Frankfurt (eu-de) and Washington, D.C. (us-east) IBM Cloud regions, with future availability for the Dallas (us-south) IBM Cloud region in Q2 2025.

IBM's AI in Action 2024 report found that 67% of surveyed leaders reported revenue increases of 25% or more due to including AI in business operations. Although AI is demonstrating promising revenue increases, enterprises are also balancing the costs associated with the infrastructure needed to drive performance. By leveraging Intel Gaudi 3 on IBM Cloud, the two companies aim to help clients more cost-effectively test, innovate, and deploy generative AI solutions. "By bringing Intel Gaudi 3 AI accelerators to IBM Cloud, we're enabling businesses to help scale generative AI workloads with optimized performance for inferencing and fine-tuning. This collaboration underscores our shared commitment to making AI more accessible and cost-effective for enterprises worldwide," said Saurabh Kulkarni, Vice President, Datacenter AI Strategy and Product Management, Intel.

SMIC Reportedly On Track to Finalize 5 nm Process in 2025, Projected to Cost 40-50% More Than TSMC Equivalent

According to a report produced by semiconductor industry analysts at Kiwoom Securities—a South Korean financial services firm—Semiconductor Manufacturing International Corporation (SMIC) is expected to complete the development of a 5 nm process at some point in 2025. Jukanlosreve summarized this projection in a recent social media post. SMIC is often considered to be China's flagship foundry business; the partially state-owned organization seems to be heavily involved in the production of (rumored) next-gen Huawei Ascend 910 AI accelerators. SMIC foundry employees have reportedly struggled to break beyond a 7 nm manufacturing barrier, due to a lack of readily accessible cutting-edge EUV equipment. As covered on TechPowerUp last month, leading lights within China's semiconductor industry are (allegedly) developing lithography solutions for cutting-edge 5 nm and 3 nm wafer production.

Huawei is reportedly evaluating an in-house developed laser-induced discharge plasma (LDP)-based machine, but finalized equipment will not be ready until 2026—at least for mass production purposes. Jukanlosreve's short interpretation of Kiwoom's report reads as follows: "(SMIC) achieved mass production of the 7 nm (N+2) process without EUV and completed the development of the 5 nm process to support the mass production of the Huawei Ascend 910C. The cost of SMIC's 5 nm process is 40-50% higher than TSMC's, and its yield is roughly one-third." The nation's foundries are reliant on older ASML equipment, and are thus unable to produce products that can compete with the advanced (volume and quality) output of "global" TSMC and Samsung chip manufacturing facilities. The fresh unveiling of SiCarrier's Color Mountain series has signalled a promising new era for China's foundry industry.

NVIDIA H20 AI GPU at Risk in China, Due to Revised Energy-efficiency Guidelines & Supply Problems

NVIDIA's supply of the Chinese market-exclusive H20 AI GPU faces an uncertain future, due to recently introduced energy-efficiency guidelines. As covered over a year ago, Team Green readied a regional alternative to its "full fat" H800 "Hopper" AI GPU—designed and/or neutered to comply with US sanctions. Despite being less performant than its Western siblings, the H20 model proved to be highly popular by mid-2024—industry analysis projected "$12 billion in take-home revenue" for NVIDIA. According to a fresh Reuters news piece, demand for the cut-down "Hopper" hardware has surged throughout early 2025. The report cites "a rush to adopt Chinese AI startup DeepSeek's cost-effective AI models" as the main cause behind an increased snap-up rate of H20 chips, with the nation's "big three" AI players—Tencent, Alibaba and ByteDance—driving the majority of sales.

The supply of H20 AI GPUs seems to be under threat on several fronts; Reuters points out that "U.S. officials were considering curbs on sales of H20 chips to China" back in January. Returning to the present day, their report sources "unofficial" statements from H3C—one of China's largest server equipment manufacturers and a key OEM partner for NVIDIA. An anonymous company insider outlined a murky outlook: "H20's international supply chain faces significant uncertainties...We were told the chips would be available, but when it came time to actually purchase them, we were informed they had already been sold at higher prices." More (rumored) bad news has arrived in the shape of alleged Chinese government intervention—the Financial Times posits that local regulators have privately advised that Tencent, Alibaba and ByteDance not purchase NVIDIA H20 chips.

Marvell Demonstrates Industry's First End-to-End PCIe Gen 6 Over Optics at OFC 2025

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, today announced in collaboration with TeraHop, a global optical solutions provider for AI driven data centers, the demonstration of the industry's first end-to-end PCIe Gen 6 over optics in the Marvell booth #2129 at OFC 2025. The demonstration will showcase the extension of PCIe reach beyond traditional electrical limits to enable low-latency, standards-based AI scale-up infrastructure.

As AI workloads drive exponential data growth, PCIe connectivity must evolve to support higher bandwidth and longer reach. The Marvell Alaska P PCIe Gen 6 retimer and its PCIe Gen 7 SerDes technology enable low-latency, low bit-error-rate transmission over optical fiber, delivering the scalability, power efficiency, and high performance required for next-generation accelerated infrastructure. With PCIe over optics, system designers will be able to take advantage of longer links between devices that feature the low latency of PCIe technology.

PCI-SIG Ratifies PCI Express 7.0 Specification to Reach 128 GT/s

The AI data center buildout requires massive bandwidth from accelerator to accelerator and from accelerator to CPU. At the core of that bandwidth bridge is PCIe technology, which constantly needs to evolve to satisfy massive bandwidth requirements. Today, PCI-SIG, the working group behind the PCI and PCIe standards, is releasing details about the almost-ready 0.9 version of the PCIe 7.0 specification and its final feature set. PCIe 7.0 will bring 128 GT/s speeds, with a bi-directional bandwidth of 512 GB/s in the x16 lane configuration. Target applications such as 800G Ethernet, AI/ML, cloud and quantum computing, hyperscalers, and military/aerospace all need massive bandwidth for their respective use cases to work flawlessly.

Interestingly, as PCIe doubles bandwidth over its traditional three-year cadence, high bandwidth for things like storage is becoming available on fewer and fewer lanes. For example, PCIe 3.0 with x16 lanes delivers 32 GB/s of bi-directional bandwidth, and PCIe 7.0 now delivers that same bandwidth on only a single x1 lane. Other goals of PCIe 7.0 include significant improvements in channel parameters and signal integrity while enhancing power efficiency and maintaining the protocol's low-latency characteristics—all while ensuring complete backward compatibility with previous generations of the standard. Notably, the PCIe 7.0 standard uses PAM4 signaling, which was first introduced with PCIe 6.0. We expect to see the final v1.0 specification by the end of the year, and some PCIe 7.0 accelerators next year.
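The lane-count arithmetic above can be sketched in a few lines of Python. The per-generation transfer rates are the published figures; encoding and protocol overhead (128b/130b for PCIe 3.0, PAM4 with FLIT encoding for 6.0 and 7.0) is deliberately ignored for this back-of-envelope comparison:

```python
# Raw transfer rate per lane, in GT/s, for each PCIe generation
RATES_GTS = {3: 8, 4: 16, 5: 32, 6: 64, 7: 128}

def bidirectional_gbs(gen: int, lanes: int) -> float:
    """Approximate bi-directional bandwidth in GB/s for a PCIe link.

    1 GT/s is roughly 1 Gbit/s per lane per direction (encoding overhead
    ignored); divide by 8 for bytes, multiply by 2 for both directions.
    """
    return RATES_GTS[gen] * lanes / 8 * 2

# PCIe 3.0 x16 and PCIe 7.0 x1 land on the same ~32 GB/s figure
print(bidirectional_gbs(3, 16))  # 32.0
print(bidirectional_gbs(7, 1))   # 32.0
# The headline PCIe 7.0 x16 number from the announcement
print(bidirectional_gbs(7, 16))  # 512.0
```

This also makes the three-year doubling cadence visible: each generation step doubles the GT/s figure, so the same bandwidth needs half as many lanes.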

Global Top 10 IC Design Houses See 49% YoY Growth in 2024, NVIDIA Commands Half the Market

TrendForce reveals that the combined revenue of the world's top 10 IC design houses reached approximately US$249.8 billion in 2024, marking a 49% YoY increase. The booming AI industry has fueled growth across the semiconductor sector, with NVIDIA leading the charge, posting an astonishing 125% revenue growth, widening its lead over competitors, and solidifying its dominance in the IC industry.

Looking ahead to 2025, advancements in semiconductor manufacturing will further enhance AI computing power, with LLMs continuing to emerge. Open-source models like DeepSeek could lower AI adoption costs, accelerating AI penetration from servers to personal devices. This shift positions edge AI devices as the next major growth driver for the semiconductor industry.

AMD Recommends EPYC Processors for Everyday AI Server Tasks

Ask a typical IT professional today whether they're leveraging AI, and there's a good chance they'll say yes; after all, they have reputations to protect! Kidding aside, many will report that their teams use web-based tools like ChatGPT, or even have internal chatbots that serve their employee base on the intranet, but that beyond this, not much AI is really being implemented at the infrastructure level. As it turns out, the true answer is a bit different. AI tools and techniques have embedded themselves firmly into standard enterprise workloads and are a more common, everyday phenomenon than even many IT people may realize. Assembly line operations now include computer vision-powered inspections. Supply chains use AI for demand forecasting, making business move faster. And, of course, AI note-taking and meeting summaries are embedded in virtually all variants of collaboration and meeting software.

Increasingly, critical enterprise software tools incorporate built-in recommendation systems, virtual agents, or some other form of AI-enabled assistance. AI is truly becoming a pervasive, complementary tool for everyday business. At the same time, today's enterprises are navigating a hybrid landscape where traditional, mission-critical workloads coexist with innovative AI-driven tasks. This "mixed enterprise and AI" workload environment calls for infrastructure that can handle both types of processing seamlessly. Robust, general-purpose CPUs like AMD EPYC processors are designed to be powerful, secure, and flexible to address this need. They handle everyday tasks—running databases, web servers, ERP systems—and offer strong security features crucial for enterprise operations augmented with AI workloads. In essence, modern enterprise infrastructure is about creating a balanced ecosystem. AMD EPYC CPUs play a pivotal role in creating this balance, delivering high performance, efficiency, and security features that underpin both traditional enterprise workloads and advanced AI operations.

Advantech Launches Next-Gen Edge AI Solutions Powered by the AMD Compute Portfolio

Advantech, a global leader in intelligent IoT systems and embedded platforms, is excited to introduce its latest AIR series Edge AI systems, powered by the comprehensive AMD compute portfolio. These next-generation solutions leverage AMD Ryzen and EPYC processors alongside Instinct MI210 accelerators and Radeon PRO GPUs, delivering exceptional AI computing performance for demanding edge applications.

"Advantech and AMD continue to strengthen our collaboration in the Edge AI era, integrating advanced CPU platforms with high-performance AI accelerators and GPU solutions," said Aaron Su, Vice President of Advantech Embedded IoT Group. "This joint effort enables cutting-edge computing power to meet the demands of the rapidly evolving embedded AI applications."

Qualcomm Targets Bolstering of AI & IoT Capabilities with Edge Impulse Acquisition

At Embedded World in Germany, Qualcomm Technologies, Inc. announced it has entered into an agreement to acquire Edge Impulse Inc., which will enhance its offering for developers and expand its leadership in AI capabilities to power AI-enabled products and services across IoT. The closing of the deal is subject to customary closing conditions. The acquisition is anticipated to complement Qualcomm Technologies' strategic approach to IoT transformation, which includes a comprehensive chipset roadmap, unified software architecture, a suite of services, developer resources, ecosystem partners, comprehensive solutions, and IoT blueprints to address diverse industry needs and challenges.

"We are thrilled about the opportunity to significantly enhance our IoT offerings with Edge Impulse's advanced AI-powered end-to-end platform that will complement our strategic approach to IoT transformation," said Nakul Duggal, group general manager, automotive, industrial and embedded IoT, and cloud computing, Qualcomm Technologies, Inc. "We anticipate that this acquisition will strengthen our leadership in AI and developer enablement, enhancing our ability to provide comprehensive technology for critical sectors such as retail, security, energy and utilities, supply chain management, and asset management. IoT opens the door for a myriad of opportunities, and success is about building real-world solutions, enabling developers and enterprises with AI capabilities to extract intelligence from data, and providing them with the tools to build the applications and services that will power the digital transformation of industries."

Meta Reportedly Reaches Test Phase with First In-house AI Training Chip

According to a Reuters technology report, Meta's engineering department is testing its "first in-house chip for training artificial intelligence systems." Two inside sources describe this as a significant development milestone, involving a small-scale deployment of early samples. The owner of Facebook could ramp up production if the initial batches pass muster. Despite a recent showcasing of an open-architecture NVIDIA "Blackwell" GB200 system for enterprise, Meta leadership is reported to be pursuing proprietary solutions. Multiple big players in the field of artificial intelligence are attempting to break away from a total reliance on Team Green. Last month, press outlets concentrated on OpenAI's alleged finalization of an in-house design, with rumored involvement from Broadcom and TSMC.

One of the Reuters industry moles believes that Meta has signed up with TSMC—supposedly, the Taiwanese foundry was responsible for the production of test batches. Tom's Hardware reckons that Meta and Broadcom worked together on the tape-out of the social media giant's "first AI training accelerator." Development of the company's "Meta Training and Inference Accelerator" (MTIA) series has stretched back a couple of years; according to Reuters, the multi-part project "had a wobbly start for years, and at one point scrapped a chip at a similar phase of development...Meta last year, started using an MTIA chip to perform inference, or the process involved in running an AI system as users interact with it, for the recommendation systems that determine which content shows up on Facebook and Instagram news feeds." Leadership is reportedly aiming to get custom silicon solutions up and running for AI training by next year. Past examples of MTIA hardware were deployed with open-source RISC-V cores (for inference tasks), but it is not clear whether this architecture will form the basis of Meta's latest AI chip design.

Biostar Showcases IPC Products at Embedded World 2025

BIOSTAR, a leading manufacturer of IPC solutions, motherboards, graphics cards, and PC peripherals, is currently showcasing its latest technologies at Embedded World 2025, held from March 11-13 at NürnbergMesse, Germany.

At this year's exhibition, BIOSTAR's showcase features AI-powered industrial PCs and edge computing platforms that provide secure, efficient, and scalable solutions for industries like automation, smart cities, and human-machine interfaces (HMI). With cutting-edge offerings such as IPC motherboards, panel PCs, edge computing systems, and an NVIDIA Jetson Orin edge AI system, the products demonstrate BIOSTAR's edge AI and edge computing capabilities for critical industrial applications.

Huawei Obtained Two Million Ascend 910B Dies from TSMC via Shell Companies to Circumvent US Sanctions

According to a recent Center for Strategic and International Studies report, Huawei got its hands on approximately two million Ascend 910B logic dies through shell companies that misled TSMC. This acquisition violates US export controls designed to restrict China's access to advanced semiconductor technology. The report details how Huawei leveraged intermediaries to procure chiplets for its AI accelerators before TSMC discovered the deception and halted shipments. These components are critical for Huawei's AI hardware roadmap, which progressed from the original Ascend 910 (manufactured by TSMC on N7+ until 2020) to the domestically produced Ascend 910B and 910C chips fabricated at SMIC using first and second-generation 7 nm-class technologies, respectively. Huawei reportedly wanted TSMC-made dies because of manufacturing challenges in domestic chip production. The Ascend 910B and 910C reportedly suffer from poor yields, with approximately 25% of units failing during the advanced packaging process that combines compute dies with HBM memory.

Despite these challenges, the performance gap with market-leading solutions remains but has narrowed considerably, with the Ascend 910C reportedly delivering 60% of the NVIDIA H100's performance. Huawei has executed a strategic stockpiling initiative, particularly for high-bandwidth memory components. The company likely acquired substantial HBM inventory between August and December 2024, when restrictions on advanced memory sales to China were announced but not yet implemented. The semiconductor supply chain breach shows that enforcing technology export controls is challenging, and third parties can still purchase silicon for restricted companies. While Huawei continues building AI infrastructure for both internal projects and external customers, manufacturing constraints may limit its ability to scale deployments against competitors with access to more advanced manufacturing processes. Perhaps a future domestic EUV-based manufacturing flow will allow Huawei to access more advanced production at home, completely circumventing US-imposed restrictions.

Huawei Ascend AI Accelerator Production Yields Reportedly "Doubled" in Early 2025

Huawei is likely celebrating milestones on multiple fronts—as reported earlier this month, the Chinese technology manufacturer has pulled in record revenues and experienced consistent growth. Additionally, industry insiders believe that things are going well within the company's production pipeline. According to a Financial Times report, Huawei's next-generation AI accelerator model is on the way—the unannounced "Ascend 910C" is touted to directly compete with NVIDIA's H100 AI GPU. Industry moles believe that Huawei has partnered with SMIC for the manufacture of in-house accelerator designs. Whispers suggest a selection of the foundry's 7 nm N+2 process.

The alleged doubling of production yields (within a year)—from 20% to 40%—signals a significant achievement. As reported by FT, this milestone indicates that Huawei's Ascend chip production line has become profitable for the very first time. Two inside sources propose that Huawei and SMIC are targeting a 60% yield goal in the near future. Leaked plans suggest 2025 production tallies of roughly 100,000 Ascend 910C processors and 300,000 of the current-gen Ascend 910B chip.
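As a rough illustration of why that yield jump matters economically, consider the cost per sellable die. The wafer cost and dies-per-wafer below are hypothetical placeholders, not reported figures; only the 20% and 40% yield rates come from the report:

```python
def cost_per_good_die(wafer_cost: float, dies_per_wafer: int, yield_rate: float) -> float:
    """Effective cost of each sellable die at a given yield rate."""
    good_dies = dies_per_wafer * yield_rate
    return wafer_cost / good_dies

WAFER_COST = 10_000.0  # hypothetical cost per processed wafer, in USD
DIES = 60              # hypothetical candidate dies per wafer

at_20 = cost_per_good_die(WAFER_COST, DIES, 0.20)
at_40 = cost_per_good_die(WAFER_COST, DIES, 0.40)

# Doubling the yield halves the effective cost of every good die,
# which is why a 20% -> 40% move can flip a line into profitability.
assert at_20 == 2 * at_40
print(round(at_20, 2), round(at_40, 2))
```

The same arithmetic explains the 60% target: each further yield gain spreads the fixed wafer cost over more sellable parts.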

Reports Suggest DeepSeek Running Inference on Huawei Ascend 910C AI GPUs

Huawei's Ascend 910C AI chip was positioned as one of the better Chinese-developed alternatives to NVIDIA's H100 accelerator—reports from last autumn suggested that samples were being sent to highly important customers. The likes of Alibaba, Baidu, and Tencent have long relied on Team Green enterprise hardware for all manner of AI crunching, but trade sanctions have severely limited the supply and potency of Western-developed AI chips. NVIDIA's region-specific B20 "Blackwell" accelerator is due for release this year, but industry watchdogs reckon that the Ascend 910C AI GPU is a strong rival. The latest online rumblings have pointed to another major Huawei customer—DeepSeek—having Ascend silicon in their back pockets.

DeepSeek's recent unveiling of its R1 open-source large language model has disrupted international AI markets. A lot of press attention has focused on DeepSeek's CEO stating that his team can access up to 50,000 NVIDIA H100 GPUs, but many have not looked into the company's (alleged) pool of natively-made chips. Yesterday, Alexander Doria—an LLM enthusiast—shared an interesting insight: "I feel this should be a much bigger story—DeepSeek has trained on NVIDIA H800, but is running inference on the new home Chinese chips made by Huawei, the 910C." Experts believe that there will be a plentiful supply of Ascend 910C GPUs—estimates from last September posit that 70,000 chips (worth around $2 billion) were in the mass production pipeline. Additionally, industry whispers suggest that Huawei is already working on a—presumably, even more powerful—successor.

Numem to Showcase Next-Gen Memory Solutions at the Upcoming Chiplet Summit

Numem, an innovator focused on accelerating memory for AI workloads, will be at the upcoming Chiplet Summit to showcase its high-performance solutions. By accelerating the delivery of data via new memory subsystem designs, Numem solutions are re-architecting the hierarchy of AI memory tiers to eliminate the bottlenecks that negatively impact power and performance.

The rapid growth of AI workloads and AI processors/GPUs is exacerbating the memory bottleneck caused by the slowing performance improvements and scalability of SRAM and DRAM, presenting a major obstacle to maximizing system performance. To overcome this, there is a pressing need for intelligent memory solutions that offer higher power efficiency and greater bandwidth, coupled with a reevaluation of traditional memory architectures.

Supermicro Begins Volume Shipments of Max-Performance Servers Optimized for AI, HPC, Virtualization, and Edge Workloads

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, is commencing shipments of max-performance servers featuring Intel Xeon 6900 series processors with P-cores. The new systems feature a range of new and upgraded technologies with new architectures optimized for the most demanding high-performance workloads, including large-scale AI, cluster-scale HPC, and environments where a maximum number of GPUs are needed, such as collaborative design and media distribution.

"The systems now shipping in volume promise to unlock new capabilities and levels of performance for our customers around the world, featuring low latency, maximum I/O expansion providing high throughput with 256 performance cores per system, 12 memory channels per CPU with MRDIMM support, and high performance EDSFF storage options," said Charles Liang, president and CEO of Supermicro. "We are able to ship our complete range of servers with these new application-optimized technologies thanks to our Server Building Block Solutions design methodology. With our global capacity to ship solutions at any scale, and in-house developed liquid cooling solutions providing unrivaled cooling efficiency, Supermicro is leading the industry into a new era of maximum performance computing."

VeriSilicon Unveils Next-Gen Vitality Architecture GPU IP Series

VeriSilicon today announced the launch of its latest Vitality architecture Graphics Processing Unit (GPU) IP series, designed to deliver high-performance computing across a wide range of applications, including cloud gaming, AI PC, and both discrete and integrated graphics cards.

VeriSilicon's new generation Vitality GPU architecture delivers exceptional advancements in computational performance with scalability. It incorporates advanced features such as a configurable Tensor Core AI accelerator and a 32 MB to 64 MB Level 3 (L3) cache, offering both powerful processing and superior energy efficiency. Additionally, the Vitality architecture supports up to 128 channels of cloud gaming per core, addressing the needs of high-concurrency, high-image-quality cloud-based entertainment, while enabling large-scale desktop gaming and applications on Windows systems. With robust support for Microsoft DirectX 12 APIs and AI acceleration libraries, this architecture is ideally suited for a wide range of performance-intensive applications and complex computing workloads.

Synopsys Announces Industry's First Ultra Ethernet and UALink IP Solutions

Synopsys, Inc. today announced the industry's first Ultra Ethernet IP and UALink IP solutions, including controllers, PHYs, and verification IP, to meet the demand for standards-based, high-bandwidth, and low-latency HPC and AI accelerator interconnects. As hyperscale data center infrastructures evolve to support the processing of trillions of parameters in large language models, they must scale to hundreds of thousands of accelerators with highly efficient and fast connections. Synopsys Ultra Ethernet and UALink IP will provide a holistic, low-risk solution for high-speed and low-latency communication to scale-up and scale-out AI architectures.

"For more than 25 years, Synopsys has been at the forefront of providing best-in-class IP solutions that enable designers to accelerate the integration of standards-based functionality," said Neeraj Paliwal, senior vice president of IP product management at Synopsys. "With the industry's first Ultra Ethernet and UALink IP, companies can get a head start on developing a new generation of high-performance chips and systems with broad interoperability to scale future AI and HPC infrastructure."

NVIDIA Shows Future AI Accelerator Design: Silicon Photonics and DRAM on Top of Compute

During the prestigious IEDM 2024 conference, NVIDIA presented its vision for future AI accelerator design, which the company plans to pursue in upcoming accelerator iterations. Currently, the limits of chip packaging and silicon innovation are being stretched; future AI accelerators might need additional verticals to achieve the required performance improvements. The design proposed at IEDM 2024 puts silicon photonics (SiPh) at center stage. NVIDIA's architecture calls for 12 SiPh connections for intra-chip and inter-chip links, with three connections per GPU tile across four GPU tiles per tier. This marks a significant departure from traditional interconnect technologies, which have historically been limited by the natural properties of copper.

Perhaps the most striking aspect of NVIDIA's vision is the introduction of so-called "GPU tiers"—a novel approach that appears to stack GPU components vertically. This is complemented by an advanced 3D stacked DRAM configuration featuring six memory units per tile, enabling fine-grained memory access and substantially improved bandwidth. This stacked DRAM would have a direct electrical connection to the GPU tiles, mimicking the AMD 3D V-Cache on a larger scale. However, the timeline for implementation reflects the significant technological hurdles that must be overcome. The scale-up of silicon photonics manufacturing presents a particular challenge, with NVIDIA requiring the capacity to produce over one million SiPh connections monthly to make the design commercially viable. NVIDIA has invested in Lightmatter, which builds photonic packages for scaling compute, so some form of its technology could end up in future NVIDIA accelerators.

"Jaguar Shores" is Intel's Successor to "Falcon Shores" Accelerator for AI and HPC

Intel has prepared "Jaguar Shores," its "next-next" generation AI and HPC accelerator, successor to its upcoming "Falcon Shores" GPU. Revealed during a technical workshop at the SC2024 conference, the chip was unveiled by Intel's Habana Labs division, albeit unintentionally. This announcement positions Jaguar Shores as the successor to Falcon Shores, which is scheduled to launch next year. While details about Jaguar Shores remain sparse, its designation suggests it could be a general-purpose GPU (GPGPU) aimed at AI training, inferencing, and HPC tasks. Intel's strategy aligns with its push to incorporate advanced manufacturing nodes, such as the 18A process featuring RibbonFET and backside power delivery, which promise significant efficiency gains, so we can expect to see upcoming AI accelerators incorporating these technologies.

Intel's AI chip lineup has faced numerous challenges, including shifting plans for Falcon Shores, which has transitioned from a CPU-GPU hybrid to a standalone GPU, and the cancellation of Ponte Vecchio. Despite financial constraints and job cuts, Intel has maintained its focus on developing cutting-edge AI solutions. "We continuously evaluate our roadmap to ensure it aligns with the evolving needs of our customers. While we don't have any new updates to share, we are committed to providing superior enterprise AI solutions across our CPU and accelerator/GPU portfolio," an Intel spokesperson stated. The announcement of Jaguar Shores shows Intel's determination to remain competitive. However, the company faces steep competition: NVIDIA and AMD continue to set benchmarks with performant designs, while Intel has struggled to capture a significant share of the AI training market. The company's Gaudi lineup ends with the third generation, and Gaudi IP will be integrated into Falcon Shores.

ASUS Presents All-New Storage-Server Solutions to Unleash AI Potential at SC24

ASUS today announced its groundbreaking next-generation infrastructure solutions at SC24, featuring a comprehensive lineup powered by AMD and Intel, as well as liquid-cooling solutions designed to accelerate the future of AI. By continuously pushing the limits of innovation, ASUS simplifies the complexities of AI and high-performance computing (HPC) through adaptive server solutions paired with expert cooling and software-development services, tailored for the exascale era and beyond. As a total-solution provider with a distinguished history in pioneering AI supercomputing, ASUS is committed to delivering exceptional value to its customers.

Comprehensive Lineup for AI and HPC Success
To fuel enterprise digital transformation through HPC and AI-driven architecture, ASUS provides a full lineup of server systems powered by AMD and Intel. Startups, research institutions, large enterprises, and government organizations can all find adaptive solutions to unlock value from big data and accelerate business agility.

IBM Expands Its AI Accelerator Offerings; Announces Collaboration With AMD

IBM and AMD have announced a collaboration to deploy AMD Instinct MI300X accelerators as a service on IBM Cloud. This offering, expected to be available in the first half of 2025, aims to enhance performance and power efficiency for generative AI models and high-performance computing (HPC) applications for enterprise clients. The collaboration will also enable support for AMD Instinct MI300X accelerators within IBM's watsonx AI and data platform, as well as Red Hat Enterprise Linux AI inferencing support.

"As enterprises continue adopting larger AI models and datasets, it is critical that the accelerators within the system can process compute-intensive workloads with high performance and flexibility to scale," said Philip Guido, executive vice president and chief commercial officer, AMD. "AMD Instinct accelerators combined with AMD ROCm software offer wide support including IBM watsonx AI, Red Hat Enterprise Linux AI and Red Hat OpenShift AI platforms to build leading frameworks using these powerful open ecosystem tools. Our collaboration with IBM Cloud will aim to allow customers to execute and scale Gen AI inferencing without hindering cost, performance or efficiency."

TSMC Cuts Off Chinese Firm For Reportedly Shipping to Sanctioned Huawei

According to a recent Reuters report, TSMC has decided to cut off the Chinese firm Sophgo following the discovery of TSMC-manufactured components in Huawei's advanced AI processor. The suspension came after technology research firm TechInsights identified a TSMC-manufactured chip within Huawei's Ascend 910B processor during a detailed analysis. This discovery raised significant concerns, as Huawei has been restricted from accessing such technology under US export controls since 2020. TSMC promptly notified US authorities upon learning of the situation and launched an internal investigation. Being sanctioned by the US, Huawei would have needed a proxy firm to access high-end silicon manufacturing for its Ascend accelerators.

Sophgo, which has ties to cryptocurrency mining equipment manufacturer Bitmain, strongly denies any business relationship with Huawei. The company states it has provided TSMC with a detailed investigation report asserting its compliance with all applicable laws, saying: "SOPHGO has never been engaged in any direct or indirect business relationship with Huawei. SOPHGO has been conducting business in strict compliance with applicable laws and regulations, including but not limited to all the applicable US national export control laws and regulations, and has never been in violation of any of such laws and regulations. SOPHGO has provided detailed investigation report to TSMC to prove that SOPHGO is not related to the Huawei investigation."

Arm and Partners Develop AI CPU: Neoverse V3 CSS Made on 2 nm Samsung GAA FET

Yesterday, Arm announced significant progress in its Total Design initiative. The program, launched a year ago, aims to accelerate the development of custom silicon for data centers by fostering collaboration among industry partners. The ecosystem has now grown to include nearly 30 participating companies, with recent additions such as Alcor Micro, Egis, PUF Security, and SEMIFIVE. A notable development is a partnership between Arm, Samsung Foundry, ADTechnology, and Rebellions to create an AI CPU chiplet platform. This collaboration aims to deliver a solution for cloud, HPC, and AI/ML workloads, combining Rebellions' AI accelerator with ADTechnology's compute chiplet, implemented using Samsung Foundry's 2 nm Gate-All-Around (GAA) FET technology. The platform is expected to offer significant efficiency gains for generative AI workloads, with estimates suggesting a 2-3x improvement over a standard CPU design for LLMs like Llama 3.1 with 405 billion parameters.

Arm's approach emphasizes the importance of CPU compute in supporting the complete AI stack, including data pre-processing, orchestration, and advanced techniques like Retrieval-Augmented Generation (RAG). The company's Compute Subsystems (CSS) are designed to address these requirements, providing a foundation for partners to build diverse chiplet solutions. Several companies, including Alcor Micro and Alphawave, have already announced plans to develop CSS-powered chiplets for various AI and high-performance computing applications. The initiative also focuses on software readiness, ensuring that major frameworks and operating systems are compatible with Arm-based systems. Recent efforts include the introduction of Arm Kleidi technology, which optimizes CPU-based inference for open-source projects like PyTorch and llama.cpp. Notably, Google claims that most AI workloads are inferenced on CPUs, so building the most efficient and performant CPUs for AI makes a lot of sense.