News Posts matching #AI


Intel Accelerates AI Everywhere with Launch of Powerful Next-Gen Products

At its "AI Everywhere" launch in New York City today, Intel introduced an unmatched portfolio of AI products to enable customers' AI solutions everywhere—across the data center, cloud, network, edge and PC. "AI innovation is poised to raise the digital economy's impact up to as much as one-third of global gross domestic product," said Intel CEO Pat Gelsinger. "Intel is developing the technologies and solutions that empower customers to seamlessly integrate and effectively run AI in all their applications—in the cloud and, increasingly, locally at the PC and edge, where data is generated and used."

Gelsinger showcased Intel's expansive AI footprint, spanning cloud and enterprise servers to networks, volume clients and ubiquitous edge environments. He also reinforced that Intel is on track to deliver five new process technology nodes in four years. "Intel is on a mission to bring AI everywhere through exceptionally engineered platforms, secure solutions and support for open ecosystems. Our AI portfolio gets even stronger with today's launch of Intel Core Ultra ushering in the age of the AI PC and AI-accelerated 5th Gen Xeon for the enterprise," Gelsinger said.

Intel's New 5th Gen "Emerald Rapids" Xeon Processors are Built with AI Acceleration in Every Core

Today at the "AI Everywhere" event, Intel launched its 5th Gen Intel Xeon processors (code-named Emerald Rapids) that deliver increased performance per watt and lower total cost of ownership (TCO) across critical workloads for artificial intelligence, high performance computing (HPC), networking, storage, database and security. This launch marks the second Xeon family upgrade in less than a year, offering customers more compute and faster memory at the same power envelope as the previous generation. The processors are software- and platform-compatible with 4th Gen Intel Xeon processors, allowing customers to upgrade and maximize the longevity of infrastructure investments while reducing costs and carbon emissions.

"Designed for AI, our 5th Gen Intel Xeon processors provide greater performance to customers deploying AI capabilities across cloud, network and edge use cases. As a result of our long-standing work with customers, partners and the developer ecosystem, we're launching 5th Gen Intel Xeon on a proven foundation that will enable rapid adoption and scale at lower TCO." -Sandra Rivera, Intel executive vice president and general manager of Data Center and AI Group.

United States Eases Stance on NVIDIA AI Chip Exports to China

The United States is softening restrictions on NVIDIA's sales of artificial intelligence chips to China. While still limiting exports of advanced chips deemed strategically threatening, Commerce Secretary Gina Raimondo clarified this week that NVIDIA could supply some AI processors to Chinese commercial companies. Previously, Raimondo had sharply criticized NVIDIA for attempting to sidestep regulations on selling powerful GPUs abroad. Her comments followed rumors that NVIDIA tweaked chip designs to narrowly avoid newly imposed export controls. However, after discussions between Raimondo and NVIDIA CEO Jensen Huang, the Commerce Department says NVIDIA and other US firms will be permitted to export AI chips to China for general commercial use cases. Exports are still banned for the very highest-end GPUs that could enable China to train advanced AI models rivaling American developments.

Raimondo said NVIDIA will collaborate with the US to comply with the export rules. Huang reaffirmed the company's commitment to adherence. The clarification may ease pressures on NVIDIA, as China accounts for up to 25% of its revenue. While optimistic about recent Chinese approvals for US joint ventures, Raimondo noted frustrations linger around technology controls integral to national security. The nuanced recalibration of restrictions illustrates the balances the administration must strike between economic and security interests. As one of the first big US technology exporters impacted by tightened restrictions, NVIDIA's ability to still partly supply the valuable Chinese chip market points to a selective enforcement approach from regulators in the future.

Zyxel Networks Announces Availability of 22 Gbps WiFi 7 Access Point

Zyxel Networks, a leader in delivering secure, AI- and cloud-powered business and home networking solutions, today announced the availability of WBE660S WiFi 7 BE22000 Triple-Radio NebulaFlex Pro Access Point, its first WiFi 7 access point for managed service providers (MSPs) and small- to medium-sized businesses (SMBs).

The enterprise-grade WBE660S features a triple-radio BE22000 architecture and utilizes a wider 320 MHz channel to deliver speeds up to five times faster than WiFi 6/6E solutions. The WBE660S enhances the user experience by providing seamless, low-latency connectivity to optimize high-bandwidth applications such as video streaming, broadcasting, online gaming, and VR/AR.

China Continues to Enhance AI Chip Self-Sufficiency, but High-End AI Chip Development Remains Constrained

Huawei's subsidiary HiSilicon has made significant strides in the independent R&D of AI chips, launching the next-gen Ascend 910B. These chips are utilized not only in Huawei's public cloud infrastructure but also sold to other Chinese companies. This year, Baidu ordered over a thousand Ascend 910B chips from Huawei to build approximately 200 AI servers. Additionally, in August, Chinese company iFlytek, in partnership with Huawei, released the "Gemini Star Program," a hardware and software integrated device for exclusive enterprise LLMs, equipped with the Ascend 910B AI acceleration chip, according to TrendForce's research.

TrendForce conjectures that the next-generation Ascend 910B chip is likely manufactured using SMIC's N+2 process. However, the production faces two potential risks. Firstly, as Huawei recently focused on expanding its smartphone business, the N+2 process capacity at SMIC is almost entirely allocated to Huawei's smartphone products, potentially limiting future capacity for AI chips. Secondly, SMIC remains on the Entity List, possibly restricting access to advanced process equipment.

NVIDIA CFO Hints at Intel Foundry Services Partnership

NVIDIA CFO Colette Kress, responding to a question in the Q&A session of the recent UBS Global Technology Conference, hinted at the possibility of NVIDIA onboarding a third semiconductor foundry partner besides its current TSMC and Samsung, with the implication being Intel Foundry Services (IFS). "We would love a third one. And that takes a work of what are they interested in terms of the services. Keep in mind, there is other ones that may come to the U.S. TSMC in the U.S. may be an option for us as well. Not necessarily different, but again in terms of the different region. Nothing that stops us from potentially adding another foundry."

NVIDIA currently sources its chips from TSMC and Samsung. It uses the premier Taiwanese fab for its latest "Ada" GPUs and "Hopper" AI processors, while using Samsung for its older generation "Ampere" GPUs. The addition of IFS as a third foundry partner could improve the company's supply-chain resilience in an uncertain geopolitical environment, given that IFS fabs are predominantly based in the US and the EU.

No Overclocking and Lower TGP for NVIDIA GeForce RTX 4090 D Edition for China

NVIDIA is preparing to launch the GeForce RTX 4090 D, or "Dragon" edition, designed explicitly for China. To comply with the US export rules covering GPUs that could potentially be used for AI acceleration, the GeForce RTX 4090 D reportedly cuts back on overclocking as a feature. According to BenchLife, the AD102-250 GPU used in the RTX 4090 D will not support overclocking at all, with the capability possibly disabled in firmware and/or physically in the die. Information from @Zed__Wang suggests that the Dragon version will run at a 2280 MHz base frequency, higher than the 2235 MHz of the AD102-300 found in the regular RTX 4090, and a 2520 MHz boost, matching the regular version.

Interestingly, the RTX 4090 D for China will also feature a slightly lower Total Graphics Power (TGP) of 425 Watts, down from the 450 Watts of the regular model. With the memory configuration appearing to be the same, this new China-specific model will most likely perform within a few percent of the original design. The higher base frequency likely compensates for a slightly reduced CUDA core count, trimmed to comply with US export regulation policy while still serving the Chinese GPU market. The NVIDIA GeForce RTX 4090 D is scheduled for rollout in China in January 2024, just a few weeks away.

Supermicro Extends AI and GPU Rack Scale Solutions with Support for AMD Instinct MI300 Series Accelerators

Supermicro, Inc., a Total IT Solution Manufacturer for AI, Cloud, Storage, and 5G/Edge, is announcing three new additions to its AMD-based H13 generation of GPU Servers, optimized to deliver leading-edge performance and efficiency, powered by the new AMD Instinct MI300 Series accelerators. Supermicro's powerful rack scale solutions with 8-GPU servers with the AMD Instinct MI300X OAM configuration are ideal for large model training.

The new 2U liquid-cooled and 4U air-cooled servers featuring AMD Instinct MI300A accelerated processing units (APUs) are available now, improving data center efficiency and powering the fast-growing, complex demands in AI, LLM, and HPC. The new systems contain quad APUs for scalable applications. Supermicro can deliver complete liquid-cooled racks for large-scale environments with up to 1,728 TFlops of FP64 performance per rack. Supermicro's worldwide manufacturing facilities streamline the delivery of these new servers for AI and HPC convergence.

AMD Showcases Growing Momentum for AMD Powered AI Solutions from the Data Center to PCs

Today at the "Advancing AI" event, AMD was joined by industry leaders including Microsoft, Meta, Oracle, Dell Technologies, HPE, Lenovo, Supermicro, Arista, Broadcom and Cisco to showcase how these companies are working with AMD to deliver advanced AI solutions spanning from cloud to enterprise and PCs. AMD launched multiple new products at the event, including the AMD Instinct MI300 Series data center AI accelerators, ROCm 6 open software stack with significant optimizations and new features supporting Large Language Models (LLMs) and Ryzen 8040 Series processors with Ryzen AI.

"AI is the future of computing and AMD is uniquely positioned to power the end-to-end infrastructure that will define this AI era, from massive cloud installations to enterprise clusters and AI-enabled intelligent embedded devices and PCs," said AMD Chair and CEO Dr. Lisa Su. "We are seeing very strong demand for our new Instinct MI300 GPUs, which are the highest-performance accelerators in the world for generative AI. We are also building significant momentum for our data center AI solutions with the largest cloud companies, the industry's top server providers, and the most innovative AI startups, who we are working closely with to rapidly bring Instinct MI300 solutions to market that will dramatically accelerate the pace of innovation across the entire AI ecosystem."

AMD Delivers Leadership Portfolio of Data Center AI Solutions with AMD Instinct MI300 Series

Today, AMD announced the availability of the AMD Instinct MI300X accelerators - with industry leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inferencing - as well as the AMD Instinct MI300A accelerated processing unit (APU) - combining the latest AMD CDNA 3 architecture and "Zen 4" CPUs to deliver breakthrough performance for HPC and AI workloads.

"AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large scale cloud and enterprise deployments," said Victor Peng, president, AMD. "By leveraging our leadership hardware, software and open ecosystem approach, cloud providers, OEMs and ODMs are bringing to market technologies that empower enterprises to adopt and deploy AI-powered solutions."

AMD Ryzen 8040 Series "Hawk Point" Mobile Processors Announced with a Faster NPU

AMD today announced the new Ryzen 8040 mobile processor series codenamed "Hawk Point." These chips are shipping to notebook manufacturers now, and the first notebooks powered by these should be available to consumers in Q1-2024. At the heart of this processor is a significantly faster neural processing unit (NPU), designed to accelerate AI applications that will become relevant next year, as Microsoft prepares to launch Windows 12, and software vendors make greater use of generative AI in consumer applications.

The Ryzen 8040 "Hawk Point" processor is almost identical in design and features to the Ryzen 7040 "Phoenix," except for a faster Ryzen AI NPU. While this is based on the same first-generation XDNA architecture, its NPU performance has been increased to 16 TOPS, compared to 10 TOPS for the NPU on the "Phoenix" silicon. AMD is taking a whole-of-silicon approach to AI acceleration, which includes not just the NPU, but also the "Zen 4" CPU cores supporting the AI-relevant AVX-512 VNNI instruction set; and the iGPU based on the RDNA 3 graphics architecture, with each of its compute units featuring two AI accelerators that let the SIMD cores crunch matrix math. The whole-of-silicon performance figure for "Phoenix" is 33 TOPS, while "Hawk Point" boasts 39 TOPS. In benchmarks by AMD, "Hawk Point" is shown delivering a 40% improvement in vision models and Llama 2 over the Ryzen 7040 "Phoenix" series.
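The TOPS arithmetic above can be checked with a quick back-of-the-envelope decomposition, assuming (as the figures imply) that the CPU-plus-iGPU contribution is unchanged between the two chips:

```python
# Decomposing AMD's whole-of-silicon TOPS figures from the text.
# Assumption: CPU (AVX-512 VNNI) + iGPU (RDNA 3) contribution is the
# same on "Phoenix" and "Hawk Point"; only the NPU changed.

def whole_of_silicon_tops(npu_tops, cpu_igpu_tops):
    """Total AI throughput = NPU + CPU + iGPU TOPS."""
    return npu_tops + cpu_igpu_tops

# "Phoenix": 33 TOPS total with a 10 TOPS NPU -> 23 TOPS from CPU + iGPU.
cpu_igpu = 33 - 10
assert whole_of_silicon_tops(10, cpu_igpu) == 33  # Ryzen 7040 "Phoenix"
assert whole_of_silicon_tops(16, cpu_igpu) == 39  # Ryzen 8040 "Hawk Point"
print(cpu_igpu)  # 23
```

The numbers are self-consistent: the 6 TOPS NPU uplift fully accounts for the jump from 33 to 39 TOPS.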

GIGABYTE Unveils Next-gen HPC & AI Servers with AMD Instinct MI300 Series Accelerators

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers and IT infrastructure, today announced the GIGABYTE G383-R80 for the AMD Instinct MI300A APU and two GIGABYTE G593 series servers for the AMD Instinct MI300X GPU and AMD EPYC 9004 Series processors. As a testament to the performance of the AMD Instinct MI300 Series family of products, the El Capitan supercomputer at Lawrence Livermore National Laboratory uses the MI300A APU to power exascale computing, and these new GIGABYTE servers are an ideal platform to propel discoveries in HPC & AI at exascale.

Marrying CPU & GPU: G383-R80
For incredible advancements in HPC there is the GIGABYTE G383-R80, which houses four LGA6096 sockets for MI300A APUs. Each chip integrates a CPU with twenty-four AMD "Zen 4" cores and a powerful GPU built with AMD CDNA 3 GPU cores, while the chiplet design shares 128 GB of unified HBM3 memory for impressive performance on large AI models. The G383 server offers ample expansion for networking, storage, or other accelerators, with a total of twelve PCIe Gen 5 slots, and the front of the chassis holds eight 2.5" Gen 5 NVMe bays to handle heavy workloads such as real-time big data analytics and latency-sensitive workloads in finance and telecom.

Acer Commits to Carbon Neutrality for Vero Laptop Line

Acer today shared its commitment to carbon neutrality for its Aspire Vero laptop line, starting from the new Aspire Vero 16 (AV16-51P). Following international standards for carbon footprint calculation and carbon neutrality, actions are taken at each stage of the device lifecycle to minimize its carbon footprint, and then, high-quality carbon credits will be applied to attain carbon neutrality.

"To help tackle the increasing challenges posed by climate change, on the product side, Acer is proposing 'conscious technology' designed and made with consideration for the future," said Jerry Kao, COO, Acer Inc. "On the corporate side, Acer has joined the RE100 initiative and committed to achieving 100% renewable electricity by 2035. We have also pledged to achieve net-zero emissions by 2050."

Set Your Calendars: Windows 12 is Coming in June 2024 with Arm Support and AI Features

Microsoft is preparing a big update for its Windows operating system. Currently at version 11, the company is gearing up for the launch of Windows 12, which is expected to bring a monumental shift in the everyday PC user experience. Enhanced by AI, Windows 12 should integrate generative AI, large language models, GPT-style assistants, and other AI-powered tools such as photo editors. The confirmation that the Windows 12 launch is coming in 2024 is sourced from the Taiwanese Commercial Times, which analyzed comments from Barry Lam, the founder and chairman of PC contract manufacturer Quanta, and Junsheng (Jason) Chen, the chairman and chief executive of Acer.

Both of them underscored the importance of AI and that AI PCs are coming with the next version of Windows. Supposedly, the launch date for Windows 12 is set for June 2024. In that timeframe, hardware vendors should roll out their SoCs embedding AI processing elements in every silicon block. Qualcomm is set to debut its Snapdragon X Elite SoCs in mid-2024, aligning with the alleged release schedule of Windows 12. With more players like NVIDIA, AMD, and others planning to utilize an Arm instruction set for their next-generation PC chips, we expect to see Windows 12 get full-fledged support for the Arm ISA and treat it like a first-class citizen in the OS.

Intel "Emerald Rapids" Die Configuration Leaks, More Details Appear

Thanks to the leaked slides obtained by @InstLatX64, we have more details and some performance estimates for Intel's upcoming 5th Generation Xeon "Emerald Rapids" CPUs, which boast a significant performance leap over their predecessors. Leading the Emerald Rapids family is the top-end SKU, the Xeon 8592+, which features 64 cores and 128 threads, backed by a massive 480 MB L3 cache pool. The upcoming lineup shifts from a 4-tile to a 2-tile design to minimize latency and improve performance. The design utilizes the P-Core "Raptor Cove" architecture and promises up to 40% faster performance than the current 4th Generation "Sapphire Rapids" CPUs in AI applications utilizing the Intel AMX engine. Each tile has 35 cores, three of which are disabled, and each tile has two DDR5-5600 MT/s memory controllers, each operating two memory channels, which translates into an eight-channel design. There are three PCIe controllers per die, making six in total.
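The leaked topology figures are internally consistent, as a quick sanity check of the arithmetic shows:

```python
# Sanity-checking the leaked "Emerald Rapids" topology figures from the text.
TILES = 2                      # 2-tile design
CORES_PER_TILE = 35            # physical cores per tile
DISABLED_PER_TILE = 3          # cores fused off per tile
MEM_CTRL_PER_TILE = 2          # DDR5-5600 memory controllers per tile
CHANNELS_PER_CTRL = 2          # memory channels per controller
PCIE_CTRL_PER_TILE = 3         # PCIe controllers per die

active_cores = TILES * (CORES_PER_TILE - DISABLED_PER_TILE)
memory_channels = TILES * MEM_CTRL_PER_TILE * CHANNELS_PER_CTRL
pcie_controllers = TILES * PCIE_CTRL_PER_TILE

assert active_cores == 64      # matches the Xeon 8592+ core count
assert memory_channels == 8    # the eight-channel DDR5 design
assert pcie_controllers == 6   # three per die, two dies
```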

The upcoming lineup is also backed by newer protocols and AI accelerators. The Emerald Rapids family now supports Compute Express Link (CXL) Types 1/2/3 in addition to up to 80 PCIe Gen 5 lanes and an enhanced Intel Ultra Path Interconnect (UPI). There are four UPI controllers spread over the two dies. Moreover, features like the four on-die Intel Accelerator Engines, an optimized power mode, and up to 17% improvement in general-purpose workloads make it a big step up from the current generation. Much of this technology is found on existing Sapphire Rapids SKUs, with the new generation enhancing AI processing capability further. You can see the die configuration below. The 5th Generation Emerald Rapids lineup is set to become official on December 14, just a few days away.

NVIDIA Celebrates 500 Games & Apps with DLSS and RTX Technologies

NVIDIA today announced an important milestone: 500 games and apps now take advantage of NVIDIA RTX, the transformative set of gaming graphics technologies that, among many other things, mainstreamed real-time ray tracing in consumer gaming and debuted the most profound gaming technology of recent times—DLSS, which delivers performance uplifts through high-quality upscaling. The company reached this milestone over a 5-year period, with RTX seeing the light of day in August 2018. NVIDIA RTX is the combined feature set of real-time ray tracing, including NVIDIA-specific enhancements, and DLSS.

Although it started out as an upscaling-based performance enhancement that leverages AI, DLSS now encompasses a whole suite of technologies aimed at enhancing performance with minimal quality loss, in some cases even enhancing image quality over native resolution. This includes Super Resolution, the classic DLSS and DLSS 2 feature set; DLSS 3 Frame Generation, which nearly doubles frame rates by generating alternate frames entirely with AI, without involving the graphics rendering machinery; and DLSS 3.5 Ray Reconstruction, which aims to vastly improve the fidelity of ray-traced elements in upscaled scenarios.

Contract Prices Bottom Out in Q3, Reigniting Buyer Momentum and Boosting DRAM Revenue by Nearly 20%, Notes Report

TrendForce investigations reveal a significant leap in the DRAM industry for 3Q23, with total revenues soaring to US$13.48 billion—marking 18% QoQ growth. This surge is attributed to a gradual resurgence in demand, prompting buyers to re-energize their procurement activities. Looking ahead to Q4, while suppliers are firmly set on price hikes, with DRAM contract prices expected to rise by approximately 13-18%, demand recovery will not be as robust as in previous peak seasons. Overall, while there is demand for stockpiling, procurement for the server sector remains tentative due to high inventory levels, suggesting limited growth in DRAM industry shipments for Q4.

Three major manufacturers witnessed Q3 revenue growth. Samsung's revenue increased by about 15.9% to US$5.25 billion thanks to stable demand for high-capacity products fueled by AI advancements and the rollout of its 1alpha nm DDR5. SK hynix showcased the most notable growth among manufacturers with a 34.4% increase, reaching about US$4.626 billion and significantly narrowing its market share gap with Samsung to less than 5%. Micron's revenue rose by approximately 4.2% to US$3.075 billion—despite a slight drop in ASP—supported by an upswing in demand and shipment volumes.
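The "less than 5%" market-share gap follows directly from the reported revenues, as a quick check confirms (figures from the paragraph above, in US$ billions):

```python
# Checking the DRAM market-share claims against the reported 3Q23 revenues.
total = 13.48                       # total DRAM industry revenue, US$ billion
samsung, sk_hynix, micron = 5.25, 4.626, 3.075

samsung_share = samsung / total * 100    # ~38.9%
hynix_share = sk_hynix / total * 100     # ~34.3%
gap = samsung_share - hynix_share

# "significantly narrowing its market share gap with Samsung to less than 5%"
assert gap < 5
print(round(gap, 1))  # ~4.6 percentage points
```

The three suppliers together account for about US$12.95 billion, roughly 96% of the reported industry total, with the remainder from smaller vendors.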

Ethernet Switch Chips are Now Infected with AI: Broadcom Announces Trident 5-X12

Artificial intelligence has been a hot topic this year, and everything is now an AI processor, from CPUs to GPUs, NPUs, and many others. However, it was only a matter of time before we saw an integration of AI processing elements into the networking chips. Today, Broadcom announced its new Ethernet switching silicon called Trident 5-X12. The Trident 5-X12 delivers 16 Tb/s of bandwidth, double that of the previous Trident generation while adding support for fast 800G ports for connection to Tomahawk 5 spine switch chips. The 5-X12 is software-upgradable and optimized for dense 1RU top-of-rack designs, enabling configurations with up to 48x200G downstream server ports and 8x800G upstream fabric ports. The 800G support is added using 100G-PAM4 SerDes, which enables up to 4 m DAC and linear optics.
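The quoted port configuration accounts exactly for the chip's 16 Tb/s of switching bandwidth:

```python
# Verifying the Trident 5-X12 port math from the text (figures in Gb/s).
downstream = 48 * 200   # 48x 200G server-facing (downstream) ports
upstream = 8 * 800      # 8x 800G fabric-facing (upstream) ports

total_gbps = downstream + upstream
assert total_gbps == 16_000  # 16 Tb/s, double the previous Trident generation
print(total_gbps)  # 16000
```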

This is more than a standalone switch chip, however. Broadcom has added AI processing elements in an inference engine called NetGNT (Networking General-purpose Neural-network Traffic-analyzer), which can detect common traffic patterns and optimize data movement across the chip. The company cites AI/ML workloads as an example: NetGNT performs intelligent traffic analysis to avoid network congestion. For instance, it can detect so-called "incast" patterns in real time, where many flows converge simultaneously on the same port. By recognizing the start of incast early, NetGNT can invoke hardware-based congestion-control techniques to prevent performance degradation without added latency.

Canalys Forecast: Global PC Market Set for 8% Growth in 2024

According to the latest Canalys forecasts, worldwide PC shipments are on the verge of recovery following seven consecutive quarters of decline. The market is expected to return to growth of 5% in Q4 2023, boosted by a strong holiday season and an improving macroeconomic environment. Looking ahead, full-year 2024 shipments are forecast to hit 267 million units, landing 8% higher than in 2023, helped by tailwinds including the Windows refresh cycle and emergence of AI-capable and Arm-based devices.

"The global PC market is on a recovery path and set to return to 2019 shipment levels by next year," said Canalys Analyst Ben Yeh. "The impact of AI on the PC industry will be profound, with leading players across OEMs, processor manufacturers, and operating system providers focused on delivering new AI-capable models in 2024. These initiatives will bolster refresh demand, particularly in the commercial sector. The total shipment share of AI-capable PCs is expected to be about 19% in 2024. This accounts for all M-series Mac products alongside the nascent offerings expected in the Windows ecosystem. However, as more compelling use-cases emerge and AI functionality becomes an expected feature, Canalys anticipates a fast ramp up in the development and adoption of AI-capable PCs."

NVIDIA Readies GeForce RTX 4090 D for China to Comply with U.S. Export Controls

NVIDIA is giving final touches to the new GeForce RTX 4090 D, a graphics card SKU specific to the Chinese market, aimed squarely at gamers. The card fills the void for gamers shopping in the enthusiast segment, as all inventory of the regular RTX 4090 has been bought up by Chinese companies to accelerate AI, and controls are in place that prevent NVIDIA from selling the card in its current form in the Chinese market.

What sets this SKU apart is that it is designed to comply with U.S. export controls on GPUs that have dual use as high compute-density AI accelerators. In other words, its performance with AI will be artificially limited. This is being done by lowering the card's TPP (total processing performance), which could mean that it ends up slower than the regular RTX 4090. This is somewhat similar in concept to the LHR (lite hash rate) GPUs NVIDIA designed for gamers as their regular GPUs were being snapped up by cryptocurrency miners, although LHR wasn't created due to government policy, but in response to market demand. The RTX 4090 D is expected to retail for RMB 13,000, which is similar to the baseline price of the RTX 4090.
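For readers curious how a card "complies by lowering TPP": coverage of the export rules describes TPP as roughly the peak tera-operations per second multiplied by the bit length of the operation, with a regulatory ceiling around 4,800. A minimal sketch, using illustrative throughput numbers that are assumptions rather than official NVIDIA or Commerce Department figures:

```python
# Hedged sketch of the TPP (total processing performance) export metric:
# TPP ~= peak tera-operations per second x bit length of the operation.
# The throughput figure below is an illustrative assumption, not an
# official specification.

def tpp(tera_ops_per_sec, bit_length):
    return tera_ops_per_sec * bit_length

EXPORT_THRESHOLD = 4800            # TPP ceiling widely cited in rule coverage

# Assumed dense FP8 throughput for an RTX 4090-class GPU (~660 TOPS):
example = tpp(660, 8)              # -> 5280
print(example > EXPORT_THRESHOLD)  # True: would need a lowered TPP to comply
```

Under these assumed numbers, trimming compute (e.g. disabling CUDA cores) pulls the product below the threshold while leaving gaming performance largely intact.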

Dell Partners with Imbue on New AI Compute Cluster Using Nearly 10,000 NVIDIA H100 GPUs

Dell Technologies and Imbue, an independent AI research company, have entered into a $150 million agreement to build a new high-performance computing cluster for training foundation models optimized for reasoning. Imbue is one of the few independent AI labs that develops its own foundation models, and trains them to have more advanced reasoning capabilities—like knowing when to ask for more information, analyzing and critiquing their own outputs, or breaking down a difficult goal into a plan and then executing on it. Imbue trains AI agents on top of those models that can do work for people across diverse fields in ways that are robust, safe, and useful. Imbue's goal is to create practical tools for building agents that could enable workers across a broad set of domains, including helping engineers write new code, analysts understand and draft complex policy proposals, and much more.

AWS Unveils Next Generation AWS-Designed Graviton4 and Trainium2 Chips

At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), today announced the next generation of two AWS-designed chip families—AWS Graviton4 and AWS Trainium2—delivering advancements in price performance and energy efficiency for a broad range of customer workloads, including machine learning (ML) training and generative artificial intelligence (AI) applications. Graviton4 and Trainium2 mark the latest innovations in chip design from AWS. With each successive generation of chip, AWS delivers better price performance and energy efficiency, giving customers even more options—in addition to chip/instance combinations featuring the latest chips from third parties like AMD, Intel, and NVIDIA—to run virtually any application or workload on Amazon Elastic Compute Cloud (Amazon EC2).

Manufacturers Anticipate Completion of NVIDIA's HBM3e Verification by 1Q24; HBM4 Expected to Launch in 2026

TrendForce's latest research into the HBM market indicates that NVIDIA plans to diversify its HBM suppliers for more robust and efficient supply chain management. Samsung's HBM3 (24 GB) is anticipated to complete verification with NVIDIA by December this year. The progress of HBM3e, as outlined in the timeline below, shows that Micron provided its 8hi (24 GB) samples to NVIDIA by the end of July, SK hynix in mid-August, and Samsung in early October.

Given the intricacy of the HBM verification process—estimated to take two quarters—TrendForce expects that some manufacturers might learn preliminary HBM3e results by the end of 2023. However, it's generally anticipated that major manufacturers will have definite results by 1Q24. Notably, the outcomes will influence NVIDIA's procurement decisions for 2024, as final evaluations are still underway.

Special Chinese Factories are Dismantling NVIDIA GeForce RTX 4090 Graphics Cards and Turning Them into AI-Friendly GPU Shape

The recent U.S. government restrictions on AI hardware exports to China have significantly impacted several key semiconductor players, including NVIDIA, AMD, and Intel, restricting them from selling high-performance AI chips into China. This ban has notably affected NVIDIA's GeForce RTX 4090 gaming GPUs, pushing them out of mainland China due to their high computational capabilities. In anticipation of these restrictions, NVIDIA reportedly moved a substantial inventory of its AD102 GPUs and GeForce RTX 4090 graphics cards to China, which we reported earlier. This could have contributed to the global RTX 4090 shortage, driving the prices of these cards up to 2000 USD. In an interesting turn of events, insiders from the Chinese Baidu forums have disclosed that specialized factories across China are repurposing these GPUs, which arrived before the ban, into AI solutions.

This transformation involves disassembling the gaming GPUs, removing the cooling systems and extracting the AD102 GPU and GDDR6X memory from the main PCBs. These components are then re-soldered onto a domestically manufactured "reference" PCB, better suited for AI applications, and equipped with dual-slot blower-style coolers designed for server environments. The third-party coolers that these GPUs come with are 3-4 slots in size, whereas the blower-style cooler is only two slots wide, and many of them can be placed in parallel in an AI server. After rigorous testing, these reconfigured RTX 4090 AI solutions are supplied to Chinese companies running AI workloads. This adaptation process has resulted in an influx of RTX 4090 coolers and bare PCBs into the Chinese reseller market at markedly low prices, given that the primary GPU and memory components have been removed.
Below, you can see the dismantling of AIB GPUs before getting turned into blower-style AI server-friendly graphics cards.

NVIDIA Experiences Strong Cloud AI Demand but Faces Challenges in China, with High-End AI Server Shipments Expected to Be Below 4% in 2024

NVIDIA's most recent FY3Q24 financial reports reveal record-high revenue from its data center segment, driven by escalating demand for AI servers from major North American CSPs. However, TrendForce points out that recent US government sanctions targeting China have impacted NVIDIA's business in the region. Despite strong shipments of NVIDIA's high-end GPUs—and the rapid introduction of compliant products such as the H20, L20, and L2—Chinese cloud operators are still in the testing phase, making substantial revenue contributions to NVIDIA unlikely in Q4. Gradual shipment increases are expected from the first quarter of 2024.

The US ban continues to influence China's foundry market as Chinese CSPs' high-end AI server shipments potentially drop below 4% next year
TrendForce reports that North American CSPs like Microsoft, Google, and AWS will remain key drivers of high-end AI servers (including those with NVIDIA, AMD, or other high-end ASIC chips) from 2023 to 2024. Their estimated shipment shares for 2024 are 24%, 18.6%, and 16.3%, respectively. Chinese CSPs such as ByteDance, Baidu, Alibaba, and Tencent (BBAT) are projected to have a combined shipment share of approximately 6.3% in 2023. However, this could decrease to less than 4% in 2024, considering the current and potential future impacts of the ban.