News Posts matching #A800


Chinese Research Institute Utilizing "Banned" NVIDIA H100 AI GPUs

NVIDIA's freshly unveiled "Blackwell" B200 and GB200 AI GPUs will be getting plenty of coverage this year, but many organizations will be sticking with current or prior generation hardware. Team Green is in the process of shipping out compromised "Hopper" designs to customers in China, but the region's appetite for powerful AI-crunching hardware is growing. Last year's China-specific H800 design and the older "Ampere" A800 chip were deemed too potent, and new regulations prevented further sales. Recently, AMD's Instinct MI309 AI accelerator was considered "too powerful to gain unconditional approval from the US Department of Commerce." Domestically developed solutions are catching up with Western designs, but some institutions are not prepared to queue up for emerging technologies.

NVIDIA's new H20 AI GPU and the Ada Lovelace-based L20 PCIe and L2 PCIe models are weakened enough to get a thumbs up from trade regulators, but likely not compelling enough for discerning clients. The Telegraph believes that NVIDIA's uncompromised H100 AI GPU is currently in use at several Chinese establishments; the report cites information presented within four academic papers published on ArXiv, an open access science website. The Telegraph's news piece highlights one of the studies, which was "co-authored by a researcher at 4paradigm, an AI company that was last year placed on an export control list by the US Commerce Department for attempting to acquire US technology to support China's military." Additionally, the Chinese Academy of Sciences appears to have conducted several AI-accelerated experiments, involving the solving of complex mathematical and logical problems. The article suggests that this research organization has acquired a very small batch of NVIDIA H100 GPUs (up to eight units). A "thriving black market" for high-end NVIDIA processors has emerged in the region; last autumn, the Center for a New American Security (CNAS) published an in-depth article about ongoing smuggling activities.

AMD Stalls on Instinct MI309 China AI Chip Launch Amid US Export Hurdles

According to the latest report from Bloomberg, AMD has hit a roadblock in offering its top-of-the-line AI accelerator in the Chinese market. The newest AI chip is called Instinct MI309, a lower-performance Instinct MI300 variant tailored to meet the latest US export rules for selling advanced chips to China-based entities. However, the Instinct MI309 still appears too powerful to gain unconditional approval from the US Department of Commerce, leaving AMD in need of an export license. Originally, the US Department of Commerce set a rule that a chip's Total Processing Performance (TPP) score should not exceed 4800, effectively capping AI performance at 600 FP8 TFLOPS. Under this rule, processors with slightly lower performance may still be sold to Chinese customers, provided their performance density (PD) is sufficiently low.
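The 4800-point cap and the 600 FP8 TFLOPS figure are related by simple arithmetic. The sketch below assumes that TPP is computed as peak dense throughput in TFLOPS multiplied by the operand bit width, an assumption that is consistent with the numbers quoted above. The function names are illustrative and do not come from any official tool, and the FP16 case is included only to show how the same cap would apply at another precision.

```python
TPP_CAP = 4800  # threshold quoted in the article


def tpp_score(tflops: float, bit_width: int) -> float:
    """Assumed definition: peak dense TFLOPS times operand bit width."""
    return tflops * bit_width


def max_tflops_under_cap(bit_width: int, cap: float = TPP_CAP) -> float:
    """Highest throughput at a given precision that stays at or under the cap."""
    return cap / bit_width


if __name__ == "__main__":
    for name, bits in [("FP8", 8), ("FP16", 16)]:
        print(f"{name}: {max_tflops_under_cap(bits):.0f} TFLOPS at the cap")
```

Under this assumed definition, the same TPP cap allows half as many TFLOPS each time the operand width doubles.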

However, AMD's latest creation, Instinct MI309, is anything but slow. Because the chip is based on the powerful Instinct MI300, AMD has not managed to bring its performance down to a level the Department of Commerce would accept for export. It is still unknown which Chinese customer was trying to acquire AMD's Instinct MI309; however, it could be one of the Chinese AI labs trying to get hold of more training hardware for their domestic models. NVIDIA employed a similar tactic, selling A800 and H800 chips to China, until the US also ended the export of these chips to China. AI labs located in China can only use domestic hardware, including accelerators from Alibaba, Huawei, and Baidu. Cloud services hosting GPUs in the US can still be accessed by Chinese companies, but that practice is currently on US regulators' watchlist.

NVIDIA Readying H20 AI GPU for Chinese Market

NVIDIA's H800 AI GPU was rolled out last year to appease the Sanction Gods, but later on, the US Government deemed the cut-down "Hopper" part to be far too potent for Team Green's Chinese enterprise customers. Last October, newly amended export conditions banned sales of the H800, as well as the slightly older (and similarly gimped) A800 "Ampere" GPU, in the region. NVIDIA's engineering team returned to the drawing board and developed a new range of compliantly weakened products. An exclusive Reuters report suggests that Team Green is taking pre-orders for a refreshed "Hopper" GPU; the latest China-specific flagship is called "HGX H20." NVIDIA's web presences have not been updated with this new model, nor with the Ada Lovelace-based L20 PCIe and L2 PCIe GPUs. Huawei's competing Ascend 910B is said to be slightly more performant than the H20 in "some areas," according to insiders within the distribution network.

The leakers reckon that NVIDIA's mainland distributors will be selling H20 models within a price range of $12,000 to $15,000, while Huawei's locally developed Ascend 910B is priced at 120,000 RMB (~$16,900). One Reuters source stated that: "some distributors have started advertising the (NVIDIA H20) chips with a significant markup to the lower end of that range at about 110,000 yuan ($15,320)." The report suggests that NVIDIA refused to comment on this situation. Another insider claimed that: "distributors are offering H20 servers, which are pre-configured with eight of the AI chips, for 1.4 million yuan. By comparison, servers that used eight of the H800 chips were sold at around 2 million yuan when they were launched a year ago." Small batches of H20 products are expected to reach important clients within the first quarter of 2024, followed by a wider release in Q2. It is believed that mass production will begin around springtime.
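To put the two server quotes on a comparable footing, the short sketch below divides each quoted server price by its eight GPUs. The resulting per-slot figure is only a rough comparison aid, since it folds in the chassis, CPUs, and other components; the variable and function names are illustrative.

```python
GPUS_PER_SERVER = 8  # both server configurations quoted above use eight chips

server_price_yuan = {
    "H20": 1_400_000,   # quoted distributor price
    "H800": 2_000_000,  # quoted launch price a year earlier
}


def price_per_gpu_slot(server_price: float, gpus: int = GPUS_PER_SERVER) -> float:
    """Server price divided by GPU count; includes chassis, CPUs, and so on."""
    return server_price / gpus


per_slot = {name: price_per_gpu_slot(p) for name, p in server_price_yuan.items()}
h20_vs_h800 = server_price_yuan["H20"] / server_price_yuan["H800"]

if __name__ == "__main__":
    for name, value in per_slot.items():
        print(f"{name}: {value:,.0f} yuan per GPU slot")
    print(f"H20 server costs {h20_vs_h800:.0%} of the H800 server price")
```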

China Continues to Enhance AI Chip Self-Sufficiency, but High-End AI Chip Development Remains Constrained

Huawei's subsidiary HiSilicon has made significant strides in the independent R&D of AI chips, launching the next-gen Ascend 910B. These chips are utilized not only in Huawei's public cloud infrastructure but also sold to other Chinese companies. This year, Baidu ordered over a thousand Ascend 910B chips from Huawei to build approximately 200 AI servers. Additionally, in August, Chinese company iFlytek, in partnership with Huawei, released the "Gemini Star Program," a hardware and software integrated device for exclusive enterprise LLMs, equipped with the Ascend 910B AI acceleration chip, according to TrendForce's research.

TrendForce conjectures that the next-generation Ascend 910B chip is likely manufactured using SMIC's N+2 process. However, the production faces two potential risks. Firstly, as Huawei recently focused on expanding its smartphone business, the N+2 process capacity at SMIC is almost entirely allocated to Huawei's smartphone products, potentially limiting future capacity for AI chips. Secondly, SMIC remains on the Entity List, possibly restricting access to advanced process equipment.

Manufacturers Anticipate Completion of NVIDIA's HBM3e Verification by 1Q24; HBM4 Expected to Launch in 2026

TrendForce's latest research into the HBM market indicates that NVIDIA plans to diversify its HBM suppliers for more robust and efficient supply chain management. Samsung's HBM3 (24 GB) is anticipated to complete verification with NVIDIA by December this year. The progress of HBM3e, as outlined in the timeline below, shows that Micron provided its 8hi (24 GB) samples to NVIDIA by the end of July, SK hynix in mid-August, and Samsung in early October.

Given the intricacy of the HBM verification process—estimated to take two quarters—TrendForce expects that some manufacturers might learn preliminary HBM3e results by the end of 2023. However, it's generally anticipated that major manufacturers will have definite results by 1Q24. Notably, the outcomes will influence NVIDIA's procurement decisions for 2024, as final evaluations are still underway.

NVIDIA Experiences Strong Cloud AI Demand but Faces Challenges in China, with High-End AI Server Shipments Expected to Be Below 4% in 2024

NVIDIA's most recent FY3Q24 financial reports reveal record-high revenue coming from its data center segment, driven by escalating demand for AI servers from major North American CSPs. However, TrendForce points out that recent US government sanctions targeting China have impacted NVIDIA's business in the region. Despite strong shipments of NVIDIA's high-end GPUs and the rapid introduction of compliant products such as the H20, L20, and L2, Chinese cloud operators are still in the testing phase, making substantial revenue contributions to NVIDIA unlikely in Q4. Gradual shipment increases are expected from the first quarter of 2024.

The US ban continues to influence China's foundry market as Chinese CSPs' high-end AI server shipments potentially drop below 4% next year
TrendForce reports that North American CSPs like Microsoft, Google, and AWS will remain key drivers of high-end AI servers (including those with NVIDIA, AMD, or other high-end ASIC chips) from 2023 to 2024. Their shipment shares are estimated at 24%, 18.6%, and 16.3%, respectively, for 2024. Chinese CSPs such as ByteDance, Baidu, Alibaba, and Tencent (BBAT) are projected to have a combined shipment share of approximately 6.3% in 2023. However, this could decrease to less than 4% in 2024, considering the current and potential future impacts of the ban.

New AI Accelerator Chips Boost HBM3 and HBM3e to Dominate 2024 Market

TrendForce reports that the HBM (High Bandwidth Memory) market's dominant product for 2023 is HBM2e, employed by the NVIDIA A100/A800, AMD MI200, and most CSPs' (Cloud Service Providers) self-developed accelerator chips. As the demand for AI accelerator chips evolves, manufacturers plan to introduce new HBM3e products in 2024, with HBM3 and HBM3e expected to become mainstream in the market next year.

The distinctions between HBM generations primarily lie in their speed. The industry experienced a proliferation of confusing names when transitioning to the HBM3 generation. TrendForce clarifies that the so-called HBM3 in the current market should be subdivided into two categories based on speed. One category includes HBM3 running at speeds between 5.6 and 6.4 Gbps, while the other features the 8 Gbps HBM3e, which also goes by several names including HBM3P, HBM3A, HBM3+, and HBM3 Gen2.
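The per-pin speeds above can be turned into a rough per-stack bandwidth figure. The sketch below assumes a 1024-bit interface per HBM stack, which is not stated in the article and is supplied here only for illustration; the function name is likewise hypothetical.

```python
BUS_WIDTH_BITS = 1024  # assumed per-stack interface width (not stated in the article)


def stack_bandwidth_gbytes(pin_speed_gbps: float,
                           bus_width_bits: int = BUS_WIDTH_BITS) -> float:
    """Peak per-stack bandwidth in GB/s: pin speed (Gbit/s) x pins / 8 bits per byte."""
    return pin_speed_gbps * bus_width_bits / 8


if __name__ == "__main__":
    for label, speed in [("HBM3 (low)", 5.6), ("HBM3 (high)", 6.4), ("HBM3e", 8.0)]:
        print(f"{label}: {stack_bandwidth_gbytes(speed):.1f} GB/s per stack")
```

Under that assumption, the step from 6.4 Gbps HBM3 to 8 Gbps HBM3e is a 25% increase in peak bandwidth per stack.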

Report Suggests NVIDIA Prioritizing H800 GPU Production For Chinese AI Market

NVIDIA could be adjusting its enterprise-grade GPU production strategies for the Chinese market, according to an article published by MyDrivers. Despite major sanctions placed on semiconductor imports, Team Green is doing plenty of business with tech firms operating in the region thanks to an uptick in AI-related activities. NVIDIA offers two market-specific accelerator models that have been cut down to conform to rules and regulations: the more powerful and expensive (250K RMB/~$35K) H800 is an adaptation of the western H100 GPU, while the A800 is a legal market alternative to the older A100.

The report proposes that NVIDIA is considering plans to reduce factory output of the A800 (sold for 100K RMB/~$14K per unit), so clients will be semi-forced into purchasing the higher-end H800 model instead (if they require a significant number of GPUs). The A800 seems to be the more popular choice for the majority of companies at the moment, with the heavy hitters—Alibaba, Baidu, Tencent, Jitwei and ByteDance—flexing their spending muscles and splurging on mixed shipments of the two accelerators. By limiting supplies of the lesser A800, Team Green could be generating more profit by prioritizing the more expensive (and readily available) model.

Intel Brings Gaudi2 Accelerator to China, to Fill Gap Created By NVIDIA Export Limitations

Intel has responded to the high demand for advanced chips in mainland China by bringing its processor, the Gaudi2, to the market. This move comes as the country grapples with US export restrictions, leading to a thriving market for smuggled NVIDIA GPUs. At a press conference in Beijing, Intel presented the Gaudi2 processor as an alternative to NVIDIA's A100 GPU, widely used for training AI systems. Despite US export controls, Intel recognizes the importance of the Chinese market, with 27 percent of its 2022 revenue generated from China. NVIDIA has also tried to comply with restrictions by offering modified versions of its GPUs, but limited supplies have driven the demand for smuggled GPUs. Intel's Gaudi2 aims to provide Chinese companies with various hardware options and bolster their ability to deploy AI through cloud and smart-edge technologies. By partnering with Inspur Group, a major AI server manufacturer, Intel plans to build Gaudi2-powered machines tailored explicitly for the Chinese market.

China's AI ambitions face potential challenges as the US government considers restricting Chinese companies' access to American cloud computing services. This move could prevent major providers like Amazon Web Services and Microsoft from offering advanced AI chips to their Chinese clients. Additionally, there are reports of a potential expansion of the US export ban to include NVIDIA's A800 GPU. As China continues to push forward with its AI development projects, Intel's introduction of the Gaudi2 processor helps meet the country's demand for advanced chips. Balancing export controls and technological requirements within this complex trade landscape remains a crucial task for both companies and governments involved in the Chinese AI industry.

NVIDIA A800 China-Tailored GPU Performance within 70% of A100

The recent growth in demand for training Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) has prompted many companies to invest in the GPU solutions used to train these models. However, countries like China have struggled with US sanctions, and NVIDIA has had to create custom models that meet US export regulations. Its two China-specific GPUs, the H800 and A800, are cut-down versions of the original H100 and A100, respectively. We reported on the H800; however, it remained as mysterious as the A800 that we are talking about today. Thanks to MyDrivers, we have information that the A800 GPU's performance is roughly 70% of the regular A100's.

The regular A100 GPU manages 9.7 TeraFLOPS of FP64, 19.5 TeraFLOPS of FP64 Tensor, and up to 624 BF16/FP16 TeraFLOPS with sparsity. Rough napkin math suggests that 70% of the original's performance (a 30% cut) would equal 6.8 TeraFLOPS of FP64 precision, 13.7 TeraFLOPS of FP64 Tensor, and 437 BF16/FP16 TeraFLOPS with sparsity. MyDrivers notes that the A800 can be had for 100,000 Yuan, translating to about 14,462 USD at the time of writing. This is not the most capable GPU that Chinese companies can acquire, as the H800 exists. However, we don't have any information about its performance for now.
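The napkin math above can be reproduced in a few lines. The sketch below simply scales the quoted A100 figures by a uniform 70%, which is the article's own simplifying assumption and not a measured A800 specification; the names used are illustrative.

```python
# A100 peak throughput figures quoted above, in TeraFLOPS
A100_TFLOPS = {
    "FP64": 9.7,
    "FP64 Tensor": 19.5,
    "BF16/FP16 (sparsity)": 624.0,
}


def scale_specs(specs: dict[str, float], fraction: float) -> dict[str, float]:
    """Apply one uniform scaling factor to every throughput figure."""
    return {name: value * fraction for name, value in specs.items()}


# Uniform 70% scaling is an estimate, not a published A800 specification
a800_estimate = scale_specs(A100_TFLOPS, 0.70)

if __name__ == "__main__":
    for name, value in a800_estimate.items():
        print(f"{name}: about {value:.1f} TeraFLOPS")
```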

NVIDIA Prepares H800 Adaptation of H100 GPU for the Chinese Market

NVIDIA's H100 accelerator is one of the most powerful solutions for powering AI workloads. And, of course, every company and government wants to use it to power its AI workload. However, in countries like China, shipment of US-made goods is challenging. With export regulations in place, NVIDIA had to get creative and make a specific version of its H100 GPU for the Chinese market, labeled the H800 model. Late last year, NVIDIA also created a China-specific version of the A100 model called A800, whose only difference is a chip-to-chip interconnect bandwidth reduced from 600 GB/s to 400 GB/s.

This year's H800 SKU also features similar restrictions, and the company appears to have made similar sacrifices for shipping its chips to China. From the 600 GB/s bandwidth of the regular H100 PCIe model, the H800 is gutted to only 300 GB/s of bi-directional chip-to-chip interconnect bandwidth. While we have no data on whether the CUDA or Tensor core count has been adjusted, the sacrifice of bandwidth to comply with export regulations will have consequences. Because communication between chips is slower, training large models will incur higher latency and run more slowly than on the regular H100 chip. This is due to the massive amount of data that needs to travel from one chip to another. According to Reuters, an NVIDIA spokesperson declined to discuss other differences, stating that "our 800 series products are fully compliant with export control regulations."
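A back-of-the-envelope model shows why the bandwidth cut matters. The sketch below estimates the time to move a payload between two chips as payload size divided by link bandwidth, using the 600 GB/s and 300 GB/s figures quoted above. It is an idealized estimate that ignores link latency, topology, and overlap with compute, and the 10 GB payload is a hypothetical value chosen only for illustration.

```python
# Chip-to-chip interconnect bandwidth quoted in the article, in GB/s
LINK_BANDWIDTH_GB_S = {"H100 PCIe": 600.0, "H800": 300.0}

PAYLOAD_GB = 10.0  # hypothetical amount of data exchanged per training step


def transfer_time_ms(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized transfer time in milliseconds: size / bandwidth."""
    return payload_gb / bandwidth_gb_per_s * 1000.0


times_ms = {
    name: transfer_time_ms(PAYLOAD_GB, bw) for name, bw in LINK_BANDWIDTH_GB_S.items()
}
slowdown = times_ms["H800"] / times_ms["H100 PCIe"]

if __name__ == "__main__":
    for name, t in times_ms.items():
        print(f"{name}: {t:.1f} ms to move {PAYLOAD_GB:.0f} GB")
    print(f"H800 transfer takes {slowdown:.1f}x as long")
```

In this simple model, halving the link bandwidth doubles the transfer time regardless of payload size; how much that slows a real training run depends on how much of each step is spent communicating.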
May 1st, 2024 08:00 EDT
