News Posts matching #Baidu


Huawei Starts Shipping "Ascend 910C" AI Accelerator Samples to Large NVIDIA Customers

Huawei has reportedly started shipping its Ascend 910C accelerator—the company's domestic alternative to NVIDIA's H100 accelerator for AI training and inference. As the report from the South China Morning Post notes, Huawei is shipping samples of the accelerator to large NVIDIA customers, including companies like Alibaba, Baidu, and Tencent, which have ordered massive quantities of NVIDIA accelerators. Huawei is reportedly on track to deliver 70,000 chips, potentially worth $2 billion. With NVIDIA working on a B20 accelerator SKU that complies with US government export regulations, some analysts expect the Huawei Ascend 910C to outperform NVIDIA's B20 processor.

If the Ascend 910C receives positive results from Chinese tech giants, it could mark the start of Huawei's expansion into data center accelerators, an effort once hindered by the company's inability to manufacture advanced chips. Now, with foundries like SMIC producing 7 nm designs and possibly 5 nm coming soon, Huawei can leverage this capacity to satisfy the domestic demand for more AI processing power. Competing on a global scale, though, remains a challenge: companies like NVIDIA, AMD, and Intel have access to more advanced nodes, which gives their AI accelerators an edge in efficiency and performance.

Global AI Server Demand Surge Expected to Drive 2024 Market Value to US$187 Billion; Represents 65% of Server Market

TrendForce's latest industry report on AI servers reveals that high demand for advanced AI servers from major CSPs and brand clients is expected to continue in 2024. Meanwhile, TSMC, SK hynix, Samsung, and Micron's gradual production expansion has significantly eased shortages in 2Q24. Consequently, the lead time for NVIDIA's flagship H100 solution has decreased from the previous 40-50 weeks to less than 16 weeks.

TrendForce estimates that AI server shipments in the second quarter will increase by nearly 20% QoQ, and has revised the annual shipment forecast up to 1.67 million units—marking a 41.5% YoY growth.
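A quick back-of-the-envelope check on those figures (a sketch using only the numbers quoted above; the variable names are ours):

```python
forecast_2024 = 1.67   # million units, TrendForce's revised annual forecast
yoy_growth = 0.415     # 41.5% year-over-year growth

# The implied prior-year shipment base is roughly 1.18 million units
implied_2023 = forecast_2024 / (1 + yoy_growth)
print(f"Implied 2023 shipments: {implied_2023:.2f} million units")
```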

Qualcomm Snapdragon X Elite and X Plus SKU Lineup Leaks Out

A Qualcomm Snapdragon X Elite "X1E80100" processor model was leaked in late February—it is likely that several SKUs have been distributed for evaluation purposes. Geekbench Browser is normally a good source of pre-release information—a benched Lenovo "83ED" laptop was spotted last week. That entry outed a "Snapdragon X Elite-X1E78100" processor, sporting twelve cores with maximum frequencies of 3.42 GHz. The latest exposures arrive courtesy of a Baidu forum post. Qualcomm has publicly revealed its "X Elite" range of Nuvia-designed Oryon core CPUs, but insiders have uncovered an additional "X Plus" family—probably a series of less expensive/lower spec alternatives.

The leaked list of SKUs does not include any detailed information—it reconfirms the existence of Qualcomm's top-tier X1E80100 and X1E78100 models and the presence of Adreno iGPUs. Driver information points to Qualcomm's next-gen integrated graphics solutions being readied for modern APIs: DX11, DX12, and OpenGL. The firm's ARM-based mobile PC CPUs are expected to launch within a mid-2024 period, according to the company's official statements—insiders believe that the NPU-enhanced Snapdragon X processors are destined to debut within next-gen "Windows 12" AI-centric notebooks.

AMD Stalls on Instinct MI309 China AI Chip Launch Amid US Export Hurdles

According to the latest report from Bloomberg, AMD has hit a roadblock in offering its top-of-the-line AI accelerator in the Chinese market. The new AI chip is called Instinct MI309, a lower-performance Instinct MI300 variant tailored to meet the latest US export rules for selling advanced chips to China-based entities. However, the Instinct MI309 still appears too powerful to gain unconditional approval from the US Department of Commerce, leaving AMD in need of an export license. The US Department of Commerce's rule caps the Total Processing Performance (TPP) score at 4,800, effectively limiting AI performance to 600 FP8 TFLOPS. Processors below this threshold may still be sold to Chinese customers, provided their performance density (PD) is sufficiently low.
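The arithmetic behind that cap can be sanity-checked in a few lines. This is a simplified sketch of TPP as commonly described (peak tera-operations per second multiplied by operand bit width); the function name is our own illustration, not an official definition:

```python
def tpp(tops: float, bit_width: int) -> float:
    """Total Processing Performance (simplified): peak tera-operations
    per second multiplied by the operand bit width."""
    return tops * bit_width

# A 4,800 TPP cap corresponds to 600 TFLOPS at FP8 (8-bit operands):
assert tpp(600, 8) == 4800
# The same cap would permit only 300 TFLOPS at FP16 (16-bit operands):
assert tpp(300, 16) == 4800
```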

However, AMD's latest creation, the Instinct MI309, is anything but slow. Based on the powerful Instinct MI300, it has not been cut down enough to qualify for a US export license from the Department of Commerce. It is still unknown which Chinese customer was trying to acquire AMD's Instinct MI309; it could be one of the Chinese AI labs seeking more training hardware for their domestic models. NVIDIA employed a similar tactic, selling A800 and H800 chips to China, until the US ended exports of those chips as well. AI labs located in China can otherwise only use domestic hardware, including accelerators from Alibaba, Huawei, and Baidu. Cloud services hosting GPUs in the US can still be accessed by Chinese companies, but that practice is currently on US regulators' watchlist.

China Continues to Enhance AI Chip Self-Sufficiency, but High-End AI Chip Development Remains Constrained

Huawei's subsidiary HiSilicon has made significant strides in the independent R&D of AI chips, launching the next-gen Ascend 910B. These chips are utilized not only in Huawei's public cloud infrastructure but also sold to other Chinese companies. This year, Baidu ordered over a thousand Ascend 910B chips from Huawei to build approximately 200 AI servers. Additionally, in August, Chinese company iFlytek, in partnership with Huawei, released the "Gemini Star Program," a hardware and software integrated device for exclusive enterprise LLMs, equipped with the Ascend 910B AI acceleration chip, according to TrendForce's research.

TrendForce conjectures that the next-generation Ascend 910B chip is likely manufactured using SMIC's N+2 process. However, the production faces two potential risks. Firstly, as Huawei recently focused on expanding its smartphone business, the N+2 process capacity at SMIC is almost entirely allocated to Huawei's smartphone products, potentially limiting future capacity for AI chips. Secondly, SMIC remains on the Entity List, possibly restricting access to advanced process equipment.

Special Chinese Factories are Dismantling NVIDIA GeForce RTX 4090 Graphics Cards and Turning Them into AI-Friendly GPU Shape

The recent U.S. government restrictions on AI hardware exports to China have significantly impacted several key semiconductor players, including NVIDIA, AMD, and Intel, barring them from selling high-performance AI chips to Chinese customers. The ban has notably affected NVIDIA's GeForce RTX 4090 gaming GPUs, pushing them out of mainland China due to their high computational capabilities. In anticipation of these restrictions, NVIDIA reportedly moved a substantial inventory of its AD102 GPUs and GeForce RTX 4090 graphics cards to China, as we reported earlier. This may have contributed to the global RTX 4090 shortage, driving the prices of these cards up to $2,000. In an interesting turn of events, insiders on Chinese Baidu forums have disclosed that specialized factories across China are repurposing these GPUs, which arrived before the ban, into AI solutions.

This transformation involves disassembling the gaming GPUs, removing the cooling systems, and extracting the AD102 GPU and GDDR6X memory from the main PCBs. These components are then re-soldered onto a domestically manufactured "reference" PCB, better suited for AI applications, and equipped with dual-slot blower-style coolers designed for server environments. The third-party coolers these GPUs originally ship with are 3-4 slots thick, whereas the blower-style cooler is only two slots wide, so many cards can be installed side by side in an AI server. After rigorous testing, these reconfigured RTX 4090 AI solutions are supplied to Chinese companies running AI workloads. This adaptation process has resulted in an influx of RTX 4090 coolers and bare PCBs into the Chinese reseller market at markedly low prices, given that the primary GPU and memory components have been removed.
Below, you can see the dismantling of AIB GPUs before getting turned into blower-style AI server-friendly graphics cards.

NVIDIA Experiences Strong Cloud AI Demand but Faces Challenges in China, with High-End AI Server Shipments Expected to Be Below 4% in 2024

NVIDIA's most recent FY3Q24 financial reports reveal record-high revenue from its data center segment, driven by escalating demand for AI servers from major North American CSPs. However, TrendForce points out that recent US government sanctions targeting China have impacted NVIDIA's business in the region. Despite strong shipments of NVIDIA's high-end GPUs—and the rapid introduction of compliant products such as the H20, L20, and L2—Chinese cloud operators are still in the testing phase, making substantial revenue contributions to NVIDIA unlikely in Q4. Gradual shipment increases are expected from the first quarter of 2024.

The US ban continues to influence China's foundry market as Chinese CSPs' high-end AI server shipments potentially drop below 4% next year
TrendForce reports that North American CSPs like Microsoft, Google, and AWS will remain key drivers of high-end AI servers (including those with NVIDIA, AMD, or other high-end ASIC chips) from 2023 to 2024. Their shipment shares for 2024 are estimated at 24%, 18.6%, and 16.3%, respectively. Chinese CSPs such as ByteDance, Baidu, Alibaba, and Tencent (BBAT) are projected to have a combined shipment share of approximately 6.3% in 2023. However, this could decrease to less than 4% in 2024, considering the current and potential future impacts of the ban.

NVIDIA Might be Forced to Cancel US$5 Billion Worth of Orders from China

The U.S. Commerce Department seems to have thrown a big spanner into the NVIDIA machinery by informing the company that some US$5 billion worth of AI chip orders for China fall under the latest US export restrictions. The orders are said to have been heading to Alibaba, ByteDance, and Baidu, as well as possibly other major tech companies in China. NVIDIA's shares dropped sharply when the market opened in the US earlier today, by close to five percent, pushing NVIDIA's market cap below the US$1 trillion mark. The share price recovered somewhat in the afternoon, putting NVIDIA back in the trillion-dollar club.

Based on a statement to Reuters, NVIDIA doesn't seem overly concerned despite what appears to be a huge loss in sales, with a company spokesperson stating: "These new export controls will not have a meaningful impact in the near term." The US government will implement the new export restrictions from November, which gave NVIDIA little chance to avoid them, and it looks as if the company will have to find new customers for the AI chips. Considering the current demand for NVIDIA's chips, though, this might not be much of a challenge.

Baidu Launches ERNIE 4.0 Foundation Model, Leading a New Wave of AI-Native Applications

Baidu, Inc., a leading AI company with a strong Internet foundation, today hosted its annual flagship technology conference Baidu World 2023 in Beijing, marking the conference's return to an offline format after four years. With the theme "Prompt the World," this year's Baidu World conference saw Baidu launch ERNIE 4.0, Baidu's next-generation and most powerful foundation model offering drastically enhanced core AI capabilities. Baidu also showcased some of its most popular applications, solutions, and products re-built around the company's state-of-the-art generative AI.

"ERNIE 4.0 has achieved a full upgrade with drastically improved performance in understanding, generation, reasoning, and memory," Robin Li, Co-founder, Chairman and CEO of Baidu, said at the event. "These four core capabilities form the foundation of AI-native applications and have now unleashed unlimited opportunities for new innovations."

Major CSPs Aggressively Constructing AI Servers and Boosting Demand for AI Chips and HBM, Advanced Packaging Capacity Forecasted to Surge 30~40%

TrendForce reports that explosive growth in generative AI applications like chatbots has spurred significant expansion in AI server development in 2023. Major CSPs including Microsoft, Google, AWS, as well as Chinese enterprises like Baidu and ByteDance, have invested heavily in high-end AI servers to continuously train and optimize their AI models. This reliance on high-end AI servers necessitates the use of high-end AI chips, which in turn will not only drive up demand for HBM during 2023~2024, but is also expected to boost growth in advanced packaging capacity by 30~40% in 2024.

TrendForce highlights that to augment the computational efficiency of AI servers and enhance memory transmission bandwidth, leading AI chip makers such as Nvidia, AMD, and Intel have opted to incorporate HBM. Presently, Nvidia's A100 and H100 chips boast up to 80 GB of HBM2e and HBM3, respectively. In its latest integrated CPU and GPU, the Grace Hopper Superchip, Nvidia expanded a single chip's HBM capacity by 20%, hitting a mark of 96 GB. AMD's MI300 also uses HBM3, with the MI300A capacity remaining at 128 GB like its predecessor, while the more advanced MI300X has ramped up to 192 GB, marking a 50% increase. Google is expected to broaden its partnership with Broadcom in late 2023 to produce the ASIC AI accelerator chip TPU, which will also incorporate HBM memory, in order to extend AI infrastructure.
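The percentage jumps quoted above are easy to verify with simple arithmetic (a minimal sketch; the helper function is our own illustration):

```python
def pct_increase(old_gb: float, new_gb: float) -> float:
    """Percentage capacity increase from old_gb to new_gb."""
    return (new_gb - old_gb) / old_gb * 100

# Grace Hopper's 96 GB vs. the 80 GB on A100/H100: a 20% bump
assert round(pct_increase(80, 96)) == 20
# MI300X's 192 GB vs. MI300A's 128 GB: a 50% increase
assert round(pct_increase(128, 192)) == 50
```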

Chinese Tech Firms Buying Plenty of NVIDIA Enterprise GPUs

TikTok developer ByteDance, and other major Chinese tech firms including Tencent, Alibaba and Baidu are reported (by local media) to be snapping up lots of NVIDIA HPC GPUs, with even more orders placed this year. ByteDance is alleged to have spent enough on new products in 2023 to match the expenditure of the entire Chinese tech market on similar NVIDIA purchases for FY2022. According to news publication Jitwei, ByteDance has placed orders totaling $1 billion so far this year with Team Green—the report suggests that a mix of A100 and H800 GPU shipments have been sent to the company's mainland data centers.

The older Ampere-based A100 units were likely ordered prior to trade sanctions enforced on China post-August 2022, with further wiggle room allowed—meaning that shipments continued until September. The H800 GPU is a cut-down variant of 2022's flagship "Hopper" H100 model, designed specifically for the Chinese enterprise market—with reduced performance in order to meet export restriction standards. The H800 costs around $10,000 (average sale price per accelerator) according to Tom's Hardware, so it must offer some level of potency at that price. ByteDance has ordered roughly 100,000 units—with an unspecified split between H800 and A100 stock. Despite the development of competing HPC products within China, it seems that the nation's top-flight technology companies are heading directly to NVIDIA to acquire the best-of-the-best and highly mature AI processing hardware.

Shipments of AI Servers Will Climb at CAGR of 10.8% from 2022 to 2026

According to TrendForce's latest survey of the server market, many cloud service providers (CSPs) have begun large-scale investments in the kinds of equipment that support artificial intelligence (AI) technologies. This development is in response to the emergence of new applications such as self-driving cars, artificial intelligence of things (AIoT), and edge computing since 2018. TrendForce estimates that in 2022, AI servers equipped with general-purpose GPUs (GPGPUs) accounted for almost 1% of annual global server shipments. Moving into 2023, shipments of AI servers are projected to grow by 8% YoY thanks to chatbots and similar applications generating demand across AI-related fields. Furthermore, shipments of AI servers are forecasted to increase at a CAGR of 10.8% from 2022 to 2026.
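For readers unfamiliar with the metric, compound annual growth rate (CAGR) compounds year over year; the sketch below shows what a 10.8% CAGR over the four years from 2022 to 2026 implies in total growth (our own illustration, not TrendForce's calculation):

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

# A 10.8% CAGR sustained over 4 years compounds to roughly 51% total growth:
total_growth = (1 + 0.108) ** 4 - 1   # ~0.507

# Recovering the annual rate from the endpoints round-trips:
assert abs(cagr(1.0, 1.0 + total_growth, 4) - 0.108) < 1e-9
```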

TrendForce: YoY Growth Rate of Global Server Shipments for 2023 Has Been Lowered to 1.31%

The four major North American cloud service providers (CSPs) have made cuts to their server procurement quantities for this year because of economic headwinds and high inflation. Turning to server OEMs such as Dell and HPE, they are observed to have scaled back the production of server motherboards at their ODM partners. Given these developments, TrendForce now projects that global server shipments will grow by just 1.31% YoY to 14.43 million units for 2023. This latest figure is a downward correction from the earlier estimation. The revisions that server OEMs have made to their shipment outlooks show that demand for end products has become much weaker than expected. They also highlight factors such as buyers of enterprise servers imposing stricter control of their budgets and server OEMs' inventory corrections.

TikTok's Parent Company ByteDance Starts Developing Custom Processors

TikTok's parent company ByteDance has recently begun hiring chip designers to help develop specialized processors for fields where they haven't been able to find existing suppliers. The company is looking to design chips that are optimized for hosting their video, information, and entertainment apps without any plans to sell these processors to other companies. This latest announcement follows various other Chinese companies such as Alibaba and Baidu in developing custom processors to decrease their reliance on foreign companies and improve performance in specific tasks. The initial job listings only include 31 openings for positions such as experts, specialists, and interns with more staff likely required in the future.

GCP, AWS Projected to Become Main Drivers of Global Server Demand with 25-30% YoY Increase in Server Procurement, Says TrendForce

Thanks to their flexible pricing schemes and diverse service offerings, CSPs have been a direct, major driver of enterprise demand for cloud services, according to TrendForce's latest investigations. As such, the rise of CSPs has in turn brought about a gradual shift in the prevailing business model of server supply chains from sales of traditional branded servers (that is, server OEMs) to ODM Direct sales instead. Incidentally, the global public cloud market operates as an oligopoly dominated by North American companies including Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), which collectively possess an above-50% share of this market. More specifically, GCP and AWS are the most aggressive in their data center build-outs. Each of these two companies is expected to increase its server procurement by 25-30% YoY this year, followed closely by Azure.

Global Server Shipment for 2021 Projected to Grow by More than 5% YoY, Says TrendForce

Enterprise demand for cloud services has been rising steadily in the past two years owing to the rapidly changing global markets and uncertainties brought about by the COVID-19 pandemic. TrendForce's investigations find that most enterprises have been prioritizing cloud service adoption across applications ranging from AI to other emerging technologies as cloud services have relatively flexible costs. Case in point, demand from clients in the hyperscale data center segment constituted more than 40% of total demand for servers in 4Q20, while this figure may potentially approach 45% for 2021. For 2021, TrendForce expects global server shipments to increase by more than 5% YoY and ODM Direct server shipments to increase by more than 15% YoY.

Logitech and Baidu Brain Partner to Transform the Way We Work Using AI and Voice

Today Logitech announced a long-term partnership with Baidu Brain, beginning with the launch of its intuitive new Logitech Voice M380 Wireless Mouse with Speech Input in China. Designed especially for people who create large amounts of content, this innovative product lets you dictate with your voice, creating content two or three times faster than typing. The Logitech Voice M380 Wireless Mouse is powered exclusively by intelligent Baidu Speech technology from Baidu Brain and features the comfort, performance, and quality that users expect in a Logitech mouse.

"We saw an opportunity to leverage the power of Baidu AI to bring fast, accurate speech recognition to our customers and the result is pure magic—a mouse that allows you to instantly start dictating with your voice at the click of a button," said Delphine Donne-Crock, general manager of the creativity and productivity business group at Logitech. "We are thrilled to tap into Baidu's AI superpower for the launch of Logitech Voice M380, and we look forward to collaborating on future products and solutions that unleash everyone's productivity and creativity in the digital world."

NVIDIA Extends Data Center Infrastructure Processing Roadmap with BlueField-3 DPU

NVIDIA today announced the NVIDIA BlueField-3 DPU, its next-generation data processing unit, to deliver the most powerful software-defined networking, storage and cybersecurity acceleration capabilities available for data centers.

The first DPU built for AI and accelerated computing, BlueField-3 lets every enterprise deliver applications at any scale with industry-leading performance and data center security. It is optimized for multi-tenant, cloud-native environments, offering software-defined, hardware-accelerated networking, storage, security and management services at data-center scale.

ASUS ROG Zephyrus Duo 15 Owners are Applying Custom GPU vBIOS with Higher TGP Presets

With NVIDIA's GeForce RTX 30-series lineup of GPUs, laptop manufacturers are offered a wide variety of GPU SKUs that internally differ simply by having different Total Graphics Power (TGP), which in turn results in different clock speeds and thus different performance. ASUS uses NVIDIA's GeForce RTX 3080 mobile GPU inside the company's ROG Zephyrus Duo (GX551QS) with a TGP of 115 Watts, and Dynamic Boost technology that can ramp the card up to 130 Watts. However, this doesn't represent the maximum for the RTX 3080 mobile: its TGP ceiling goes up to 150 Watts, a sizeable bump that lets the GPU reach higher frequencies and deliver more performance.

Have you ever wondered what would happen if you manually applied a vBIOS that allows the card to use more power? Well, Baidu forum users are reporting a successful experiment transforming their 115 W RTX 3080 into a 150 W TGP card. They took the GPU vBIOS from the MSI Leopard G76, which features a 150 W power limit, and flashed it onto the power-limited RTX 3080 cards in the ROG Zephyrus Duo. Users have successfully used this vBIOS to squeeze more performance out of their laptops; as seen on the 3DMark Time Spy rank list, the entries are now dominated by modified laptops. The performance improvement reaches up to 20%.

Hot Chips 2020 Program Announced

Today the Hot Chips program committee officially announced the August conference line-up, posted to hotchips.org. For this first-ever live-streamed Hot Chips Symposium, the program is better than ever!

In a session on deep learning training for data centers, we have a mix of talks from the internet giant Google showcasing their TPUv2 and TPUv3, and a talk from startup Cerebras on their 2nd gen wafer-scale AI solution, as well as ETH Zurich's 4096-core RISC-V based AI chip. And in deep learning inference, we have talks from several of China's biggest AI infrastructure companies: Baidu, Alibaba, and SenseTime. We also have some new startups that will showcase their interesting solutions—Lightmatter talking about its optical computing solution, and Tenstorrent giving a first look at its new architecture for AI.

Samsung Starts Production of AI Chips for Baidu

Baidu, a leading Chinese-language Internet search provider, and Samsung Electronics, a world leader in advanced semiconductor technology, today announced that Baidu's first cloud-to-edge AI accelerator, Baidu KUNLUN, has completed its development and will be mass-produced early next year. Baidu KUNLUN chip is built on the company's advanced XPU, a home-grown neural processor architecture for cloud, edge, and AI, as well as Samsung's 14-nanometer (nm) process technology with its I-Cube (Interposer-Cube) package solution.

The chip offers 512 gigabytes per second (GBps) of memory bandwidth and supplies up to 260 tera operations per second (TOPS) at 150 watts. In addition, the new chip allows Ernie, a pre-training model for natural language processing, to infer three times faster than the conventional GPU/FPGA-accelerated model. Leveraging the chip's limit-pushing computing power and power efficiency, Baidu can effectively support a wide variety of functions, including large-scale AI workloads such as search ranking, speech recognition, image processing, natural language processing, autonomous driving, and deep learning platforms like PaddlePaddle.
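Those headline numbers imply a power-efficiency figure that is easy to derive (our own arithmetic from the specs quoted above, not a vendor-published metric):

```python
peak_tops = 260      # tera operations per second, as quoted
power_watts = 150    # chip power, as quoted

# Works out to roughly 1.73 TOPS per watt for Baidu KUNLUN
efficiency = peak_tops / power_watts
print(f"KUNLUN efficiency: {efficiency:.2f} TOPS/W")
```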

Baidu Unveils 'Kunlun' High-Performance AI Chip

Baidu Inc. today announced Kunlun, China's first cloud-to-edge AI chip, built to accommodate high performance requirements of a wide variety of AI scenarios. The announcement includes training chip "818-300" and inference chip "818-100". Kunlun can be applied to both cloud and edge scenarios, such as data centers, public clouds and autonomous vehicles.

Kunlun is a high-performance and cost-effective solution for the high processing demands of AI. It leverages Baidu's AI ecosystem, which includes AI scenarios like search ranking and deep learning frameworks like PaddlePaddle. Baidu's years of experience in optimizing the performance of these AI services and frameworks afforded the company the expertise required to build a world class AI chip.

MSI Z170A Xpower Titanium Modded, Supports Intel Core i3-8350K

The seemingly impossible (as per Intel) has happened: an Intel Z170 motherboard has been made to support Intel's latest Core i3-8350K. This news comes after various reports and counter-reports argued for and against this being viable, based on motherboard socket pin count and function allocation. That this happened not on a Z270 motherboard but on a Z170 really does open our eyes as customers to the sort of games tech companies might be playing with product refreshes and new motherboard chipsets.

Seagate and Baidu Sign Strategic Cooperation Agreement for Big Data Analysis

Seagate Technology plc., a world leader in storage solutions, today announced the signing of a strategic cooperation agreement with Baidu (NASDAQ: BIDU), the leading Chinese language internet search provider, covering the fields of information technology, big data analysis and advanced storage system development and implementation.

The pact renews an existing agreement between the two firms signed in September 2014, under which Baidu would give priority to Seagate when selecting storage products and solutions, and Seagate would give advanced access to products, services and support to Baidu, as well as assign a dedicated team of engineers to the company.

The cooperation between the two parties builds on this, with Seagate doubling the amount of technical cooperation and forging closer links to Baidu for its business needs. With regard to new products, Baidu will be among the first in China to implement Seagate's new storage products, and the two sides will jointly develop customized systems to meet Baidu's business needs. In addition, the procurement model for both companies will be further upgraded to save costs for each side.

NVIDIA Unveils Palm-Sized, Energy-Efficient AI Computer for Self-Driving Cars

NVIDIA today unveiled a palm-sized, energy-efficient artificial intelligence (AI) computer that automakers can use to power automated and autonomous vehicles for driving and mapping. The new single-processor configuration of the NVIDIA DRIVE PX 2 AI computing platform for AutoCruise functions -- which include highway automated driving and HD mapping -- consumes just 10 watts of power and enables vehicles to use deep neural networks to process data from multiple cameras and sensors. It will be deployed by China's Baidu as the in-vehicle car computer for its self-driving cloud-to-car system.

DRIVE PX 2 enables automakers and their tier 1 suppliers to accelerate production of automated and autonomous vehicles. A car using the small form-factor DRIVE PX 2 for AutoCruise can understand in real time what is happening around it, precisely locate itself on an HD map and plan a safe path forward. "Bringing an AI computer to the car in a small, efficient form factor is the goal of many automakers," said Rob Csongor, vice president and general manager of Automotive at NVIDIA. "NVIDIA DRIVE PX 2 in the car solves this challenge for our OEM and tier 1 partners, and complements our data center solution for mapping and training."


