News Posts matching #GPU


ASUS Republic of Gamers Announces Completely Redesigned Zephyrus G14 and G16

ASUS Republic of Gamers (ROG) today announced the 2024 Zephyrus G14 and Zephyrus G16, the latest in an illustrious lineup of supremely powerful thin-and-light gaming laptops. These machines feature a new CNC-machined aluminium chassis, a customizable Slash Lighting array, and a brand-new Platinum White colorway, while cutting-edge AI-accelerated silicon from Intel, AMD, and NVIDIA stands ready to push gamers and creators to new heights of performance. Both the Zephyrus G14 and G16 come equipped with the ROG Nebula Display, a stunningly color-accurate OLED panel that is also G-SYNC capable for incredible gaming experiences. Ultra-efficient cooling technology, including tri-fan designs, liquid metal, and vapor chambers on select models, enables the Zephyrus G14 and G16 to breathe easily despite their ultra-portable designs.

Brand-new chassis design
The 2024 Zephyrus G14 and Zephyrus G16 have been completely redesigned inside and out. Both machines boast all-new and all-aluminium CNC-machined chassis for the perfect mix of weight reduction, structural rigidity, and increased chassis space. This allows for an edge-to-edge keyboard design, as well as the inclusion of larger and louder speakers with superior bass response down to 100 Hz. The speakers are 25% larger than the previous generation, with a 47% volume increase for more immersive audio experiences than ever before. The Zephyrus G14 and G16 also come with larger individual keycaps and a larger touchpad, for superior typing, precision scrolling, and fluid gaming. Both the 2024 Zephyrus G14 and Zephyrus G16 ship with three months of Xbox Game Pass Ultimate, providing access to a library of hundreds of great games.

PNY Unveils the NVIDIA GeForce RTX SUPER 40-Series GPU Family

PNY announced today the arrival of the new VERTO GeForce RTX 4080 SUPER 16GB, RTX 4070 Ti SUPER 16GB, and RTX 4070 SUPER 12GB graphics cards to its lineup of NVIDIA GeForce RTX GPUs. The latest generation of RTX, the GeForce RTX SUPER 40-series graphics cards are blazingly fast, offering gamers and creators an unparalleled boost in performance, neural rendering, and many more cutting-edge platform features. They are fueled by the revolutionary NVIDIA Ada Lovelace architecture, a major advancement in GPU technology that empowers accelerated content production techniques, amazing AI capabilities, and hyper-realistic gaming experiences.

The new GeForce RTX SUPER GPUs are the ultimate way to experience AI on PCs. Specialized AI Tensor Cores deliver up to 836 AI TOPS, bringing transformative capabilities to AI in gaming, creating, and everyday productivity. PC gamers demand the very best in visual quality, and AI-powered NVIDIA Deep Learning Super Sampling (DLSS) Super Resolution, Frame Generation, and Ray Reconstruction combine with ray tracing to offer stunning worlds. With DLSS, seven out of eight pixels can be AI-generated, accelerating full ray tracing by up to 4x with better image quality.
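The "seven out of eight" figure follows from simple arithmetic. Here is a minimal sketch, assuming DLSS Performance mode (rendering internally at 1080p, outputting 4K) combined with Frame Generation; this is an illustrative calculation, not NVIDIA's exact accounting:

```python
# Illustrative arithmetic only, assuming DLSS Performance mode:
# render internally at 1080p while outputting at 4K.
render_px = 1920 * 1080            # pixels actually rendered per frame
output_px = 3840 * 2160            # pixels displayed per frame

upscale = output_px / render_px    # 4x: 3 of 4 output pixels are AI-generated

# Frame Generation synthesizes one entire AI frame per rendered frame,
# so only every other output frame is rendered at all.
rendered_fraction = (1 / upscale) * (1 / 2)
ai_fraction = 1 - rendered_fraction

print(rendered_fraction, ai_fraction)  # 0.125 0.875 -> 7 of 8 pixels AI-generated
```

One in eight pixels rendered also lines up with the "up to 4x" full ray tracing speedup claim, since only a quarter of each rendered frame's pixels is traced natively.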

GIGABYTE Launches AMD Radeon RX 7600 XT 16GB Graphics Card

GIGABYTE TECHNOLOGY Co. Ltd, a leading manufacturer of premium gaming hardware, today launched a new graphics card powered by the AMD RDNA 3 architecture. The GIGABYTE AMD Radeon RX 7600 XT GAMING OC 16G graphics card comes with the top-of-the-line WINDFORCE cooling system from GIGABYTE. It delivers unmatched performance, stunning visual effects, and exceptional efficiency, perfect for a smooth 1080p gaming and streaming experience.

The GIGABYTE WINDFORCE cooling system is tailored for gamers, boasting three unique blade fans with alternate spinning, composite copper heat pipes in direct contact with the GPU, 3D active fans and screen cooling. The Alternate Spinning technology rotates the central fan in the opposite direction of the side fans, directing airflow in the same direction and doubling air pressure while reducing turbulence. This design effectively dissipates heat from both the top and the bottom of the graphics card, resulting in improved overall cooling performance.

Acer Expands SpatialLabs Stereoscopic 3D Portfolio with New Laptop and Gaming Monitor

Acer today announced the extension of its SpatialLabs stereoscopic 3D lineup to the Aspire line of laptops and Predator gaming monitors.

The new Aspire 3D 15 SpatialLabs Edition laptop delivers captivating 3D content for entertainment and creation on its 15.6-inch UHD display. It also comes with a suite of AI-powered SpatialLabs applications for 3D viewing and content creation without the need for specialized glasses, delighting users watching their favorite content and empowering developers to see their designs in their real 3D forms. With Microsoft Copilot in Windows 11, users can experience upscaled creativity and productivity with AI-powered task assistance, while Acer's suite of AI-supported solutions in Acer PurifiedView and PurifiedVoice elevates conference calls on the 3D laptop.

VESA Updates Adaptive-Sync Display Standard with New Dual-Mode Support

The Video Electronics Standards Association (VESA) today announced that it has published an update to its Adaptive-Sync Display Compliance Test Specification (Adaptive-Sync Display CTS), which is the first publicly open standard for front-of-screen performance of variable refresh rate displays. Adaptive-Sync Display version 1.1a provides updated testing procedures and logo support for an emerging category of displays that can operate at different maximum refresh rates when resolution is reduced. This optional "Dual Mode" testing and logo support allows display OEMs with qualifying hardware to certify their products at two different sets of resolution and refresh rate (for example, 4K/144 Hz and 1080p/280 Hz).
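As a rough illustration of what Dual Mode trades off, the raw pixel rates of the example certification pair can be compared. This is a back-of-the-envelope sketch that ignores blanking intervals and panel-driving limits, which real certification accounts for:

```python
# Back-of-the-envelope pixel rates for the example Dual Mode pair
# (4K/144 Hz vs 1080p/280 Hz). Real VESA testing involves far more
# than raw pixel throughput.
def pixel_rate(width, height, refresh_hz):
    return width * height * refresh_hz   # pixels per second, no blanking

uhd_144 = pixel_rate(3840, 2160, 144)    # ~1.19 billion px/s
fhd_280 = pixel_rate(1920, 1080, 280)    # ~0.58 billion px/s

# Dropping to 1080p frees enough bandwidth that even 280 Hz consumes
# less than half the pixel rate of 4K at 144 Hz.
print(fhd_280 / uhd_144)
```

This is why a display can legitimately advertise a much higher maximum refresh rate at the reduced resolution without exceeding its link or panel limits.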

Adaptive-Sync Display v1.1a also includes an update that allows display OEMs to achieve a higher AdaptiveSync Display refresh rate certification for displays that support an "overclocked" or faster mode option that is not enabled by default in the factory configuration. In such cases, the overclocked mode must support Adaptive-Sync-enabled GPUs in a non-proprietary manner, and the display must pass all of the rigorous Adaptive-Sync Display compliance tests in its factory default mode and then be completely retested a second time in the overclocked mode. Both the dual-mode and overclocking changes in Adaptive-Sync Display CTS v1.1a apply only to the VESA Certified AdaptiveSync Display logo program; they do not apply to the VESA Certified MediaSync Display logo program. To date, more than 100 products have been certified to the Adaptive-Sync Display standard. A complete list of Adaptive-Sync Display certified products can be found at https://www.adaptivesync.org/certified-products/.

AMD Withholds Radeon RX 7600 XT Launch in China Amid Strong RX 6750 GRE Sales

According to the latest round of reports, AMD has decided not to include China in the initial global launch of its upcoming Radeon RX 7600 XT graphics card. The RX 7600 XT, featuring 16 GB of memory and based on AMD's next-generation RDNA 3 architecture, was expected to launch soon at a price of around $300. However, the company is currently re-evaluating its Chinese GPU launch strategy due to the runaway success of its existing Radeon RX 6750 Golden Rabbit Edition (GRE) series in the region. The RX 6750 GRE cards with 10 GB and 12 GB configurations retail between $269-$289 in China, offering exceptional value compared to rival NVIDIA RTX models. AMD seems hesitant to risk undercutting sales of its popular RX 6750 GPUs by launching the newer 7600 XT.

While the RX 7600 XT promises more raw performance thanks to its advanced RDNA 3 architecture, the 6750 GRE, with its RDNA 2 design, seemingly remains capable enough for most Chinese mainstream gamers. With the RX 6750 GRE still selling strongly in China, AMD has postponed the RX 7600 XT introduction for this key market. Final launch timelines for the 7600 XT in China and globally remain unconfirmed by AMD at the time of writing. The company appears to be treading cautiously amidst the shifting competitive landscape.

TSMC Plans to Put a Trillion Transistors on a Single Package by 2030

During the recent IEDM conference, TSMC previewed its process roadmap for delivering next-generation chip packages packing over one trillion transistors by 2030. This aligns with similar long-term visions from Intel. Such enormous transistor counts will come through advanced 3D packaging of multiple chiplets. But TSMC also aims to push monolithic chip complexity higher, ultimately enabling 200 billion transistor designs on a single die. This requires steady enhancement of TSMC's planned N2, N2P, N1.4, and N1 nodes, which are slated to arrive between now and the end of the decade. While multi-chiplet architectures are currently gaining favor, TSMC asserts both packaging density and raw transistor density must scale up in tandem. For perspective on the magnitude of TSMC's goals, consider NVIDIA's 80-billion-transistor GH100 GPU, among today's largest chips excluding wafer-scale designs from Cerebras.

Yet TSMC's roadmap calls for more than doubling that, first with monolithic designs of over 100 billion transistors, then eventually 200 billion. Of course, yields become more challenging as die sizes grow, which is where advanced packaging of smaller chiplets becomes crucial. Multi-chip module offerings like AMD's MI300X and Intel's Ponte Vecchio already integrate dozens of tiles, with Ponte Vecchio alone comprising 47. TSMC envisions expanding this approach to chip packages housing more than a trillion transistors via its CoWoS, InFO, 3D stacking, and many other technologies. While the scaling cadence has recently slowed, TSMC remains confident in achieving both packaging and process breakthroughs to meet future density demands. The foundry's continuous investment ensures progress in unlocking next-generation semiconductor capabilities. But physics ultimately dictates timelines, no matter how aggressive the roadmap.
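Some quick arithmetic, using the round figures above purely for a sense of scale, shows how many dies of each class a trillion-transistor package would need:

```python
import math

# Round numbers from the article, used purely for scale.
target = 1_000_000_000_000    # one trillion transistors per package

# GH100-class dies today, then TSMC's 100B and 200B monolithic goals.
dies_needed = {per_die: math.ceil(target / per_die)
               for per_die in (80e9, 100e9, 200e9)}

for per_die, dies in dies_needed.items():
    print(f"{per_die / 1e9:.0f}B-transistor dies needed: {dies}")
# 80B -> 13, 100B -> 10, 200B -> 5
```

Halving the chiplet count per package is one concrete reason TSMC wants monolithic density to keep scaling alongside packaging.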

SUNON: Pioneering Innovative Liquid Cooling Solutions for Modern Data Centers

In the era of high-tech development and the ever-increasing demand for data processing power, data centers are consuming more energy and generating excess heat. As a global leader in thermal solutions, SUNON is at the forefront, offering a diverse range of cutting-edge liquid cooling solutions tailored to advanced data centers equipped with high-capacity CPU and GPU computing for AI, edge, and cloud servers.

SUNON's liquid cooling design services are ideally suited for modern data centers, generative AI computing, and high-performance computing (HPC) applications. These solutions are meticulously customized to fit the cooling space and server density of each data center. With their compact yet comprehensive design, they guarantee exceptional cooling efficiency and reliability, ultimately contributing to a significant reduction in a client's total cost of ownership (TCO) in the long term. In the pursuit of net-zero emissions standards, SUNON's liquid cooling solutions play a pivotal role in enhancing corporate sustainability. They offer a win-win scenario for clients seeking to transition toward greener and more digitalized operations.

MemryX Demos Production Ready AI Accelerator (MX3) During 2024 CES Show

MemryX Inc. is announcing the availability of production-level silicon of its cutting-edge AI Accelerator (MX3). MemryX is a pioneering startup specializing in accelerating artificial intelligence (AI) processing for edge devices. Less than 30 days after receiving production silicon from TSMC, MemryX will publicly showcase the ability to efficiently run hundreds of unaltered AI models at the 2024 Consumer Electronics Show (CES) in Las Vegas from Jan 9 through Jan 12.

Apple Wants to Store LLMs on Flash Memory to Bring AI to Smartphones and Laptops

Apple has been experimenting with Large Language Models (LLMs) that power most of today's AI applications. The company wants these LLMs to serve users best and deliver them efficiently, which is a difficult task as they require a lot of resources, including compute and memory. Traditionally, LLMs have required AI accelerators in combination with large DRAM capacity to store model weights. However, Apple has published a paper that aims to bring LLMs to devices with limited memory capacity. By storing LLMs on NAND flash memory (regular storage), the method involves constructing an inference cost model that harmonizes with flash memory behavior, guiding optimization in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Instead of storing the model weights in DRAM, Apple wants to keep them in flash memory and pull them into DRAM on demand, only when they are needed.

Two principal techniques are introduced within this flash memory-informed framework: "windowing" and "row-column bundling." These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches on CPU and GPU, respectively. Integrating sparsity awareness, context-adaptive loading, and a hardware-oriented design paves the way for practical inference of LLMs on devices with limited memory, such as SoCs with 8/16/32 GB of available DRAM. Especially with DRAM costing considerably more per gigabyte than NAND flash, setups such as smartphone configurations could store and run inference on LLMs with multi-billion parameters, even if the available DRAM isn't sufficient to hold the entire model. For a more technical deep dive, read the paper on arXiv here.
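The windowing idea can be illustrated with a toy cache. This is a hypothetical sketch, not Apple's implementation; the `FlashWeightCache` name and its interface are invented for illustration:

```python
# Hypothetical sketch (not Apple's implementation) of "windowing":
# keep a sliding window of recently used weight rows in fast "DRAM",
# and fetch misses from slow "flash" storage on demand.
from collections import OrderedDict

class FlashWeightCache:
    """LRU-style window of weight rows; misses read from slow storage."""
    def __init__(self, flash_store, window_size):
        self.flash = flash_store          # dict: row id -> weight row
        self.window_size = window_size    # rows fitting the DRAM budget
        self.dram = OrderedDict()         # cached rows, LRU order
        self.flash_reads = 0

    def get_row(self, row_id):
        if row_id in self.dram:
            self.dram.move_to_end(row_id)          # hit: refresh recency
        else:
            self.flash_reads += 1                   # miss: read from flash
            self.dram[row_id] = self.flash[row_id]
            if len(self.dram) > self.window_size:   # evict the oldest row
                self.dram.popitem(last=False)
        return self.dram[row_id]

# Consecutive tokens tend to activate overlapping rows, so the window
# absorbs most accesses and flash traffic stays low.
store = {i: [0.0] * 4 for i in range(100)}
cache = FlashWeightCache(store, window_size=8)
for row in [1, 2, 3, 1, 2, 4, 1, 2]:
    cache.get_row(row)
print(cache.flash_reads)  # 4 distinct rows touched -> only 4 flash reads
```

Row-column bundling, the paper's second technique, would additionally group co-accessed data so each flash read is one large contiguous chunk rather than many small ones.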

Phison Predicts 2024: Security is Paramount, PCIe 5.0 NAND Flash Infrastructure Imminent as AI Requires More Balanced AI Data Ecosystem

Phison Electronics Corp., a global leader in NAND flash controller and storage solutions, today announced the company's predictions for 2024 trends in NAND flash infrastructure deployment. The company predicts that rapid proliferation of artificial intelligence (AI) technologies will continue apace, with PCIe 5.0-based infrastructure providing high-performance, sustainable support for AI workload consistency as adoption rapidly expands. PCIe 5.0 NAND flash solutions will be at the core of a well-balanced hardware ecosystem, with private AI deployments such as on-premise large language models (LLMs) driving significant growth in both everyday AI and the infrastructure required to support it.

"We are moving past initial excitement over AI toward wider everyday deployment of the technology. In these configurations, high-quality AI output must be achieved by infrastructure designed to be secure, while also being affordable. The organizations that leverage AI to boost productivity will be incredibly successful," said Sebastien Jean, CTO, Phison US. "Building on the widespread proliferation of AI applications, infrastructure providers will be responsible for making certain that AI models do not run up against the limitations of memory - and NAND flash will become central to how we configure data center architectures to support today's developing AI market while laying the foundation for success in our fast-evolving digital future."

RISC-V Breaks Into Handheld Console Market with Sipeed Lichee Pocket 4A

Chinese company Sipeed has introduced the Lichee Pocket 4A, one of the first handheld gaming devices based on the RISC-V open-source instruction set architecture (ISA). Sipeed positions the device as a retro gaming platform capable of running simple titles via software rendering or GPU acceleration. At its core is Alibaba's T-Head TH1520 processor featuring four 2.50 GHz Xuantie C910 RISC-V general-purpose CPU cores and an unnamed Imagination GPU. The chip was originally aimed at laptop designs. Memory options include 8 GB or 16 GB LPDDR4X RAM and 32 GB or 128 GB of storage. The Lichee Pocket 4A has a 7-inch 1280x800 LCD touchscreen, Wi-Fi/Bluetooth connectivity, and an array of wired ports like USB and Ethernet. It weighs under 500 grams. The device can run Android or Linux distributions like Debian, Ubuntu, and others.

As an early RISC-V gaming entrant, performance expectations should be modest—the focus is retro gaming and small indie titles, not modern AAA games. Specific gaming capabilities remain to be fully tested. However, the release helps showcase RISC-V's potential for consumer electronics and competitive positioning against proprietary ISAs like ARM. Pricing is still undefined, but another Sipeed handheld console retails for around $250 currently. Reception from enthusiasts and developers will demonstrate whether there's a viable market for RISC-V gaming devices. Success could encourage additional hardware experimentation efforts across emerging open architectures. With a 6000 mAh battery, battery life should be decent. Other specifications can be seen in the table below, and the pre-order link is here.

FSP Readies 2500 Watt PSU with Four PCIe 12V-2×6 GPU Power Cables

Taiwanese power supply manufacturer FSP showcased upcoming products for 2023 and 2024. This included new power supply lineups with updated naming schemes: the entry-level VITA series, mid-range ADVAN series, and high-end MEGA and DAGGER series. The simplified naming clarifies the differentiation between affordable, mainstream, and premium offerings across wattages and efficiency certifications. Specific new PSU models include 1500+ Watt beasts for maxed-out systems, redundant server-class units ensuring uptime, and 80 PLUS Titanium efficiency ratings for eco-conscious builds. The star of the show is FSP's flagship unit, which boasts a staggering 2500 Watts, 100% modular cabling, and cutting-edge 12V-2x6 PCIe Gen 5 graphics card power connectors.

Called the Cannon Pro, the 2500-watt power supply has four 12V-2x6 PCIe Gen 5 connectors to feed even the highest power-rated GPUs, as well as three 6+2-pin connectors. This new PSU is also rated for the ATX 3.1 specification and 80 PLUS Platinum efficiency, and it uses the upgraded 12V-2x6 version of the 12VHPWR PCIe Gen 5 connector, which supposedly overcomes all of the earlier connector's issues. The PSU should be able to power four NVIDIA GeForce RTX 4090 GPUs simultaneously with its high capacity. Pricing and availability aren't specified, so we must wait for FSP to launch these products in 2024.

Acer Unleashes New Predator Triton Neo 16 with Intel Core Ultra Processors

Acer today announced the new Predator Triton Neo 16 (PTN16-51) gaming laptop, designed with the new Intel Core Ultra processors with dedicated AI acceleration capabilities and NVIDIA GeForce RTX 40 Series GPUs to support demanding games and creative applications. Players and content creators can marvel at enhanced video game scenes and designs on the laptop's Calman-Verified 16-inch display, with up to a stunning 3.2K resolution and a 165 Hz refresh rate, producing accurate colors right out of the box.

The state-of-the-art cooling system combines a 5th Gen AeroBlade fan and liquid metal thermal grease on the CPU to keep the laptop running at full steam, while users stay on top of communications and device management thanks to the AI-enhanced Acer PurifiedVoice 2.0 software and the PredatorSense utility app. This Windows 11 gaming PC also provides players with amazing performance experiences and one month of Xbox Game Pass for access to hundreds of high-quality PC games.

TYAN Upgrades HPC, AI and Data Center Solutions with the Power of 5th Gen Intel Xeon Scalable Processors

TYAN, a leading server platform design manufacturer and a MiTAC Computing Technology Corporation subsidiary, today introduced upgraded server platforms and motherboards based on the brand-new 5th Gen Intel Xeon Scalable Processors, formerly codenamed Emerald Rapids.

The 5th Gen Intel Xeon processor has increased to up to 64 cores, featuring a larger shared cache, higher UPI and DDR5 memory speeds, as well as PCIe 5.0 with 80 lanes. Growing and excelling with workload-optimized performance, 5th Gen Intel Xeon delivers more compute power and faster memory within the same power envelope as the previous generation. "5th Gen Intel Xeon is the second processor offering inside the 2023 Intel Xeon Scalable platform, offering improved performance and power efficiency to accelerate TCO and operational efficiency," said Eric Kuo, Vice President of Server Infrastructure Business Unit, MiTAC Computing Technology Corporation. "By harnessing the capabilities of Intel's new Xeon CPUs, TYAN's 5th Gen Intel Xeon-supported solutions are designed to handle the intense demands of HPC, data centers, and AI workloads."

United States Eases Stance on NVIDIA AI Chip Exports to China

The United States is softening restrictions on GPU maker NVIDIA's sales of artificial intelligence chips to China. While still limiting advanced chip exports deemed strategically threatening, Commerce Secretary Gina Raimondo clarified this week that NVIDIA could supply some AI processors to Chinese commercial companies. Previously, Raimondo had sharply criticized NVIDIA for attempting to sidestep regulations on selling powerful GPUs abroad. Her comments followed rumors that NVIDIA had tweaked chip designs to narrowly avoid newly imposed export controls. However, after discussions between Raimondo and NVIDIA CEO Jensen Huang, the Commerce Department says NVIDIA and other US firms will be permitted to export AI chips to China for general commercial use cases. Exports are still banned on the very highest-end GPUs that could enable China to train advanced AI models rivaling American developments.

Raimondo said NVIDIA will collaborate with the US to comply with the export rules. Huang reaffirmed the company's commitment to adherence. The clarification may ease pressures on NVIDIA, as China accounts for up to 25% of its revenue. While optimistic about recent Chinese approvals for US joint ventures, Raimondo noted frustrations linger around technology controls integral to national security. The nuanced recalibration of restrictions illustrates the balances the administration must strike between economic and security interests. As one of the first big US technology exporters impacted by tightened restrictions, NVIDIA's ability to still partly supply the valuable Chinese chip market points to a selective enforcement approach from regulators in the future.

No Overclocking and Lower TGP for NVIDIA GeForce RTX 4090 D Edition for China

NVIDIA is preparing to launch the GeForce RTX 4090 D, or "Dragon" edition, designed explicitly for China. To comply with US export rules covering GPUs that could potentially be used for AI acceleration, the GeForce RTX 4090 D reportedly cuts back on overclocking as a feature. According to BenchLife, the AD102-250 GPU used in the RTX 4090 D will not support overclocking at all, with the feature possibly disabled by firmware and/or physically in the die. Information from @Zed__Wang suggests that the Dragon version will run at a 2280 MHz base frequency, higher than the 2235 MHz of the AD102-300 found in the regular RTX 4090, and a 2520 MHz boost, matching the regular version.

Interestingly, the RTX 4090 D for China will also feature a slightly lower Total Graphics Power (TGP) of 425 Watts, down from the 450 Watts of the regular model. With the memory configuration appearing to be the same, this new China-specific model will most likely perform within a few percent of the original design. The higher base frequency likely points to a slightly reduced CUDA core count, trimmed to comply with US export regulation policy while still serving the Chinese GPU market. The NVIDIA GeForce RTX 4090 D is scheduled for rollout in January 2024 in China, which is just a few weeks away.

GIGABYTE Unveils Next-gen HPC & AI Servers with AMD Instinct MI300 Series Accelerators

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers and IT infrastructure, today announced the GIGABYTE G383-R80 for the AMD Instinct MI300A APU and two GIGABYTE G593 series servers for the AMD Instinct MI300X GPU and AMD EPYC 9004 Series processor. As a testament to the performance of the AMD Instinct MI300 Series family of products, the El Capitan supercomputer at Lawrence Livermore National Laboratory uses the MI300A APU to power exascale computing. These new GIGABYTE servers are the ideal platform to propel discoveries in HPC & AI at exascale.

Marrying a CPU & GPU: G383-R80
For incredible advancements in HPC, there is the GIGABYTE G383-R80, which houses four LGA6096 sockets for MI300A APUs. This chip integrates a CPU with twenty-four AMD Zen 4 cores and a powerful GPU built with AMD CDNA 3 cores, while the chiplet design shares 128 GB of unified HBM3 memory for impressive performance on large AI models. The G383 server has plenty of expansion slots for networking, storage, or other accelerators, with a total of twelve PCIe Gen 5 slots. At the front of the chassis are eight 2.5" Gen 5 NVMe bays to handle heavy workloads such as real-time big data analytics and latency-sensitive workloads in finance and telecom.

GALAX GeForce RTX 4060 Ti Max 16 GB Unparalleled Max is the First Single-Slot RTX 40 Series GPU

NVIDIA's GeForce RTX 40 series lineup of graphics cards has been dominated by massive cooler designs ranging from three to four slots in thickness, with gamers rarely even getting a standard two-slot option. However, GALAX recently announced a novel entry into this lineup: the RTX 4060 Ti Max 16 GB Unparalleled Max, a GPU noted for its unprecedented single-slot design and exceptionally thin 20 mm profile. This model, previously previewed on GALAX's China website, stands out with its unique vapor chamber cooling system paired with a copper heatsink, diverging from the typical multi-fan setups seen in the market. Measuring 267x111x20 mm, the design is very friendly toward smaller cases with room for only a single-slot cooler.

The RTX 4060 Ti Max is set to operate at a default clock speed of 2535 MHz, with a power target of 165 Watts, suggesting a solid performance base for GPU-intensive sessions. Currently, GALAX has yet to indicate the availability of an 8 GB version or a non-Ti model with this cooler, as only a 16 GB version has been shown. Interestingly, GALAX has made overclocking the card possible; however, the voltage regulation setup of 6+2 VRM phases on a six-layer PCB does not provide an ideal overclocking platform. Additionally, while feasible, overclocking the GPU with such a tiny single-slot cooler should be approached cautiously.
More images, along with specification table (in Chinese), can be seen below.

Ethernet Switch Chips are Now Infected with AI: Broadcom Announces Trident 5-X12

Artificial intelligence has been a hot topic this year, and everything is now an AI processor, from CPUs to GPUs, NPUs, and many others. However, it was only a matter of time before we saw an integration of AI processing elements into the networking chips. Today, Broadcom announced its new Ethernet switching silicon called Trident 5-X12. The Trident 5-X12 delivers 16 Tb/s of bandwidth, double that of the previous Trident generation while adding support for fast 800G ports for connection to Tomahawk 5 spine switch chips. The 5-X12 is software-upgradable and optimized for dense 1RU top-of-rack designs, enabling configurations with up to 48x200G downstream server ports and 8x800G upstream fabric ports. The 800G support is added using 100G-PAM4 SerDes, which enables up to 4 m DAC and linear optics.

However, this is not only a switch chip on its own. Broadcom has added AI processing elements in an inference engine called NetGNT (Networking General-purpose Neural-network Traffic-analyzer). It can detect common traffic patterns and optimize data movement across the chip. Specifically, the company has listed an example of the system doing AI/ML workloads. In that case, NetGNT performs intelligent traffic analysis to avoid network congestion in these workloads. For example, it can detect the so-called "incast" patterns in real-time, where many flows converge simultaneously on the same port. By recognizing the start of incast early, NetGNT can invoke hardware-based congestion control techniques to prevent performance degradation without added latency.
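The incast pattern NetGNT looks for can be illustrated with a simple per-port flow counter. This is a hypothetical sketch of the traffic pattern only, not Broadcom's NetGNT, which is a trained neural-network inference engine in silicon; the `detect_incast` function, its thresholds, and the event format are invented for illustration:

```python
# Hypothetical sketch (not Broadcom's NetGNT): flag "incast", i.e. many
# distinct flows converging on one egress port within a short window.
from collections import defaultdict

def detect_incast(events, window_us=100, threshold=8):
    """events: (timestamp_us, flow_id, egress_port) tuples. Returns the
    ports where >= threshold distinct flows arrive within one window."""
    flagged = set()
    per_port = defaultdict(list)              # port -> [(ts, flow_id)]
    for ts, flow, port in sorted(events):
        per_port[port].append((ts, flow))
        # keep only entries inside the sliding window
        per_port[port] = hist = [(t, f) for t, f in per_port[port]
                                 if ts - t <= window_us]
        if len({f for _, f in hist}) >= threshold:
            flagged.add(port)   # real hardware would trigger congestion control
    return flagged

# 10 flows hit port 7 within 50 us -> incast flagged on port 7 only,
# while 4 slow background flows on port 3 stay below the threshold.
events = [(i * 5, f"flow{i}", 7) for i in range(10)]
events += [(i * 50, f"bg{i}", 3) for i in range(4)]
print(detect_incast(events))  # {7}
```

Recognizing the pattern early, as in the check above, is what lets the switch throttle senders before queues overflow rather than after.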

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

Manufacturers Anticipate Completion of NVIDIA's HBM3e Verification by 1Q24; HBM4 Expected to Launch in 2026

TrendForce's latest research into the HBM market indicates that NVIDIA plans to diversify its HBM suppliers for more robust and efficient supply chain management. Samsung's HBM3 (24 GB) is anticipated to complete verification with NVIDIA by December this year. The progress of HBM3e, as outlined in the timeline below, shows that Micron provided its 8hi (24 GB) samples to NVIDIA by the end of July, SK hynix in mid-August, and Samsung in early October.

Given the intricacy of the HBM verification process—estimated to take two quarters—TrendForce expects that some manufacturers might learn preliminary HBM3e results by the end of 2023. However, it's generally anticipated that major manufacturers will have definite results by 1Q24. Notably, the outcomes will influence NVIDIA's procurement decisions for 2024, as final evaluations are still underway.

Special Chinese Factories are Dismantling NVIDIA GeForce RTX 4090 Graphics Cards and Turning Them into AI-Friendly GPU Shapes

The recent U.S. government restrictions on AI hardware exports to China have significantly impacted several key semiconductor players, including NVIDIA, AMD, and Intel, restricting them from selling high-performance AI chips to China. This ban has notably affected NVIDIA's GeForce RTX 4090 gaming GPUs, pushing them out of mainland China due to their high computational capabilities. In anticipation of these restrictions, NVIDIA reportedly moved a substantial inventory of its AD102 GPUs and GeForce RTX 4090 graphics cards to China, as we reported earlier. This could have contributed to the global RTX 4090 shortage, driving the prices of these cards up to 2,000 USD. In an interesting turn of events, insiders on the Chinese Baidu forums have disclosed that specialized factories across China are repurposing these GPUs, which arrived before the ban, into AI solutions.

This transformation involves disassembling the gaming GPUs, removing the cooling systems and extracting the AD102 GPU and GDDR6X memory from the main PCBs. These components are then re-soldered onto a domestically manufactured "reference" PCB, better suited for AI applications, and equipped with dual-slot blower-style coolers designed for server environments. The third-party coolers that these GPUs come with are 3-4 slots in size, whereas the blower-style cooler is only two slots wide, and many of them can be placed in parallel in an AI server. After rigorous testing, these reconfigured RTX 4090 AI solutions are supplied to Chinese companies running AI workloads. This adaptation process has resulted in an influx of RTX 4090 coolers and bare PCBs into the Chinese reseller market at markedly low prices, given that the primary GPU and memory components have been removed.
Below, you can see the dismantling of AIB GPUs before getting turned into blower-style AI server-friendly graphics cards.

NVIDIA Experiences Strong Cloud AI Demand but Faces Challenges in China, with High-End AI Server Shipments Expected to Be Below 4% in 2024

NVIDIA's most recent FY3Q24 financial reports reveal record-high revenue from its data center segment, driven by escalating demand for AI servers from major North American CSPs. However, TrendForce points out that recent US government sanctions targeting China have impacted NVIDIA's business in the region. Despite strong shipments of NVIDIA's high-end GPUs—and the rapid introduction of compliant products such as the H20, L20, and L2—Chinese cloud operators are still in the testing phase, making substantial revenue contributions to NVIDIA unlikely in Q4. Gradual shipment increases are expected from the first quarter of 2024.

The US ban continues to influence China's foundry market as Chinese CSPs' high-end AI server shipments potentially drop below 4% next year
TrendForce reports that North American CSPs like Microsoft, Google, and AWS will remain key drivers of high-end AI servers (including those with NVIDIA, AMD, or other high-end ASIC chips) from 2023 to 2024. Their estimated shipment shares for 2024 are 24%, 18.6%, and 16.3%, respectively. Chinese CSPs such as ByteDance, Baidu, Alibaba, and Tencent (BBAT) are projected to have a combined shipment share of approximately 6.3% in 2023. However, this could decrease to less than 4% in 2024, considering the current and potential future impacts of the ban.

Dell Allegedly Prohibits Sales of High-End Radeon and Instinct MI GPUs in China

AMD's lineup of Radeon and Instinct GPUs, including the flagship RX 7900 XTX/XT, the professional-grade PRO W7900, and the upcoming Instinct MI300, are facing sales prohibitions in China, according to an alleged sales advisory guide from Dell. This restriction mirrors the earlier ban on NVIDIA's RTX 4090, underscoring the increasing export limitations U.S.-based companies face for high-end semiconductor products that could be repurposed for military and strategic applications. Notably, Dell's report lists several AMD Instinct accelerators, which are integral to data center infrastructure, and Radeon GPUs, which are widely used in PCs, indicating the broad impact of the advisory.

The ban includes discrete GPUs like AMD's Radeon RX 7900 XTX and 7900 XT, which, despite their data-center potential, may still be sold under specific "NEC" eligibility. This status allows continued sales in restricted regions, much as certain sales of NVIDIA's RTX 4090 continue. However, the process to secure NEC eligibility is lengthy, potentially leading to supply shortages and increased GPU prices—a trend already observed with the RX 7900 XTX in China, where it has become a high-end alternative in light of the RTX 4090's scarcity and inflated pricing. The Dell sales advisory also notes that sales of the aforementioned products are banned in 22 countries, including Russia, Iran, Iraq, and others listed below.