News Posts matching #accelerator

Sony PlayStation 5 Pro Specifications Confirmed, Console Arrives Before Holidays

Thanks to detailed information obtained by The Verge, we can today confirm previously leaked details as Sony gears up to unveil the highly anticipated PlayStation 5 Pro, codenamed "Trinity." According to insider reports, Sony is urging developers to optimize their games for the PS5 Pro, with a primary focus on enhancing ray tracing capabilities. The console is expected to feature an RDNA 3 GPU with 30 WGP running BVH8, capable of 33.5 TeraFLOPS of FP32 single-precision computing power, and a slightly quicker CPU running at 3.85 GHz, enabling it to render games with ray tracing enabled or achieve higher resolutions and frame rates in select titles. Sony anticipates GPU rendering on the PS5 Pro to be approximately 45 percent faster than the standard PlayStation 5. The PS5 Pro GPU will be larger and utilize faster system memory to bolster ray tracing performance, boasting up to three times the speed of the regular PS5.

Additionally, the console will employ a more powerful ray tracing architecture, backed by PlayStation Spectral Super Resolution (PSSR), allowing developers to leverage graphics features like ray tracing more extensively. To support this endeavor, Sony is providing developers with test kits, and all games submitted for certification from August onward must be compatible with the PS5 Pro. Insider Gaming, the first to report the full PS5 Pro specs, suggests a potential release during the 2024 holiday period. The PS5 Pro will also feature modifications for developers regarding system memory, with Sony increasing the memory bandwidth from 448 GB/s to 576 GB/s, enhancing efficiency for an even more immersive gaming experience. For AI processing, there is a custom AI accelerator capable of 300 8-bit INT8 TOPS and 67 16-bit FP16 TeraFLOPS, in addition to the ACV audio codec running up to 35% faster.

Intel Launches Gaudi 3 AI Accelerator: 70% Faster Training, 50% Faster Inference Compared to NVIDIA H100, Promises Better Efficiency Too

During the Vision 2024 event, Intel announced its latest Gaudi 3 AI accelerator, promising significant improvements over its predecessor. Intel claims the Gaudi 3 offers up to 70% improvement in training performance, 50% better inference, and 40% better efficiency than NVIDIA's H100 processors. The new AI accelerator is presented as a PCIe Gen 5 dual-slot add-in card with a 600 W TDP or an OAM module with 900 W. The PCIe card has the same peak 1,835 TeraFLOPS of FP8 performance as the OAM module despite a 300 W lower TDP. The lower TDP will likely result in lower sustained performance, but the matching peak figure confirms that the same silicon is used, just tuned to a lower frequency. The PCIe version works as a group of four per system, while the OAM HL-325L modules can be run in an eight-accelerator configuration per server. Built on TSMC's N5 5 nm node, the AI accelerator features 64 Tensor Cores, delivering double the FP8 and quadruple the FP16 performance of the previous generation Gaudi 2.

The Gaudi 3 AI chip comes with 128 GB of HBM2E with 3.7 TB/s of bandwidth and 24 Ethernet NICs running at 200 Gbps each, with dual 400 Gbps NICs used for scale-out. All of that is laid out on 10 tiles that make up the Gaudi 3 accelerator, which you can see pictured below. There is 96 MB of SRAM split between two compute tiles, which acts as a low-level cache that bridges data communication between Tensor Cores and HBM memory. Intel also announced support for the new performance-boosting standardized MXFP4 data format and is developing an AI NIC ASIC for Ultra Ethernet Consortium-compliant networking. The Gaudi 3 supports clusters of up to 8192 cards, built from 1024 nodes of eight accelerators each. It is on track for volume production in Q3, offering a cost-effective alternative to NVIDIA accelerators with the additional promise of a more open ecosystem. More information and a deeper dive can be found in the Gaudi 3 Whitepaper.
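The cluster figures quoted above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the per-accelerator Ethernet total is a raw theoretical sum of the quoted NIC speeds, not a measured or Intel-published throughput figure.

```python
# Figures quoted in the article above.
ACCELERATORS_PER_NODE = 8
MAX_NODES = 1024
NICS_PER_ACCELERATOR = 24
NIC_SPEED_GBPS = 200


def cluster_size(nodes: int, accelerators_per_node: int) -> int:
    """Total accelerators in a cluster made of identical nodes."""
    return nodes * accelerators_per_node


def aggregate_ethernet_gbps(nics: int, speed_gbps: int) -> int:
    """Raw (theoretical) sum of NIC line rates per accelerator, in Gbps."""
    return nics * speed_gbps


if __name__ == "__main__":
    # 1024 nodes x 8 accelerators matches the quoted 8192-card maximum.
    print(cluster_size(MAX_NODES, ACCELERATORS_PER_NODE))
    # 24 NICs x 200 Gbps gives the raw per-accelerator Ethernet total.
    print(aggregate_ethernet_gbps(NICS_PER_ACCELERATOR, NIC_SPEED_GBPS))
```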

Intel Unleashes Enterprise AI with Gaudi 3, AI Open Systems Strategy and New Customer Wins

At the Intel Vision 2024 customer and partner conference, Intel introduced the Intel Gaudi 3 accelerator to bring performance, openness and choice to enterprise generative AI (GenAI), and unveiled a suite of new open scalable systems, next-gen products and strategic collaborations to accelerate GenAI adoption. With only 10% of enterprises successfully moving GenAI projects into production last year, Intel's latest offerings address the challenges businesses face in scaling AI initiatives.

"Innovation is advancing at an unprecedented pace, all enabled by silicon - and every company is quickly becoming an AI company," said Intel CEO Pat Gelsinger. "Intel is bringing AI everywhere across the enterprise, from the PC to the data center to the edge. Our latest Gaudi, Xeon and Core Ultra platforms are delivering a cohesive set of flexible solutions tailored to meet the changing needs of our customers and partners and capitalize on the immense opportunities ahead."

Unannounced AMD Instinct MI388X Accelerator Pops Up in SEC Filing

AMD's Instinct family has welcomed a new addition—the MI388X AI accelerator—as discovered in a lengthy regulatory 10-K filing (submitted to the SEC). The document reveals that the unannounced SKU—along with the MI250, MI300X and MI300A integrated circuits—cannot be sold to Chinese customers due to updated US trade regulations (new requirements were issued around October 2023). Versal VC2802 and VE2802 FPGA products are also mentioned in the same section. Earlier this month, AMD's Chinese market-specific Instinct MI309 package was deemed to be too powerful for purpose by the US Department of Commerce.

AMD has not published anything about the Instinct MI388X's official specification, and technical details have not emerged via leaks. The "X" tag likely implies that it has been designed for AI and HPC applications, akin to the recently launched MI300X accelerator. The designation of a higher model number could (naturally) point to a potentially more potent spec sheet, although Tom's Hardware posits that MI388X is a semi-custom spinoff of an existing model.

Arm China Develops NPU Accelerator for AI, Targeting Domestic CPUs

Arm China is making strides in the AI accelerator market with its new neural processing unit (NPU) called Zhouyi. The company aims to integrate the NPU into low-cost domestic CPUs, potentially giving it an edge over competitors like AMD and Intel. Initially a part of Arm Holdings, which licensed IP in China, Arm China took on a new strategy of developing its own IP specifically for Chinese customers a few years ago. While the company does not develop high-performance general-purpose cores, its Zhouyi NPU could become a fundamental building block for affordable processors. A significant step forward is the upcoming addition of an open-source driver for Zhouyi to the Linux kernel. This will make the IP easy to program for software developers, increasing its appeal to chip designers.

Because the driver is open source, its integration in the Linux kernel gives developers some assurance that the Zhouyi NPU could be the first of many generations from Arm China. While Zhouyi may not directly compete with offerings from AMD or Intel, its potential for widespread adoption in millions of devices could help Arm China acquire local customers for its IP. The project, which began three years ago with a kernel-only driver, has since evolved into a full driver stack. There is even a development kit board called EAIDK310, powered by a Rockchip SoC and Zhouyi NPU, which is available on AliExpress and Amazon. The integration of AI accelerator technology into the Linux ecosystem is a significant development, though there is still work to be done. Nonetheless, Arm China's Zhouyi NPU and open-source driver are essential to making AI capabilities more accessible and widely available in the domestic Chinese market.

Taiwan Dominates Global AI Server Supply - Government Reportedly Estimates 90% Share

The Taiwanese Ministry of Economic Affairs (MOEA) managed to herd government representatives and leading Information and Communication Technology (ICT) industry figures together for an important meeting, according to DigiTimes Asia. The report suggests that the main topic of discussion focused on an anticipated growth of Taiwan's ICT industry—current market trends were analyzed, revealing that the nation absolutely dominates in the AI server segment. The MOEA has (allegedly) determined that Taiwan has shipped 90% of global AI server equipment—DigiTimes claims (based on insider info) that: "American brand vendors are expected to source their AI servers from Taiwanese partners." North American customers could be (presently) 100% reliant on supplies of Taiwanese-produced equipment—a scenario that potentially complicates ongoing international tensions.

The report posits that involved parties have formed plans to seize opportunities within an ever-growing global demand for AI hardware—a 90% market dominance is clearly not enough for some very ambitious industry bosses—although manufacturers will need to jump over several (rising) cost hurdles. Key components for AI servers are reported to be much more expensive than vanilla server parts—DigiTimes believes that AI processor/accelerator chips are priced close to ten times higher than general purpose server CPUs. Similar price hikes have reportedly affected AI-adjacent component supply chains—notably cooling, power supplies and passive parts. Taiwanese manufacturers have spread operations around the world, but industry watchdogs (largely) believe that the best stuff gets produced on home ground—global expansions are underway, perhaps inching closer to better balanced supply conditions.

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.

Samsung Prepares Mach-1 Chip to Rival NVIDIA in AI Inference

During its 55th annual shareholders' meeting, Samsung Electronics announced its entry into the AI processor market with the upcoming launch of its Mach-1 AI accelerator chips in early 2025. The South Korean tech giant revealed its plans to compete with established players like NVIDIA in the rapidly growing AI hardware sector. The Mach-1 generation of chips is an application-specific integrated circuit (ASIC) design equipped with LPDDR memory that is envisioned to excel in edge computing applications. While Samsung does not aim to directly rival NVIDIA's ultra-high-end AI solutions like the H100, B100, or B200, the company's strategy focuses on carving out a niche in the market by offering unique features and performance enhancements at the edge, where low power and efficient computing matter most.

According to SeDaily, the Mach-1 chips boast a groundbreaking feature that significantly reduces memory bandwidth requirements for inference to approximately 0.125x compared to existing designs, which is an 87.5% reduction. This innovation could give Samsung a competitive edge in terms of efficiency and cost-effectiveness. As the demand for AI-powered devices and services continues to soar, Samsung's foray into the AI chip market is expected to intensify competition and drive innovation in the industry. While NVIDIA currently holds a dominant position, Samsung's cutting-edge technology and access to advanced semiconductor manufacturing nodes could make it a formidable contender. The Mach-1 has been field-verified on an FPGA, while the final design is currently going through physical design for the SoC, which includes placement, routing, and other layout optimizations.
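The relationship between the quoted 0.125x ratio and the 87.5% reduction is simple to verify. The snippet below is a minimal sketch with a hypothetical helper name; it only restates the arithmetic and says nothing about how Samsung achieves the reduction.

```python
def reduction_percent(ratio: float) -> float:
    """Convert a 'new / old' ratio into a percentage reduction."""
    return (1.0 - ratio) * 100.0


if __name__ == "__main__":
    # A requirement of 0.125x the original is a 1/8 ratio.
    print(reduction_percent(0.125))
```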

Chinese Research Institute Utilizing "Banned" NVIDIA H100 AI GPUs

NVIDIA's freshly unveiled "Blackwell" B200 and GB200 AI GPUs will be getting plenty of coverage this year, but many organizations will be sticking with current or prior generation hardware. Team Green is in the process of shipping out compromised "Hopper" designs to customers in China, but the region's appetite for powerful AI-crunching hardware is growing. Last year's China-specific H800 design, and the older "Ampere" A800 chip were deemed too potent—new regulations prevented further sales. Recently, AMD's Instinct MI309 AI accelerator was considered "too powerful to gain unconditional approval from the US Department of Commerce." Natively-developed solutions are catching up with Western designs, but some institutions are not prepared to queue up for emerging technologies.

NVIDIA's new H20 AI GPU as well as Ada Lovelace-based L20 PCIe and L2 PCIe models are weakened enough to get a thumbs up from trade regulators, but likely not compelling enough for discerning clients. The Telegraph believes that NVIDIA's uncompromised H100 AI GPU is currently in use at several Chinese establishments—the report cites information presented within four academic papers published on arXiv, an open access science website. The Telegraph's news piece highlights one of the studies—it was: "co-authored by a researcher at 4paradigm, an AI company that was last year placed on an export control list by the US Commerce Department for attempting to acquire US technology to support China's military." Additionally, the Chinese Academy of Sciences appears to have conducted several AI-accelerated experiments, involving the solving of complex mathematical and logical problems. The article suggests that this research organization has acquired a very small batch of NVIDIA H100 GPUs (up to eight units). A "thriving black market" for high-end NVIDIA processors has emerged in the region—last autumn, the Center for a New American Security (CNAS) published an in-depth article about ongoing smuggling activities.

AI-Capable PCs Forecast to Make Up 40% of Global PC Shipments in 2025

Canalys' latest forecast predicts that an estimated 48 million AI-capable PCs will ship worldwide in 2024, representing 18% of total PC shipments. But this is just the start of a major market transition, with AI-capable PC shipments projected to surpass 100 million in 2025, 40% of all PC shipments. In 2028, Canalys expects vendors to ship 205 million AI-capable PCs, representing a staggering compound annual growth rate of 44% between 2024 and 2028.

These PCs, integrating dedicated AI accelerators, such as Neural Processing Units (NPUs), will unlock new capabilities for productivity, personalization and power efficiency, disrupting the PC market and delivering significant value gains to vendors and their partners.
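The compound annual growth rate quoted in the forecast can be reproduced from the 2024 and 2028 shipment figures. The sketch below uses the standard CAGR formula; the function name is hypothetical, and the inputs are the rounded figures from the article, so the result is only approximate.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # 48 million units in 2024 growing to 205 million in 2028 (four years).
    rate = cagr(48, 205, 4)
    print(f"{rate * 100:.1f}% per year")
```

With the rounded inputs, the result lands just under 44%, in line with the figure Canalys quotes.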

Sony PlayStation 5 Pro Details Emerge: Faster CPU, More System Bandwidth, and Better Audio

Sony is preparing to launch its next-generation PlayStation 5 Pro console in the Fall of 2024, right around the holidays. We previously covered a few graphics details about the console. However, today, we get more details about the CPU and the overall system, thanks to the exclusive information from Insider Gaming. Starting off, the sources indicate that PS5 Pro system memory will get a 28% bump in bandwidth, where the standard PS5 console had 448 GB/s, and the upgraded PS5 Pro will get 576 GB/s. Apparently, the memory system is more efficient, likely coming from an upgrade in memory from the GDDR6 SDRAM of the regular PS5. The next upgrade is the CPU, which has special modes for the main processor. The CPU uArch is likely the same, with clocks pushed to 3.85 GHz, resulting in a 10% frequency increase.

However, this is only achieved in the "High CPU Frequency Mode," which steals the SoC's power from the GPU and downclocks it slightly to allocate more power to the CPU in highly CPU-intense settings. The GPU we discussed here is an RDNA 3 IP with up to 45% faster graphics rendering. The ray tracing performance can be up to four times higher than the regular PS5, while the entire GPU delivers 33.5 TeraFLOPS of FP32 single-precision computing. This comes from 30 WGP running BVH8 shaders vs the 18 WGPs running BVH4 shaders on the regular PS5. There are PSSR upscalers present, and the GPU can output 8K resolution, which will come with future software updates. Last but not least, the AI front also has a custom AI accelerator capable of 300 8-bit INT8 TOPS and 67 16-bit FP16 TeraFLOPS. Audio codecs are getting some love, as well, with ACV running up to 35% faster.
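The percentage uplifts quoted above follow directly from the raw figures. This is a minimal sketch with a hypothetical helper name. The 3.5 GHz baseline for the standard PS5 CPU is an assumption taken from the console's public specifications, since the article only gives the new 3.85 GHz clock.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Memory bandwidth: 448 GB/s (PS5) to 576 GB/s (PS5 Pro).
    print(f"bandwidth: +{percent_increase(448, 576):.1f}%")
    # CPU clock: assumed 3.5 GHz baseline to 3.85 GHz in High CPU Frequency Mode.
    print(f"cpu clock: +{percent_increase(3.5, 3.85):.1f}%")
    # WGP count: 18 (PS5) to 30 (PS5 Pro).
    print(f"wgp count: +{percent_increase(18, 30):.1f}%")
```

The bandwidth uplift works out to roughly 28.6%, which the article gives as 28%.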

Next-Generation NVIDIA DGX Systems Could Launch Soon with Liquid Cooling

During the 2024 SIEPR Economic Summit, NVIDIA CEO Jensen Huang acknowledged that the company's next-generation DGX systems, designed for AI and high-performance computing workloads, will require liquid cooling due to their immense power consumption. Huang also hinted that these new systems are set to be released in the near future. The revelation comes as no surprise, given the increasing power of GPUs needed to satisfy AI and machine learning applications. As computational requirements continue to grow, so does the need for more powerful hardware. However, with great power comes great heat generation, necessitating advanced cooling solutions to maintain optimal performance and system stability. Liquid cooling has long been a staple in high-end computing systems, offering superior thermal management compared to traditional air cooling methods.

By implementing liquid cooling in the upcoming DGX systems, NVIDIA aims to push the boundaries of performance while ensuring the hardware remains reliable and efficient. Although Huang did not provide a specific release date for the new DGX systems, his statement suggests that they are on the horizon. Whether the next generation of DGX systems uses the current NVIDIA H200 or the upcoming Blackwell B100 GPU as its primary accelerator, a substantial performance uplift is expected. As the AI and high-performance computing landscape continues to evolve, NVIDIA's position continues to strengthen, and liquid-cooled systems will likely play a crucial role in shaping the future of these industries.

Marvell Announces Industry's First 2 nm Platform for Accelerated Infrastructure Silicon

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, is extending its collaboration with TSMC to develop the industry's first technology platform to produce 2 nm semiconductors optimized for accelerated infrastructure.

Behind the Marvell 2 nm platform is the company's industry-leading IP portfolio that covers the full spectrum of infrastructure requirements, including high-speed long-reach SerDes at speeds beyond 200 Gbps, processor subsystems, encryption engines, system-on-chip fabrics, chip-to-chip interconnects, and a variety of high-bandwidth physical layer interfaces for compute, memory, networking and storage architectures. These technologies will serve as the foundation for producing cloud-optimized custom compute accelerators, Ethernet switches, optical and copper interconnect digital signal processors, and other devices for powering AI clusters, cloud data centers and other accelerated infrastructure.

Intel Gaudi 2 AI Accelerator Powers Through Llama 2 Text Generation

Intel's "AI Everywhere" hype campaign has generated the most noise in mainstream and enterprise segments. Team Blue's Gaudi—a family of deep learning accelerators—does not hit the headlines all that often. Their current generation model, Gaudi 2, is overshadowed by Team Green and Red alternatives—according to Intel's official marketing spiel: "it performs competitively on deep learning training and inference, with up to 2.4x faster performance than NVIDIA A100." Habana, an Intel subsidiary, has been working on optimizing Large Language Model (LLM) inference on Gaudi 1 and 2 for a while—their co-operation with Hugging Face has produced impressive results, as of late February. Siddhant Jagtap, an Intel Data Scientist, has demonstrated: "how easy it is to generate text with the Llama 2 family of models (7b, 13b and 70b) using Optimum Habana and a custom pipeline class."

Jagtap reckons that folks will be able to: "run the models with just a few lines of code" on Gaudi 2 accelerators—additionally, Intel's hardware is capable of accepting single and multiple prompts. The custom pipeline class: "has been designed to offer great flexibility and ease of use. Moreover, it provides a high level of abstraction and performs end-to-end text-generation which involves pre-processing and post-processing." His article/blog outlines various prerequisites and methods of getting Llama 2 text generation up and running on Gaudi 2. Jagtap concluded that Habana/Intel has: "presented a custom text-generation pipeline on Intel Gaudi 2 AI accelerator that accepts single or multiple prompts as input. This pipeline offers great flexibility in terms of model size as well as parameters affecting text-generation quality. Furthermore, it is also very easy to use and to plug into your scripts, and is compatible with LangChain." Hugging Face reckons that Gaudi 2 delivers roughly twice the throughput speed of NVIDIA A100 80 GB in both training and inference scenarios. Intel has teased third generation Gaudi accelerators—industry watchdogs believe that next-gen solutions are designed to compete with Team Green H100 AI GPUs.

Tiny Corp. Builds AI Platform with Six AMD Radeon RX 7900 XTX GPUs

Tiny Corp., a neural network framework specialist, has revealed intimate details about the ongoing development and building of its "tinybox" system: "I don't think there's much value in secrecy. We have the parts to build 12 boxes and a case that's pretty close to final. Beating back all the PCI-E AER errors was hard, as anyone knows who has tried to build a system like this. Our BOM cost is around $10k, and we are selling them for $15k. We've put a year of engineering into this, it's a lot harder than it first seemed. You are welcome to believe me or not, but unless you are building in huge quantity, you are getting a great deal for $15k." The startup has taken the unusual step of integrating Team Red's current flagship gaming GPU into its AI-crunching platform. Tiny Corp. founder—George Hotz—has documented his past rejections of NVIDIA AI hardware on social media, but TinyBox will not be running AMD's latest Instinct MI300X accelerators. RDNA 3.0 is seemingly favored over CDNA 3.0—perhaps due to growing industry demand for enterprise-grade GPUs.

The rack-mounted 12U TinyBox build houses an AMD EPYC 7532 processor with 128 GB of system memory. Five 1 TB SN850X SSDs take care of storage duties (4 in RAID, 1 for boot), and an unoccupied 16x OCP 3.0 slot is designated for networking tasks. Two 1600 W PSUs provide the necessary electrical juice. The Tiny Corp. social media picture feed indicates that they have acquired a pile of XFX Speedster MERC310 RX 7900 XTX graphics cards—six units are hooked up inside of each TinyBox system. Hotz's young startup has ambitious plans: "The system image shipping with the box will be Ubuntu 22.04. It will only include tinygrad out of the box, but PyTorch and JAX support on AMD have come a long way, and your hardware is your hardware. We make money either way, you are welcome to buy it for any purpose. The goal of the tiny corp is to commoditize the petaflop, and we believe tinygrad is the best way to do it. Solving problems in software is cheaper than in hardware. tinygrad will elucidate the deep structure of what neural networks are. We have 583 preorders, and next week we'll place an order for 100 sets of parts. This is $1M in outlay. We will also ship five of the 12 boxes we have to a few early people who I've communicated with. For everyone else, they start shipping in April. The production line started running yesterday."
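The quoted figures hang together arithmetically, as the short sketch below shows. The helper names are hypothetical, and the numbers are the approximate ones from the quotes (a BOM of "around $10k" and a $15k price), so the results are rough estimates rather than Tiny Corp.'s actual accounting.

```python
BOM_COST_USD = 10_000      # "around $10k" per box, as quoted
SALE_PRICE_USD = 15_000    # quoted selling price per box
PARTS_SETS_ORDERED = 100   # size of the planned parts order


def parts_outlay(sets: int, bom_cost: int) -> int:
    """Up-front spend for a batch of part sets."""
    return sets * bom_cost


def gross_margin(price: int, bom_cost: int) -> int:
    """Price minus parts cost per box (ignores labor, shipping, support)."""
    return price - bom_cost


if __name__ == "__main__":
    # 100 sets at roughly $10k each matches the quoted "$1M in outlay".
    print(parts_outlay(PARTS_SETS_ORDERED, BOM_COST_USD))
    # Roughly $5k per box is left over after parts.
    print(gross_margin(SALE_PRICE_USD, BOM_COST_USD))
```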

AMD CTO Teases Memory Upgrades for Revised Instinct MI300-series Accelerators

Brett Simpson, Partner and Co-Founder of Arete Research, sat down with AMD CTO Mark Papermaster during the former's "Investor Webinar Conference." A transcript of the Arete + AMD question and answer session appeared online last week—the documented fireside chat concentrated mostly on "AI compute market" topics. Papermaster was asked about his company's competitive approach when taking on NVIDIA's very popular range of A100 and H100 AI GPUs, as well as the recently launched GH200 chip. The CTO did not reveal any specific pricing strategies—a "big picture" was painted instead: "I think what's important when you just step back is to look at total cost of ownership, not just one GPU, one accelerator, but total cost of ownership. But now when you also look at the macro, if there's not competition in the market, you're going to see not only a growth of the price of these devices due to the added content that they have, but you're -- without a check and balance, you're going to see very, very high margins, more than that could be sustained without a competitive environment."

Papermaster continued: "And what I think is very key with -- as AMD has brought competition market for these most powerful AI training and inference devices is you will see that check and balance. And we have a very innovative approach. We've been a leader in chiplet design. And so we have the right technology for the right purpose of the AI build-out that we do. We have, of course, a GPU accelerator. But there's many other circuitry associated with being able to scale and build out these large clusters, and we're very, very efficient in our design." Team Red started to ship its flagship accelerator, Instinct MI300X, to important customers at the start of 2024—Arete Research's Simpson asked about the possibility of follow-up models. In response, AMD's CTO referenced some recent history: "Well, I think the first thing that I'll highlight is what we did to arrive at this point, where we are a competitive force. We've been investing for years in building up our GPU road map to compete in both HPC and AI. We had a very, very strong harbor train that we've been on, but we had to build our muscle in the software enablement."

NVIDIA Prepared to Offer Custom Chip Designs to AI Clients

NVIDIA is reported to be setting up an AI-focused semi-custom chip design business unit, according to inside sources known to Reuters—it is believed that Team Green leadership is adapting to demands made by key data-center customers. Many companies are seeking cheaper alternatives, or have devised their own designs (budget/war chest permitting)—NVIDIA's current range of AI GPUs consists simply of off-the-shelf solutions. OpenAI has generated the most industry noise—their alleged early 2024 fund-raising pursuits have attracted plenty of speculative/kind-of-serious interest from notable semiconductor personalities.

Team Green is seemingly reacting to emerging market trends—Jensen Huang (CEO, president and co-founder) has hinted that NVIDIA custom chip designing services are on the cusp. Stephen Nellis—a Reuters reporter specializing in tech industry developments—has highlighted select NVIDIA boss quotes from an upcoming interview piece: "We're always open to do that. Usually, the customization, after some discussion, could fall into system reconfigurations or recompositions of systems." The Team Green chief teased that his engineering team is prepared to take on the challenge of meeting exact requests: "But if it's not possible to do that, we're more than happy to do a custom chip. And the benefit to the customer, as you can imagine, is really quite terrific. It allows them to extend our architecture with their know-how and their proprietary information." The rumored NVIDIA semi-custom chip design business unit could be introduced in an official capacity at next month's GTC 2024 Conference.

NVIDIA Expects Upcoming Blackwell GPU Generation to be Capacity-Constrained

NVIDIA is anticipating supply issues for its upcoming Blackwell GPUs, which are expected to significantly improve artificial intelligence compute performance. "We expect our next-generation products to be supply constrained as demand far exceeds supply," said Colette Kress, NVIDIA's chief financial officer, during a recent earnings call. This prediction of scarcity comes just days after an analyst noted much shorter lead times for NVIDIA's current flagship Hopper-based H100 GPUs tailored to AI and high-performance computing. The eagerly anticipated Blackwell architecture and B100 GPUs built on it promise major leaps in capability—likely spurring NVIDIA's existing customers to place pre-orders already. With skyrocketing demand in the red-hot AI compute market, NVIDIA appears poised to capitalize on the insatiable appetite for ever-greater processing power.

However, the scarcity of NVIDIA's products may present an excellent opportunity for significant rivals like AMD and Intel. If both companies can offer a product that beats NVIDIA's current H100 and provide a suitable software stack, customers may be willing to jump to their offerings rather than wait through the anticipated long lead times. Intel is preparing the next-generation Gaudi 3 and working on the Falcon Shores accelerator for AI and HPC. AMD is shipping its Instinct MI300 accelerator, a highly competitive product, while already working on the MI400 generation. It remains to be seen if AI companies will begin adopting non-NVIDIA hardware or if they will remain loyal customers and accept the longer lead times of the new Blackwell generation. However, capacity constraints should only be a problem at launch, with availability improving from quarter to quarter. As TSMC improves CoWoS packaging capacity and 3 nm production, NVIDIA's allocation of the 3 nm wafers will likely improve over time as the company moves its priority from H100 to B100.

Acer Launches Swift Series Laptops Powered by AMD Ryzen 8040 Series

Acer today announced new models of the Acer Swift Edge 16 and Acer Swift Go 14 laptops, blending AI power and innovative features in stylish thin and light devices. The latest additions to the Swift lineup feature AMD Ryzen 8040 Series processors with up to AMD Radeon 780M Graphics and equipped with Ryzen AI for versatile performance and support for Acer's AI-powered capabilities such as Acer PurifiedVoice, Acer PurifiedView, and the new Acer LiveArt photo-editing feature. Intuitive control and seamless navigation on the AI PCs are made possible thanks to the AcerSense application and Copilot in Windows with instant access through dedicated keys. Users can also appreciate clear images and rich colors when working or streaming through the OLED laptops' displays, as well as Microsoft Pluton technology, enabled by default, to help secure devices, personal data, and encryption keys.

Harnessing the Power of AI with AMD Ryzen 8040 Series Processors
Designed to deliver premium AI experiences and reliable performance for everyday productivity, the Swift Edge 16 and Swift Go 14 laptops are powered by AMD Ryzen 8040 Series processors with Ryzen AI technology built in. AMD's latest processors enable the efficient distribution of AI workloads between accelerators on the NPU, GPU, and CPU, to advance user experiences with AI technology on the devices. They leverage AMD's "Zen 4" processor architecture with up to eight cores and deliver up to 16 threads of processing power, so creative professionals and mainstream users can expect fast, power-efficient computing and longer battery life on the ultrathin Swift laptops.

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools, and capabilities to program hybrid CPU, GPU and QPU systems - as well as the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.

Financial Analyst Outs AMD Instinct MI300X "Projected" Pricing

AMD's December 2023 launch of new Instinct series accelerators has generated a lot of tech news buzz and excitement within the financial world, but not many folks are privy to Team Red's MSRP for the CDNA 3.0 powered MI300X and MI300A models. A Citi report has pulled back the curtain, albeit with "projected" figures: an inside source claims that Microsoft has purchased the Instinct MI300X 192 GB model for ~$10,000 apiece. North American enterprise customers appear to have taken delivery of the latest MI300 products around mid-January, and inevitably, top secret information has leaked out to news investigators. SeekingAlpha's article (based on Citi's findings) alleges that the Microsoft data center division is AMD's top buyer of MI300X hardware, and that GPT-4 is reportedly up and running on these brand new accelerators.

The leakers claim that businesses further down the (AI and HPC) food chain are having to shell out $15,000 per MI300X unit, but this is a bargain when compared to NVIDIA's closest competing package: the venerable H100 SXM5 80 GB professional card. Team Green, similarly, does not reveal its enterprise pricing to the wider public. Tom's Hardware has kept tabs on H100 insider info and market leaks: "over the recent quarters, we have seen NVIDIA's H100 80 GB HBM2E add-in-card available for $30,000, $40,000, and even much more at eBay. Meanwhile, the more powerful H100 80 GB SXM with 80 GB of HBM3 memory tends to cost more than an H100 80 GB AIB." Citi's projection has Team Green charging up to four times more for its H100 product, when compared to Team Red MI300X pricing. NVIDIA's dominant AI GPU market position could be challenged by cheaper yet still very performant alternatives. Additionally, chip shortages have caused Jensen & Co. to step outside their comfort zone. Tom's Hardware reached out to AMD for comment on the Citi pricing claims, but a company representative declined this invitation.

AMD Instinct MI300X Released at Opportune Moment. NVIDIA AI GPUs in Short Supply

LaminiAI appeared to be one of the first customers to receive an initial shipment of AMD's Instinct MI300X accelerators, as disclosed by their CEO posting about functioning hardware on social media late last week. A recent Taiwan Economic Daily article states that the "MI300X is rumored to have begun supply." We are not sure why they have adopted a semi-secretive tone in their news piece, but a couple of anonymous sources are cited. A person familiar with supply chains in Taiwan divulged that: "(they have) been receiving AMD MI300X chips one after another...due to the huge shortage of NVIDIA AI chips, the arrival of new AMD products is really a timely rainfall." Favorable industry analysis (from earlier this month) has placed Team Red in a position of strength, due to growing interest in their very performant flagship AI accelerator.

The secrecy seems to lie in Team Red's negotiation strategies in Taiwan—the news piece alleges that big manufacturers in the region have been courted. AMD has been aggressive in a push to: "cooperate and seize AI business opportunities, with GIGABYTE taking the lead and attracting the most attention. Not only was GIGABYTE the first to obtain a partnership with AMD's MI300A chip, which had previously been mass-produced, but GIGABYTE was also one of the few Taiwanese manufacturers included in AMD's first batch of MI300X partners." GIGABYTE is expected to release two new "G593" product lines of server hardware later this year, based on combinations of AMD's Instinct MI300X accelerator and EPYC 9004 series processors.

Google Faces Potential Billion-Dollar Damages in TPU Patent Dispute

Tech giant Google is embroiled in a high-stakes legal battle over the alleged infringement of patents related to its Tensor Processing Units (TPUs), custom AI accelerator chips used to power machine learning applications. Massachusetts-based startup Singular Computing has accused Google of incorporating architectures described in several of its patents into the design of the TPU without permission. The disputed patents, first filed in 2009, outline computer architectures optimized for executing a high volume of low-precision calculations per cycle - an approach well-suited for neural network-based AI. In a 2019 lawsuit, Singular argues that Google knowingly infringed on these patents in developing its TPU v2 and TPU v3 chips introduced in 2017 and 2018. Singular Computing is seeking between $1.6 billion and $5.19 billion in damages from Google.

Google denies these claims, stating that its TPUs were independently developed over many years. The company is currently appealing to have Singular's patents invalidated, which would undermine the infringement allegations. The high-profile case highlights mounting legal tensions as tech giants race to dominate the burgeoning field of AI hardware. With billions in potential damages at stake, the outcome could have major implications for the competitive landscape in cloud-based machine learning services. As both sides prepare for court, the dispute underscores the massive investments tech leaders like Google make to integrate specialized AI accelerators into their cloud infrastructures. Dominance in this sphere is a crucial strategic advantage as more industries embrace data-hungry neural network applications.

Update 17:25 UTC: According to Reuters, Google and Singular Computing have settled the case with details remaining private for the time being.

HBM Industry Revenue Could Double by 2025 - Growth Driven by Next-gen AI GPUs Cited

Samsung, SK hynix, and Micron are considered to be the top manufacturing sources of High Bandwidth Memory (HBM). Demand for the HBM3 and HBM3E standards is rising, due to a widespread deployment of GPUs and accelerators by generative AI companies. Taiwan's Commercial Times reports that there is an ongoing shortage of HBM components, but this presents a growth opportunity for smaller manufacturers in the region. Naturally, the big name producers are expected to dive in head first with the development of next generation models. The aforementioned financial news article cites research conducted by Gartner, which predicts that the HBM market will hit an all-time high of $4.976 billion (USD) by 2025.

This estimate is more than double the projected revenues (just over $2 billion) generated by the HBM market in 2023, as the explosive growth of generative AI applications has "boosted" demand for the most performant memory standards. The Commercial Times report states that SK hynix is the current HBM3E leader, with Micron and Samsung trailing behind. Industry experts believe that stragglers will need to "expand HBM production capacity" in order to stay competitive. SK hynix has partnered with NVIDIA: the GH200 Grace Hopper platform, unveiled last summer, is outfitted with the South Korean firm's HBM3E parts. In a similar timeframe, Samsung was named as AMD's preferred supplier of HBM3 packages, as featured within the recently launched Instinct MI300X accelerator. NVIDIA's HBM3E deal with SK hynix is believed to extend to the internal makeup of Blackwell GB100 data-center GPUs. The HBM4 memory standard is expected to be the next major battleground for the industry's hardest hitters.

OpenAI CEO Reportedly Seeking Funds for Purpose-built Chip Foundries

OpenAI CEO Sam Altman had a turbulent career moment in winter 2023, but appears to be going all in on his company's future interests. A Bloomberg report suggests that the tech visionary has launched a major fundraising initiative for the construction of OpenAI-specific semiconductor production plants. The AI evangelist reckons that his industry will become prevalent enough to demand a dedicated network of manufacturing facilities, and the U.S. based artificial intelligence (AI) research organization is (reportedly) exploring custom AI chip designs. Proprietary AI-focused GPUs and accelerators are not novelties at this stage. Many top tech companies rely on NVIDIA solutions, but are keen to deploy custom-built hardware in the near future.

OpenAI's popular ChatGPT system is reliant on NVIDIA H100 and A100 GPUs, but tailor-made alternatives seem to be the desired route for Altman & Co. The "on their own terms" pathway seemingly skips an expected/traditional chip manufacturing process, since the big foundries could struggle to keep up with demand for AI-oriented silicon. G42 (an Abu Dhabi-based AI development holding company) and SoftBank Group are mentioned as prime investment partners in OpenAI's fledgling scheme. Bloomberg proposes that Altman's team is negotiating an $8 to $10 billion deal with top brass at G42. OpenAI's planned creation of its own foundry network is certainly a lofty and costly goal. The report does not specify whether existing facilities will be purchased and overhauled, or new plants will be constructed entirely from scratch.
Apr 30th, 2024 18:48 EDT
