News Posts matching #CUDA


Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, the company's Senior Vice President and Chief Software Officer. Previously, we reported on AMD's transition to become a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan, spread over a three-to-five-year timeframe, is to improve its software ecosystem and accelerate hardware development, allowing it to launch new products more frequently and react to changes in software demand. AMD found that opening new design centers in Serbia would greatly help these expansion efforts.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is himself an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering at Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.
Here is the full interview:

AMD to Unify Gaming "RDNA" and Data Center "CDNA" into "UDNA": Singular GPU Architecture Similar to NVIDIA's CUDA

According to new information from Tom's Hardware, AMD has announced plans to unify its consumer-focused gaming RDNA and data center CDNA graphics architectures into a single, unified design called "UDNA." The announcement was made by AMD's Jack Huynh, Senior Vice President and General Manager of the Computing and Graphics Business Group, at IFA 2024 in Berlin. The goal of the new UDNA architecture is to provide a single focus point for developers, so that an optimized application can run on a consumer-grade GPU like the Radeon RX 7900 XTX as well as a high-end data center GPU like the Instinct MI300. This will create a unification similar to NVIDIA's CUDA, which enables CUDA-focused developers to run applications on everything from laptops to data centers.
Jack Huynh: "So, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It's forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving."

NVIDIA Shifts Gears: Open-Source Linux GPU Drivers Take Center Stage

Just a few months after hiring Ben Skeggs, a lead maintainer of the open-source NVIDIA GPU driver for the Linux kernel, NVIDIA has announced a complete transition to open-source GPU kernel modules in its upcoming R560 driver release for Linux. This decision comes two years after the company's initial foray into open-source territory with the R515 driver in May 2022. At the time, the open modules focused on data center compute GPUs, while GeForce and workstation GPU support remained in an alpha state. Now, after extensive development and optimization, NVIDIA reports that its open-source modules have achieved performance parity with, and in some cases surpassed, their closed-source counterparts. This transition brings a host of new capabilities, including heterogeneous memory management support, confidential computing features, and compatibility with the coherent memory architecture of NVIDIA's Grace platform.

The move to open source is expected to foster greater collaboration within the Linux ecosystem and potentially lead to faster bug fixes and feature improvements. However, not all GPUs will be compatible with the new open-source modules. While cutting-edge platforms like NVIDIA Grace Hopper and Blackwell will require open-source drivers, older GPUs from the Maxwell, Pascal, or Volta architectures must stick with proprietary drivers. NVIDIA has developed a detection helper script to guide driver selection for users who are unsure about compatibility. The shift also brings changes to NVIDIA's installation processes. The default driver version for most installation methods will now be the open-source variant. This affects package managers with the CUDA meta-package, run-file installations, and even Windows Subsystem for Linux.

AMD is Becoming a Software Company. Here's the Plan

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable to share its vision for the future of the company and to get our feedback. On site were prominent AMD leaders, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making big changes to how it approaches technology, shifting its focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.

New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations. Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and infer complex AI models on Windows natively. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs. These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.

NVIDIA Blackwell Platform Pushes the Boundaries of Scientific Computing

Quantum computing. Drug discovery. Fusion energy. Scientific computing and physics-based simulations are poised to make giant steps across domains that benefit humanity as advances in accelerated computing and AI drive the world's next big breakthroughs. NVIDIA unveiled at GTC in March the NVIDIA Blackwell platform, which promises generative AI on trillion-parameter large language models (LLMs) at up to 25x less cost and energy consumption than the NVIDIA Hopper architecture.

Blackwell has powerful implications for AI workloads, and its technology capabilities can also help deliver discoveries across all types of scientific computing applications, including traditional numerical simulation. By reducing energy costs, accelerated computing and AI drive sustainable computing, and many scientific computing applications already benefit: weather can be simulated at 200x lower cost and with 300x less energy, while digital twin simulations run at 65x lower cost and with 58x less energy consumption than traditional CPU-based systems.

NVIDIA Accelerates Quantum Computing Centers Worldwide With CUDA-Q Platform

NVIDIA today announced that it will accelerate quantum computing efforts at national supercomputing centers around the world with the open-source NVIDIA CUDA-Q platform. Supercomputing sites in Germany, Japan and Poland will use the platform to power the quantum processing units (QPUs) inside their NVIDIA-accelerated high-performance computing systems.

QPUs are the brains of quantum computers; they use the behavior of particles like electrons or photons to calculate differently from traditional processors, with the potential to make certain types of calculations faster. Germany's Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich is installing a QPU built by IQM Quantum Computers as a complement to its JUPITER supercomputer, supercharged by the NVIDIA GH200 Grace Hopper Superchip. The ABCI-Q supercomputer, located at the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, is designed to advance the nation's quantum computing initiative. Powered by the NVIDIA Hopper architecture, the system will add a QPU from QuEra. Poland's Poznan Supercomputing and Networking Center (PSNC) has recently installed two photonic QPUs, built by ORCA Computing, connected to a new supercomputer partition accelerated by NVIDIA Hopper.

AIO Workstation Combines 128-Core Arm Processor and Four NVIDIA GPUs Totaling 28,416 CUDA Cores

All-in-one computers are traditionally seen as lower-powered alternatives to desktop workstations. However, a new offering from Alafia AI, a startup focused on medical imaging appliances, aims to shatter that perception. The company's upcoming Alafia Aivas SuperWorkstation packs serious hardware muscle, demonstrating that all-in-one systems can match the performance of their more modular counterparts. At the heart of the Aivas SuperWorkstation lies a 128-core Ampere Altra processor running at a 3.0 GHz clock speed. This CPU is complemented by not one but three NVIDIA L4 GPUs for compute, plus a single NVIDIA RTX 4000 Ada GPU for video output, delivering a combined 28,416 CUDA cores for accelerated parallel computing tasks. The system doesn't skimp on other components, either. It features a 4K touch display with up to 360 nits of brightness, an extensive 2 TB of DDR4 RAM, and storage options up to an 8 TB solid-state drive. This combination of cutting-edge CPU, GPU, memory, and storage is squarely aimed at the demands of medical imaging and AI development workloads.
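The quoted total is easy to sanity-check. Assuming the commonly listed per-card specifications (7,424 CUDA cores for the NVIDIA L4 and 6,144 for the RTX 4000 Ada Generation), the advertised figure adds up:

```python
# Sanity check of the Aivas SuperWorkstation's combined CUDA core count,
# assuming the commonly listed per-card core counts.
L4_CORES = 7424            # NVIDIA L4 (Ada Lovelace)
RTX_4000_ADA_CORES = 6144  # NVIDIA RTX 4000 Ada Generation

total = 3 * L4_CORES + RTX_4000_ADA_CORES
print(total)  # 28416, matching the advertised figure
```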

The all-in-one form factor packs this incredible hardware into a sleek, purposefully designed clinical research appliance. While initially targeting software developers, Alafia AI hopes that institutions that optimize their applications for the Arm architecture can eventually deploy the Aivas SuperWorkstation for production medical imaging workloads. The company is aiming for application integration in Q3 2024 and full ecosystem device integration by Q4 2024. With this powerful new offering, Alafia AI is challenging long-held assumptions about the performance limitations of all-in-one systems. The Aivas SuperWorkstation demonstrates that the right hardware choices can transform these compact form factors into true powerhouse workstations. With the combined output of three NVIDIA L4 compute GPUs alongside an RTX 4000 Ada graphics card, the AIO is more powerful than some high-end desktop workstations.

NVIDIA CEO Reiterates Solid Partnership with TSMC

One key takeaway from the ongoing GTC is that NVIDIA's AI empire has taken shape through strong partnerships with TSMC and other Taiwanese manufacturers, such as the major server ODMs.

According to a news report from the technology-focused media outlet DIGITIMES Asia, during his keynote at GTC on March 18, Huang underscored his company's partnerships with TSMC, as well as the supply chain in Taiwan. Speaking to the press later, Huang said NVIDIA will have very strong demand for CoWoS, the advanced packaging service TSMC offers.

Jensen Huang Celebrates Rise of Portable AI Workstations

2024 will be the year generative AI gets personal, the CEOs of NVIDIA and HP said today in a fireside chat, unveiling new laptops that can build, test and run large language models. "This is a renaissance of the personal computer," said NVIDIA founder and CEO Jensen Huang at HP Amplify, a gathering in Las Vegas of about 1,500 resellers and distributors. "The work of creators, designers and data scientists is going to be revolutionized by these new workstations."

Greater Speed and Security
"AI is the biggest thing to come to the PC in decades," said HP's Enrique Lores, in the runup to the announcement of what his company billed as "the industry's largest portfolio of AI PCs and workstations." Compared to running their AI work in the cloud, the new systems will provide increased speed and security while reducing costs and energy, Lores said in a keynote at the event. New HP ZBooks provide a portfolio of mobile AI workstations powered by a full range of NVIDIA RTX Ada Generation GPUs. Entry-level systems with the NVIDIA RTX 500 Ada Generation Laptop GPU let users run generative AI apps and tools wherever they go. High-end models pack the RTX 5000 to deliver up to 682 TOPS, so they can create and run LLMs locally, using retrieval-augmented generation (RAG) to connect to their content for results that are both personalized and private.

NVIDIA and HP Supercharge Data Science and Generative AI on Workstations

NVIDIA and HP Inc. today announced that NVIDIA CUDA-X data processing libraries will be integrated with HP AI workstation solutions to turbocharge the data preparation and processing work that forms the foundation of generative AI development.

Built on the NVIDIA CUDA compute platform, CUDA-X libraries speed data processing for a broad range of data types, including tables, text, images and video. They include the NVIDIA RAPIDS cuDF library, which accelerates the work of the nearly 10 million data scientists using pandas software by up to 110x using an NVIDIA RTX 6000 Ada Generation GPU instead of a CPU-only system, without requiring any code changes.
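As a sketch of the "without requiring any code changes" claim: RAPIDS exposes a pandas accelerator mode that is enabled at invocation time rather than inside the script. Assuming a CUDA-capable system with cuDF installed and a hypothetical existing script named analysis.py, the usage looks like this:

```shell
# Run an unmodified pandas script under the cuDF pandas accelerator
# (requires cuDF installed and an NVIDIA GPU; "analysis.py" is hypothetical).
python -m cudf.pandas analysis.py

# In Jupyter/IPython, the equivalent is loading the extension first:
#   %load_ext cudf.pandas
#   import pandas as pd   # pandas calls now dispatch to the GPU where possible
```

Operations without a GPU implementation fall back to CPU pandas, which is what makes the zero-code-change model practical.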

NVIDIA Cracks Down on CUDA Translation Layers, Changes Licensing Terms

NVIDIA's Compute Unified Device Architecture (CUDA) has long been the de facto standard programming interface for developing GPU-accelerated software. Over the years, NVIDIA has built an entire ecosystem around CUDA, cementing its position as the leading GPU computing and AI manufacturer. However, rivals AMD and Intel have been trying to make inroads with their own open API offerings—ROCm from AMD and oneAPI from Intel. The idea was that developers could more easily run existing CUDA code on non-NVIDIA GPUs by providing open access through translation layers. Developers had created projects like ZLUDA to translate CUDA to ROCm, and Intel's CUDA to SYCL aimed to do the same for oneAPI. However, with the release of CUDA 11.5, NVIDIA appears to have cracked down on these translation efforts by modifying its terms of use, according to developer Longhorn on X.

"You may not reverse engineer, decompile or disassemble any portion of the output generated using Software elements for the purpose of translating such output artifacts to target a non-NVIDIA platform," says the CUDA 11.5 terms of service document. The changes don't seem to be technical in nature but rather licensing restrictions. The impact remains to be seen, depending on how much code still requires translation versus running natively on each vendor's API. While CUDA gave NVIDIA a unique selling point, its supremacy has diminished as more libraries work across hardware. Still, the move could slow the adoption of AMD and Intel offerings by making it harder for developers to port existing CUDA applications. As GPU-accelerated computing grows in fields like AI, the battle for developer mindshare between NVIDIA, AMD, and Intel is heating up.

NVIDIA Announces RTX 500 and 1000 Professional Ada Generation Laptop GPUs

With generative AI and hybrid work environments becoming the new standard, nearly every professional, whether a content creator, researcher or engineer, needs a powerful, AI-accelerated laptop to help tackle their industry's toughest challenges - even on the go. The new NVIDIA RTX 500 and 1000 Ada Generation Laptop GPUs will be available in new, highly portable mobile workstations, expanding the NVIDIA Ada Lovelace architecture-based lineup, which includes the RTX 2000, 3000, 3500, 4000 and 5000 Ada Generation Laptop GPUs.

AI is rapidly being adopted to drive efficiencies across professional design and content creation workflows and everyday productivity applications, underscoring the importance of having powerful local AI acceleration and sufficient processing power in systems. The next generation of mobile workstations with Ada Generation GPUs, including the RTX 500 and 1000 GPUs, will include both a neural processing unit (NPU), a component of the CPU, and an NVIDIA RTX GPU, which includes Tensor Cores for AI processing. The NPU helps offload light AI tasks, while the GPU provides up to an additional 682 TOPS of AI performance for more demanding day-to-day AI workflows.

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools and capabilities to program hybrid CPU, GPU and QPU systems - as well as the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.
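For readers unfamiliar with what these simulators actually compute: a quantum simulator tracks a statevector of 2^n complex amplitudes for n qubits, and applying gates means transforming that vector - exactly the linear-algebra workload cuQuantum accelerates on GPUs. A toy, stdlib-only sketch (illustrative only, not the CUDA-Q API) preparing a two-qubit Bell state:

```python
import math

# Toy statevector simulation (illustrative only, not the CUDA-Q API).
# Amplitude order: |00>, |01>, |10>, |11>.
state = [1.0, 0.0, 0.0, 0.0]  # start in |00>

def hadamard_q0(s):
    # Apply H to qubit 0 (the left qubit): mixes the |0x> and |1x> pairs.
    h = 1 / math.sqrt(2)
    return [h * (s[0] + s[2]), h * (s[1] + s[3]),
            h * (s[0] - s[2]), h * (s[1] - s[3])]

def cnot_q0_q1(s):
    # CNOT with qubit 0 as control: swaps the |10> and |11> amplitudes.
    return [s[0], s[1], s[3], s[2]]

state = cnot_q0_q1(hadamard_q0(state))
print(state)  # ~[0.707, 0, 0, 0.707]: the entangled Bell state
```

Because the vector doubles in size with every added qubit, this brute-force approach is what makes GPU acceleration (and eventually real QPUs) necessary at scale.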

AMD Develops ROCm-based Solution to Run Unmodified NVIDIA CUDA Binaries on AMD Graphics

AMD has quietly funded an effort over the past two years to enable binary compatibility for NVIDIA CUDA applications on its ROCm stack. This allows CUDA software to run on AMD Radeon GPUs without adapting the source code. The project responsible is ZLUDA, which was initially developed to provide CUDA support on Intel graphics. The developer behind ZLUDA, Andrzej Janik, was contracted by AMD in 2022 to adapt his project for use on Radeon GPUs with HIP/ROCm. He spent two years bringing functional CUDA support to AMD's platform, allowing many real-world CUDA workloads to run without modification. AMD decided not to productize this effort for unknown reasons but did open-source it once funding ended, per their agreement. Over at Phoronix, AMD's ZLUDA implementation was tested across a wide variety of benchmarks.

Benchmarks found that proprietary CUDA renderers and software worked on Radeon GPUs out of the box with the drop-in ZLUDA library replacements. CUDA-optimized Blender 4.0 rendering now runs faster on AMD Radeon GPUs than the native ROCm/HIP port, reducing render times by around 10-20%, depending on the scene. The implementation is surprisingly robust, considering it was a single-developer project. However, there are some limitations: OptiX and PTX assembly code are not yet fully supported. Overall, though, testing showed very promising results. Compared to the generic OpenCL runtimes in Geekbench, CUDA-optimized binaries produce up to 75% better results. With the ZLUDA libraries handling API translation, unmodified CUDA binaries can now run directly on top of ROCm and Radeon GPUs. Strangely, the ZLUDA port targets AMD ROCm 5.7, not the newest 6.x versions. Only time will tell if AMD continues investing in this approach to simplify porting of CUDA software. However, the open-sourced project now enables anyone to contribute and help improve compatibility. For a complete review, check out Phoronix's tests.
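For context on how a "drop-in" replacement like this works: ZLUDA ships a library that exports the CUDA driver API, so pointing the dynamic linker at it is typically all that is needed. A minimal sketch, assuming a Linux build of ZLUDA unpacked to /opt/zluda (the path and binary name are hypothetical):

```shell
# Hypothetical paths: run an unmodified CUDA binary on a Radeon GPU by
# letting the dynamic linker resolve the CUDA libraries against ZLUDA's
# ROCm-backed implementations instead of NVIDIA's.
LD_LIBRARY_PATH=/opt/zluda ./my_cuda_app

# On Windows, ZLUDA instead provides a launcher that performs the
# equivalent DLL redirection for the CUDA runtime libraries.
```

The application never knows the difference - its CUDA calls are translated to HIP/ROCm underneath, which is why no source changes or recompilation are required.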

Intel Open Image Denoise v2.2 Adds Metal Support & AArch64 Improvements

An Open Image Denoise 2.2 release candidate was released earlier today, as discovered by Phoronix's founder and principal writer, Michael Larabel. Intel's dedicated website has not been updated with any new documentation or changelogs (at the time of writing), but a GitHub release page shows all of the crucial information. Team Blue's open-source oneAPI has been kept up to date with the latest technologies, and is not limited to Intel's stable of Xe-LP, Xe-HPG and Xe-HPC components; the Phoronix article highlights updated support on competing platforms. The v2.2 preview adds support for Meteor Lake's integrated Arc graphics solution, along with additional "denoising quality enhancements and other improvements."

Non-Intel platform improvements include updates for Apple's M-series chipsets, AArch64 processors, and NVIDIA CUDA. OIDn 2.2-rc "adds Metal device support for Apple Silicon GPUs on recent versions of macOS. OIDn has already been supporting ARM64/AArch64 for Apple Silicon CPUs while now Open Image Denoise has extended that AArch64 support to work on Windows and Linux too. There is better performance in general for Open Image Denoise on CPUs with this forthcoming release." The changelog also highlights a general performance improvement across processors, and a fix for a crash "when releasing a buffer after releasing the device."

Aetina Introduces New MXM GPUs Powered by NVIDIA Ada Lovelace for Enhanced AI Capabilities at the Edge

Aetina, a leading global edge AI solution provider, announces the release of its new embedded MXM GPU series utilizing the NVIDIA Ada Lovelace architecture: the MX2000A-VP, MX3500A-SP, and MX5000A-WP. Designed for real-time ray tracing and AI-based neural graphics, this series significantly enhances GPU performance, delivering outstanding performance for gaming, professional graphics, AI, and compute workloads. It provides the ultimate AI processing and computing capabilities for applications in smart healthcare, autonomous machines, smart manufacturing, and commercial gaming.

The global GPU (graphics processing unit) market is expected to achieve a 34.4% compound annual growth rate from 2023 to 2028, with advancements in the artificial intelligence (AI) industry being a key driver of this growth. As the trend of AI applications expands from the cloud to edge devices, many businesses are seeking to maximize AI computing performance within minimal devices due to space constraints in deployment environments. Aetina's latest embedded MXM modules - MX2000A-VP, MX3500A-SP, and MX5000A-WP, adopting the NVIDIA Ada Lovelace architecture, not only make significant breakthroughs in performance and energy efficiency but also enhance the performance of ray tracing and AI-based neural graphics. The modules, with their compact design, efficiently save space, thereby opening up more possibilities for edge AI devices.

NVIDIA GeForce RTX 4080 SUPER GPUs Pop Up in Geekbench Browser

We are well aware that NVIDIA GeForce RTX 4080 SUPER graphics cards are next up on the review table (January 31) - TPU's W1zzard has so far toiled away on getting his evaluations published on time for options further down the Ada Lovelace SUPER food chain. This process was interrupted briefly by the appearance of custom Radeon RX 7600 XT models, but attention soon returned to another batch of GeForce RTX 4070 Ti SUPER cards. Reviewers are already toying around with driver-enabled GeForce RTX 4080 SUPER sample units - under strict confidentiality conditions - but the occasional leak is expected to happen. The appropriately named Benchleaks social media account has kept track of emerging test results.

The Geekbench Browser database was updated earlier today with early GeForce RTX 4080 SUPER GPU test results - one entry highlighted by Benchleaks provides a quick look at the card's prowess in three of Geekbench 5.1's graphics API trials: Vulkan, CUDA and OpenCL. VideoCardz points out that all of the scores could be fundamentally flawed; in particular the Vulkan result of 100378 points - the regular (non-SUPER) GeForce RTX 4080 GPU can achieve almost double that figure in Geekbench 6. The SUPER's other results included a Geekbench 5 CUDA score of 309554 and 264806 points in OpenCL. A late-morning entrant looks to be hitting the right mark - an ASUS testbed (PRIME Z790-A WIFI + Intel Core i9-13900KF) managed to score 210551 points in Geekbench 6.2.2 Vulkan.

Possible NVIDIA GeForce RTX 3050 6 GB Edition Specifications Appear

Alleged full specifications have leaked for NVIDIA's upcoming GeForce RTX 3050 6 GB graphics card, showing extensive reductions beyond merely shrinking memory size versus the 8 GB model. If accurate, performance could lag the existing RTX 3050 8 GB SKU by up to 25%, making it weaker competition even for AMD's budget RX 6500 XT. Previous rumors suggested only capacity and bandwidth differences on a partially disabled memory bus between the 3050 variants, which would reduce the memory to 6 GB on a 96-bit bus, down from 8 GB on a 128-bit bus. But the leaked specs indicate CUDA core counts, clock speeds, and TDP all see cuts for the upcoming 6 GB version. With 18 SMs and 2304 cores rather than 20 SMs and 2560 cores, at lower base and boost frequencies, the impact looks more severe than expected. A 70 W TDP does allow passive cooling but hurts performance versus the 3050 8 GB's 130 W design.

Some napkin math suggests the 3050 6 GB could deliver only 75% of its elder sibling's frame rates, putting it more in line with the entry-level 6500 XT. While having 50% more VRAM than the 6500 XT helps, the dramatic core and clock downgrades counteract that memory advantage. According to rumors, the RTX 3050 6 GB is set to launch in February, bringing lower-end Ampere to even more budget-focused builders. But with specifications seemingly hobbled beyond just capacity, its real-world gaming value remains to be determined. NVIDIA likely intends the RTX 3050 6 GB primarily for less demanding esports titles. Given the scale of the cutbacks and modern AAA titles' recommended specifications, mainstream AAA gaming performance seems improbable.
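The ~75% figure is consistent with simple scaling of cores and clocks. The core counts below are from the leak; the boost clocks are hypothetical placeholder values chosen to illustrate the rumored cuts, not leaked specifications:

```python
# Napkin math behind the ~75% estimate. Core counts are from the leak;
# the clock values are hypothetical placeholders for the rumored cuts.
cores_8gb, cores_6gb = 2560, 2304
boost_8gb, boost_6gb = 1777, 1470  # MHz; illustrative assumption

relative = (cores_6gb / cores_8gb) * (boost_6gb / boost_8gb)
print(f"{relative:.0%}")  # roughly 74% of the 8 GB model's raw throughput
```

Raw shader throughput scales with cores times clock, so a 10% core cut compounded with a clock cut lands in the quoted ballpark before memory bandwidth is even considered.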

No Overclocking and Lower TGP for NVIDIA GeForce RTX 4090 D Edition for China

NVIDIA is preparing to launch the GeForce RTX 4090 D, or "Dragon" edition, designed explicitly for China. Circumventing the US export rules on GPUs that could potentially be used for AI acceleration, the GeForce RTX 4090 D reportedly cuts back on overclocking as a feature. According to BenchLife, the AD102-250 GPU used in the RTX 4090 D will not support overclocking, which is possibly disabled in firmware and/or physically in the die. Information from @Zed__Wang suggests that the Dragon version will run at a 2280 MHz base frequency, higher than the 2235 MHz of the AD102-300 found in the regular RTX 4090, and a 2520 MHz boost, matching the regular version.

Interestingly, the RTX 4090 D for China will also feature a slightly lower Total Graphics Power (TGP) of 425 W, down from the 450 W of the regular model. With the memory configuration appearing to be the same, this new China-specific model will most likely perform within a few percent of the original design. The higher base frequency probably compensates for a reduced CUDA core count, trimmed to comply with the US export regulation policy while still serving the Chinese GPU market. The NVIDIA GeForce RTX 4090 D is scheduled for rollout in China in January 2024, just a few weeks away.

NVIDIA and AMD Deliver Powerful Workstations to Accelerate AI, Rendering and Simulation

To enable professionals worldwide to build and run AI applications right from their desktops, NVIDIA and AMD are powering a new line of workstations equipped with NVIDIA RTX Ada Generation GPUs and AMD Ryzen Threadripper PRO 7000 WX-Series CPUs. Bringing together the highest levels of AI computing, rendering and simulation capabilities, these new platforms enable professionals to efficiently tackle the most resource-intensive, large-scale AI workflows locally.

Bringing AI Innovation to the Desktop
Advanced AI tasks typically require data-center-level performance. Training a large language model with a trillion parameters, for example, takes thousands of GPUs running for weeks, though research is underway to reduce model size and enable model training on smaller systems while still maintaining high levels of AI model accuracy. The new NVIDIA RTX GPU and AMD CPU-powered AI workstations provide the power and performance required for training such smaller models, as well as for local fine-tuning, helping to offload data center and cloud resources from AI development tasks. The devices let users select single- or multi-GPU configurations as required for their workloads.

NVIDIA Lends Support to Washington's Efforts to Ensure AI Safety

In an event at the White House today, NVIDIA announced support for voluntary commitments that the Biden Administration developed to ensure advanced AI systems are safe, secure and trustworthy. The news came the same day NVIDIA's chief scientist, Bill Dally, testified before a U.S. Senate subcommittee seeking input on potential legislation covering generative AI. Separately, NVIDIA founder and CEO Jensen Huang will join other industry leaders in a closed-door meeting on AI Wednesday with the full Senate.

Seven companies including Adobe, IBM, Palantir and Salesforce joined NVIDIA in supporting the eight agreements the Biden-Harris administration released in July with support from Amazon, Anthropic, Google, Inflection, Meta, Microsoft and OpenAI.

NVIDIA GH200 Superchip Aces MLPerf Inference Benchmarks

In its debut on the MLPerf industry benchmarks, the NVIDIA GH200 Grace Hopper Superchip ran all data center inference tests, extending the leading performance of NVIDIA H100 Tensor Core GPUs. The overall results showed the exceptional performance and versatility of the NVIDIA AI platform from the cloud to the network's edge. Separately, NVIDIA announced inference software that will give users leaps in performance, energy efficiency and total cost of ownership.

GH200 Superchips Shine in MLPerf
The GH200 links a Hopper GPU with a Grace CPU in one superchip. The combination provides more memory, bandwidth and the ability to automatically shift power between the CPU and GPU to optimize performance. Separately, NVIDIA HGX H100 systems that pack eight H100 GPUs delivered the highest throughput on every MLPerf Inference test in this round. Grace Hopper Superchips and H100 GPUs led across all MLPerf's data center tests, including inference for computer vision, speech recognition and medical imaging, in addition to the more demanding use cases of recommendation systems and the large language models (LLMs) used in generative AI.

NVIDIA CEO Meets with India Prime Minister Narendra Modi

Underscoring NVIDIA's growing relationship with the global technology superpower, Indian Prime Minister Narendra Modi met with NVIDIA founder and CEO Jensen Huang Monday evening. The meeting at 7 Lok Kalyan Marg—as the Prime Minister's official residence in New Delhi is known—comes as Modi prepares to host a gathering of leaders from the G20 group of the world's largest economies, including U.S. President Joe Biden, later this week.

"Had an excellent meeting with Mr. Jensen Huang, the CEO of NVIDIA," Modi said in a social media post. "We talked at length about the rich potential India offers in the world of AI." The event marks the second meeting between Modi and Huang, highlighting NVIDIA's role in the country's fast-growing technology industry.

Strong Cloud AI Server Demand Propels NVIDIA's FY2Q24 Data Center Business to Surpass 76% of Revenue for the First Time

NVIDIA's latest financial report for FY2Q24 reveals that its data center business reached US$10.32 billion—a QoQ growth of 141% and YoY increase of 171%. The company remains optimistic about its future growth. TrendForce believes that the primary driver behind NVIDIA's robust revenue growth stems from its data center's AI server-related solutions. Key products include AI-accelerated GPUs and AI server HGX reference architecture, which serve as the foundational AI infrastructure for large data centers.
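The reported growth rates imply the baselines they were measured against. A quick back-calculation (assuming the figures are as reported) recovers the approximate prior-quarter and year-ago data center revenue:

```python
# Back out the baseline revenue figures from the reported growth rates.
fy2q24_dc = 10.32  # US$ billion, FY2Q24 data center revenue

prev_quarter = fy2q24_dc / (1 + 1.41)  # 141% QoQ growth
year_ago = fy2q24_dc / (1 + 1.71)      # 171% YoY growth

print(f"{prev_quarter:.2f}")  # ~4.28 (US$ billion)
print(f"{year_ago:.2f}")      # ~3.81 (US$ billion)
```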

TrendForce further anticipates that NVIDIA will integrate its software and hardware resources. Utilizing a refined approach, NVIDIA will align its high-end, mid-tier, and entry-level GPU AI accelerator chips with various ODMs and OEMs, establishing a collaborative system certification model. Beyond accelerating the deployment of CSP cloud AI server infrastructures, NVIDIA is also partnering with entities like VMware on solutions including the Private AI Foundation. This strategy extends NVIDIA's reach into the edge enterprise AI server market, underpinning steady growth in its data center business for the next two years.