News Posts matching #CUDA


NVIDIA's 32-Bit PhysX Waves Goodbye with GeForce RTX 50 Series Ending 32-Bit CUDA Software Support

The days of 32-bit software support in NVIDIA's drivers are coming to an end, and with them goes support for the once-iconic PhysX real-time physics engine. According to NVIDIA's engineers on the GeForce forums, the lack of PhysX support has been quietly acknowledged: NVIDIA's latest GeForce RTX 50 series GPUs are phasing out support for 32-bit CUDA software, slowly moving the gaming world entirely to 64-bit software. While older NVIDIA GPUs from the Maxwell through Ada generations will retain 32-bit CUDA support, this change breaks backward compatibility for physics acceleration in legacy PC games on the new GPUs. Users running these titles on RTX 50 series cards may need to rely on CPU-based PhysX processing, which could result in suboptimal performance compared to previous GPU generations.

A Reddit user reported frame rates dropping below 60 FPS in Borderlands 2 during basic game mechanics on a system with a 9800X3D CPU and an RTX 5090 GPU, all because 32-bit CUDA application support is deprecated on the Blackwell architecture. When another user booted up a 64-bit PhysX title, Batman: Arkham Knight, PhysX worked perfectly, as expected. The problem is that a long list of older games, which gamers still sometimes prefer to play, now runs considerably slower on the most powerful consumer GPU because of the phase-out of 32-bit CUDA app support.

NVIDIA GeForce RTX 5070 Ti Allegedly Scores 16.6% Improvement Over RTX 4070 Ti SUPER in Synthetic Benchmarks

Thanks to some early 3DMark benchmarks obtained by VideoCardz, NVIDIA's upcoming GeForce RTX 5070 Ti GPU paints an interesting picture of generational performance gains. Testing conducted with AMD's Ryzen 7 9800X3D processor and 48 GB of DDR5-6000 memory provides the first glimpse into the card's capabilities. The new GPU demonstrates a 16.6% performance improvement over its predecessor, the RTX 4070 Ti SUPER. However, the benchmark data also shows it falling short of the more expensive RTX 5080 by 13.2%, raising questions about price-to-performance given the $250 gap between the two cards. Priced at a $749 MSRP, the RTX 5070 Ti could be even pricier in retail channels at launch, especially with limited availability. The card's positioning becomes particularly interesting compared to the RTX 5080's $999 price point, a 33% premium for its additional performance.

As a reminder, the RTX 5070 Ti boasts 8,960 CUDA cores, 280 texture units, 70 RT cores for ray tracing, and 280 Tensor cores for AI computations, all supported by 16 GB of GDDR7 memory running at 28 Gbps effective speed across a 256-bit bus, resulting in 896 GB/s of bandwidth. We will have to wait for proper reviews for the final performance verdict, as synthetic benchmarks tell only part of the story. Modern gaming also demands consideration of advanced features such as ray tracing and upscaling technologies, which can significantly impact real-world performance. The true test will come from comprehensive gaming benchmarks across a variety of titles and settings. The gaming community won't have to wait long for detailed analysis, as official reviews are reportedly set to be published in just a few days. Evaluations of non-MSRP versions should follow on February 20, the card's launch date.
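That 896 GB/s figure is easy to sanity-check: peak GDDR bandwidth is simply the bus width times the per-pin data rate, divided by eight to convert bits to bytes. A minimal sketch of the arithmetic:

```python
def gddr_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus pins x per-pin Gbps / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

# RTX 5070 Ti: 256-bit bus with 28 Gbps GDDR7
print(gddr_bandwidth_gb_s(256, 28.0))  # 896.0 GB/s
```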

NVIDIA GeForce RTX 5070 Ti Edges Out RTX 4080 in OpenCL Benchmark

A recently surfaced Geekbench OpenCL listing has revealed the performance improvements that the GeForce RTX 5070 Ti is likely to bring to the table, and the numbers sure look promising - that is, coming off the disappointment of the GeForce RTX 5080, which manages roughly 260,000 points in the benchmark, representing a paltry 8% improvement over its predecessor. The GeForce RTX 5070 Ti, however, managed an impressive 248,000 points, putting it a substantial 20% ahead of the GeForce RTX 4070 Ti. Hilariously enough, the RTX 5080 is merely 4% ahead of it, making the situation even worse for the somewhat contentious GPU. NVIDIA has claimed similar performance improvements in its marketing material, which now seems quite plausible.

Of course, an OpenCL benchmark is hardly representative of real-world gaming performance. That said, there is no denying that raw benchmarks help buyers temper expectations and make decisions. Previous leaks and speculation hinted at a roughly 10% improvement over its predecessor in raster performance and up to 15% in ray tracing, although the OpenCL listing indicates the RTX 5070 Ti might be capable of a larger generational jump, neck-and-neck with NVIDIA's claims. For those in need of a refresher, the RTX 5070 Ti boasts 8,960 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. Like its siblings, the RTX 5070 Ti is also rumored to face "extremely limited" supply at launch. With its official launch less than a week away, we won't have to wait long to find out for ourselves.

NVIDIA RTX 5080 Laptop Defeats Predecessor By 19% in Time Spy Benchmark

The NVIDIA RTX 50 series witnessed quite a contentious launch, to say the least. Hindered by abysmal availability, controversial generational improvements, and wacky marketing tactics from Team Green, it would be safe to say a lot of passionate gamers were left utterly disappointed. That said, while the desktop cards have been the talk of the town as of late, their RTX 50 Laptop counterparts are yet to make headlines. Occasional leaks do appear on the interwebs, the latest of which seems to reveal 3DMark Time Spy performance for the RTX 5080 Laptop GPU. And the results are - well, debatable.

We do know that the RTX 5080 Laptop GPU will feature 7,680 CUDA cores, a shockingly modest increase over its predecessor. Considering that there is no node shrink this time around, the architectural improvements appear to be rather minimal, going by the tests conducted so far. Of course, the biggest boost in performance will likely come from GDDR7 memory on a 256-bit bus, compared to its predecessor's GDDR6 memory on a 192-bit bus. In 3DMark's Time Spy DX12 test, which is somewhat of an outdated benchmark, the RTX 5080 Laptop managed around 21,900 points. The RTX 4080 Laptop, on average, rakes in around 18,200 points, putting the RTX 5080 Laptop ahead by almost 19%. The RTX 4090 Laptop is also left behind, by around 5%.

NVIDIA GB202 "Blackwell" Die Exposed, Shows the Massive 24,576 CUDA Core Configuration

A die shot of NVIDIA's GB202, the silicon powering the RTX 5090, has surfaced online, providing detailed insights into the "Blackwell" architecture's physical layout. The annotated images, shared by hardware analyst Kurnal and provided by ASUS China general manager Tony Yu, compare the GB202 to its AD102 predecessor and outline key architectural components. The die's central region houses 128 MB of L2 cache (96 MB enabled on the RTX 5090), surrounded by memory interfaces. Eight 64-bit memory controllers support the 512-bit GDDR7 interface, with the physical interfaces positioned along the top, left, and right edges of the die. Twelve graphics processing clusters (GPCs) surround the central cache. Each GPC contains eight texture processing clusters (TPCs) of two SMs each, for 16 streaming multiprocessors (SMs) per GPC. The complete die configuration enables 24,576 CUDA cores, arranged as 128 cores per SM across 192 SMs. With the RTX 5090 offering "only" 21,760 CUDA cores, the full GB202 die is presumably reserved for workstation GPUs.
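Those figures are internally consistent, assuming the usual NVIDIA arrangement of two SMs per TPC: 12 GPCs × 8 TPCs × 2 SMs × 128 CUDA cores per SM gives the full 24,576-core die, and dividing the RTX 5090's 21,760 cores by 128 yields its 170 enabled SMs. A quick sketch of the arithmetic:

```python
# Full GB202 die, per the annotated die shot (2 SMs per TPC assumed)
gpcs, tpcs_per_gpc, sms_per_tpc, cores_per_sm = 12, 8, 2, 128

full_sms = gpcs * tpcs_per_gpc * sms_per_tpc   # 192 SMs
full_cores = full_sms * cores_per_sm           # 24,576 CUDA cores

# RTX 5090 ships with a cut-down configuration
rtx_5090_sms = 21_760 // cores_per_sm          # 170 SMs enabled

print(full_sms, full_cores, rtx_5090_sms)      # 192 24576 170
```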

The SM design includes four slices sharing 128 KB of L1 cache and four texture mapping units (TMUs). Individual SM slices contain dedicated register files, L0 instruction caches, warp schedulers, load-store units, and special function units. Central to the die's layout is a vertical strip containing the media processing components—NVENC and NVDEC units—running from top to bottom. The RTX 5090 implementation enables three of four available NVENC encoders and two of four NVDEC decoders. The die includes twelve raster engine/3D FF blocks for geometry processing. At the bottom edge sits the PCIe 5.0 x16 interface and display controller components. Despite its substantial size, the GB202 remains smaller than NVIDIA's previous GH100 and GV100 dies, which exceeded 814 mm². Each SM integrates specialized hardware, including new 5th-generation Tensor cores and 4th-generation RT cores, contributing to the die's total of 192 RT cores, 768 Tensor cores, and 768 texture units.

NVIDIA Likely Sending Maxwell, Pascal & Volta Architectures to CUDA Legacy Branch

Team Green's CUDA 12.8 release notes have revealed upcoming changes for three older GPU architectures, outlined in the document's "Deprecated and Dropped Features" section. A brief sentence sketches a less active future for the affected families: "architecture support for Maxwell, Pascal, and Volta is considered feature-complete and will be frozen in an upcoming release." Further down, NVIDIA states that a small selection of operating systems has been dropped from support lists, including Microsoft Windows 10 21H2 and Debian 11.

Refocusing on matters of hardware—Michael Larabel, Phoronix's editor-in-chief, has kindly provided a bit of history and context: "Four years ago with the NVIDIA 470 series was the legacy branch for GeForce GTX 600 and 700 Kepler series and now as we embark on the NVIDIA 570 driver series, it looks like it could end up being the legacy branch for Maxwell, Pascal, and Volta generations of GPUs." Larabel and other industry watchers reckon that the incoming "Blackwell" generation is taking priority, with Team Green likely freeing up resources and spending less effort on decade-plus-old hardware. VideoCardz believes that gaming GPU support will continue—at least for Maxwell (e.g., GeForce GTX 900 series) and Pascal (GeForce GTX 10 series)—based on testing of the toolkit's latest set of integrated drivers (version 571.96).
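For readers wondering whether their own cards fall into the soon-to-be-frozen bracket, the affected architectures map to CUDA compute capabilities 5.x (Maxwell), 6.x (Pascal), and 7.0 (Volta). Below is a rough, illustrative sketch - not an NVIDIA tool - that reads those values via nvidia-smi, assuming a driver recent enough to expose the compute_cap query field:

```python
# Illustrative check: flag GPUs whose architectures (Maxwell/Pascal/Volta,
# compute capability 5.0 through 7.0) are headed for the CUDA legacy branch.
# Assumes nvidia-smi is on PATH and supports the "compute_cap" query field.
import subprocess

output = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout

for line in output.strip().splitlines():
    name, cap = (field.strip() for field in line.split(","))
    major, minor = (int(part) for part in cap.split("."))
    legacy_bound = (5, 0) <= (major, minor) <= (7, 0)
    status = "feature-frozen branch ahead" if legacy_bound else "active support"
    print(f"{name}: compute capability {cap} -> {status}")
```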

First NVIDIA GeForce RTX 5090 GPU with 32 GB GDDR7 Memory Leaks Ahead of CES Keynote

NVIDIA's unannounced GeForce RTX 5090 graphics card has leaked, confirming key specifications of the next-generation GPU. Thanks to exclusive information from VideoCardz, we can see the packaging of Inno3D's RTX 5090 iChill X3 model, which confirms that the graphics card will feature 32 GB of GDDR7 memory. The leaked materials show that Inno3D's variant will use a 3.5-slot cooling system, suggesting significant cooling requirements for the flagship card. According to earlier leaks, the RTX 5090 will be based on the GB202 GPU and include 21,760 CUDA cores. The card's memory system is a significant upgrade: 32 GB of GDDR7 running at 28 Gbps on a 512-bit memory bus, capable of delivering nearly 1.8 TB/s of bandwidth. That is twice the memory capacity of the upcoming RTX 5080, which is expected to ship with 16 GB, albeit using faster 30 Gbps GDDR7 modules.

Power consumption has increased significantly, with the RTX 5090's TDP rated at 575 W and its TGP at 600 W, marking a 125 W increase in raw TDP over the previous RTX 4090. NVIDIA is scheduled to hold its CES keynote today at 6:30 PM PT, where the company is expected to officially announce several new graphics cards. The lineup should include the RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, and an RTX 5090D model specifically for the Chinese market. Early indications are that the RTX 5080 will be the first card to reach consumers, with a planned release date of January 21st. Release dates for other models, including the flagship RTX 5090, have not yet been confirmed. The RTX 5090 is currently the only card in the RTX 50 series planned to use the GB202 GPU. Pricing information and additional specifications are expected to be revealed during the upcoming announcement.

Nintendo Switch 2 PCB Leak Reveals an NVIDIA Tegra T239 Chip Optically Shrunk to 5nm

The Nintendo Switch 2 promises to be this year's big (well, small) gaming platform launch. It goes up against a growing ecosystem of handhelds based on x86-64 mobile processors running Windows, so its main play will have to be offering a similar or better gameplay experience with better battery life, given that all of its hardware is purpose-built for a handheld console and runs a highly optimized software stack; the SoC forms a big part of this. Nintendo turned to NVIDIA for the job, given its graphics IP leadership and its ability to integrate that IP with Arm CPU cores in a semi-custom chip. Someone with access to a Switch 2 prototype, likely an ISV, took the device apart, revealing the chip: a die-shrunk version of the Tegra T239 from 2023.

It's important to note that prototype consoles physically look nothing like the final product; they are designed so that ISVs and game developers can validate them and, together with PC-based "official" emulation, develop or port games to the new platform. The Switch 2 looks very similar to the original Switch: it is a large tablet-like device with detachable controllers. The largest chip on the mainboard is the NVIDIA Tegra T239. Nintendo Prime shared more details about the chip.

NVIDIA Plans GeForce RTX 5080 "Blackwell" Availability on January 21, Right After CES Announcement

A report from Hong Kong tech media outlet HKEPC indicates that NVIDIA's GeForce RTX 5080 graphics card will launch on January 21, 2025. The release follows a planned announcement event on January 6, where CEO Jensen Huang will present the new "Blackwell" architecture. Anticipated specifications based on prior rumors point to the RTX 5080 using the GB203-400-A1 chip, containing 10,752 CUDA cores across 84 SMs. The card maintains 16 GB of memory but upgrades to GDDR7 technology running at 30 Gbps, while other cards in the series are expected to use 28 Gbps memory. The graphics card is manufactured on TSMC's 4NP 4 nm node. This improvement in manufacturing technology, combined with architectural changes, accounts for most of the expected performance gains, as the raw CUDA core count only increases by about 10% over the RTX 4080. NVIDIA is also introducing greater segmentation between its Blackwell SKUs, as the RTX 5090 has nearly double the CUDA cores and double the GDDR7 memory capacity.

NVIDIA is organizing a GeForce LAN event two days before the announcement, marking the return of this gathering after 13 years, so the timing is interesting. NVIDIA wants to capture gamers' hearts with 50 hours of non-stop gameplay. Meanwhile, AMD currently has no competing products announced in the high-end graphics segment, leaving NVIDIA without direct competition in this performance tier. This market situation could affect the final pricing of the RTX 5080, which will be revealed during the January keynote. While the January 21 date appears set for the RTX 5080, launch dates for other cards in the Blackwell family, including the RTX 5090 and RTX 5070 series, remain unconfirmed. NVIDIA typically releases different models in a GPU family on separate dates to manage production and distribution effectively.

NVIDIA GeForce RTX 5070 and RTX 5070 Ti Final Specifications Seemingly Confirmed

Thanks to kopite7kimi, we are able to finalize the leaked specifications of NVIDIA's upcoming GeForce RTX 5070 and RTX 5070 Ti graphics cards.
Starting with the RTX 5070 Ti, it will feature 8,960 CUDA cores and come equipped with 16 GB of GDDR7 memory on a 256-bit memory bus, offering 896 GB/s of bandwidth. The card is reportedly designed with a total board power (TBP) of 300 W. The Ti variant appears to use the PG147-SKU60 board design with a GB203-300-A1 GPU. The standard RTX 5070 is positioned as a more power-efficient option, with specifications pointing to 6,144 CUDA cores and 12 GB of GDDR7 memory on a 192-bit bus, delivering 672 GB/s of memory bandwidth. This model is expected to operate at a slightly lower 250 W TBP.

Interestingly, the non-Ti RTX 5070 will be available in two board variants, PG146 and PG147, both utilizing the GB205-300-A1 GPU. While we don't yet know what the pricing structure looks like, NVIDIA has clearly chosen to differentiate these SKUs more substantially. The Ti variant not only gets an extra 4 GB of GDDR7 memory, it also gets a whopping 45% increase in CUDA core count, going from 6,144 to 8,960 cores. While we wait for CES and the initial wave of GeForce RTX 50 series cards, the GeForce RTX 5070 and RTX 5070 Ti are expected to arrive later, possibly after the RTX 5080 and RTX 5090.

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

The battle for AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a share of the revenue that hyperscalers and OEMs are willing to spend, with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is nowhere near bridging the software gap with its competitor, NVIDIA. According to the latest report from SemiAnalysis, a research and consultancy firm, the firm ran a five-month experiment using the Instinct MI300X for training and benchmark runs. The findings were surprising: even with better on-paper hardware, AMD's software stack, including ROCm, massively degrades the MI300X's real-world performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

Acer Leaks GeForce RTX 5090 and RTX 5080 GPU, Memory Sizes Confirmed

Acer has jumped the gun and listed its Predator Orion 7000 systems with the upcoming NVIDIA RTX 50 series graphics cards, namely the GeForce RTX 5080 and the GeForce RTX 5090. In addition, the listing confirms that the GeForce RTX 5080 will come with 16 GB of GDDR7 memory, while the GeForce RTX 5090 will get 32 GB of GDDR7 memory.

The Acer Predator Orion 7000 gaming PC was announced back in September, together with Intel's Core Ultra 200 series, and it does not come as a surprise that this high-end pre-built system will now be getting NVIDIA's new GeForce RTX 50 series graphics cards. In case you missed previous rumors, the GeForce RTX 5080 is expected to use the GB203-400 GPU with 10,752 CUDA cores and come with 16 GB of GDDR7 memory on a 256-bit memory interface. The GeForce RTX 5090, on the other hand, gets the GB202-300 GPU with 21,760 CUDA cores and packs 32 GB of GDDR7 memory.

NVIDIA GeForce RTX 5070 Ti Leak Tips More VRAM, Cores, and Power Draw

It's an open secret by now that NVIDIA's GeForce RTX 5000 series GPUs are on the way, with an early 2025 launch on the cards. Now, preliminary details about the RTX 5070 Ti have leaked, revealing an increase in both VRAM and power draw and suggesting that the new upper-mid-range GPU will finally address the increased VRAM demands of modern games. According to the leak from Wccftech, the RTX 5070 Ti will have 16 GB of GDDR7 VRAM, up from 12 GB on the RTX 4070 Ti, as we previously speculated. Also corroborating previous leaks, the new source confirms that the 5070 Ti will use the cut-down GB203 chip, although it points to a significantly higher TBP of 350 W. The new memory configuration will supposedly run on a 256-bit memory bus at 28 Gbps for a total memory bandwidth of 896 GB/s, a significant boost over the RTX 4070 Ti.

Supposedly, the RTX 5070 Ti will also see a bump in total CUDA cores, from 7,680 in the RTX 4070 Ti to 8,960. The new card will also switch to the revised 12V-2x6 power connector, replacing the 12VHPWR connector used by the 4070 Ti. NVIDIA is expected to announce the RTX 5000 series graphics cards at CES 2025 in early January, but the RTX 5070 Ti will supposedly be the third card in the 5000-series launch cycle. That said, leaks suggest that the 5070 Ti will still launch in Q1 2025, meaning we may see an indication of specs at CES 2025, although pricing is still unclear.

Update Dec 16th: Kopite7kimi, the ubiquitous hardware leaker, has since responded to the RTX 5070 Ti leaks, stating that 350 W may be on the higher end for the card: "...the latest data shows 285W. However, 350W is also one of the configs." This suggests a TBP of 350 W is possible, though perhaps only on certain graphics card models, if competition is strong, or in certain boost scenarios.

Nintendo Switch Successor: Backward Compatibility Confirmed for 2025 Launch

Nintendo has officially announced that its next-generation Switch console will feature backward compatibility, allowing players to use their existing game libraries on the new system. However, those eagerly awaiting the console's release may need to exercise patience as launch expectations have shifted to early 2025. On the official X account, Nintendo has announced: "At today's Corporate Management Policy Briefing, we announced that Nintendo Switch software will also be playable on the successor to Nintendo Switch. Nintendo Switch Online will be available on the successor to Nintendo Switch as well. Further information about the successor to Nintendo Switch, including its compatibility with Nintendo Switch, will be announced at a later date."

While the original Switch evolved from a 20 nm Tegra X1 to a more power-efficient 16 nm Tegra X1+ SoC (both featuring four Cortex-A57 and four Cortex-A53 cores with GM20B Maxwell GPUs), the Switch 2 is rumored to utilize a customized variant of NVIDIA's Jetson Orin SoC, now codenamed T239. The new chip represents a significant upgrade with its 12 Cortex-A78AE cores, LPDDR5 memory, and Ampere GPU architecture with 1,536 CUDA cores, promising enhanced battery efficiency and DLSS capabilities for the handheld gaming market. With the holiday 2024 release window now seemingly off the table, the new console is anticipated to debut in the first half of 2025, marking nearly eight years since the original Switch's launch.

NVIDIA Releases GeForce 565.90 WHQL Game Ready Driver

NVIDIA has released its latest GeForce graphics drivers, the GeForce 565.90 WHQL Game Ready drivers. As a new Game Ready driver, it provides optimizations and support, including NVIDIA DLSS 3, for newly released games such as THRONE AND LIBERTY, MechWarrior 5: Clans, and Starship Troopers: Extermination. The new drivers also add support for CUDA 12.7 and enable RTX HDR multi-monitor support within the latest NVIDIA App beta update.

NVIDIA has also fixed several issues, including texture flickering in Final Fantasy XV and a frozen white screen and crash in Dying Light 2 Stay Human. As for general bugs, the new drivers fix corruption with Steam Link streaming when MSAA is globally enabled, as well as a slight monitor backlight flicker when FPS drops below 60.

DOWNLOAD: NVIDIA GeForce 565.90 WHQL Game Ready

Advantech Launches AIR-310, Ultra-Low-Profile Scalable AI Inference

Advantech, a leading provider of edge computing solutions, introduces the AIR-310, a compact edge AI inference system featuring an MXM GPU card. Powered by 12th/13th/14th Gen Intel Core 65 W desktop processors, the AIR-310 delivers up to 12.99 TFLOPS of scalable AI performance via the NVIDIA Quadro 2000A GPU card in a 1.5U chassis (215 x 225 x 55 mm). Despite its compact size, it offers versatile connectivity with three LAN ports and four USB 3.0 ports, enabling seamless integration of sensors and cameras for vision AI applications.

The system includes smart fan management, operates in temperatures from 0 to 50°C (32 to 122°F), and is shock-resistant, capable of withstanding 3G vibration and 30G shock. Bundled with Intel Arc A370 and NVIDIA A2000 GPUs, it is certified to IEC 61000-6-2, IEC 61000-6-4, and CB/UL standards, ensuring stable 24/7 operation in harsh environments, including space-constrained or mobile equipment. The AIR-310 supports Windows 11, Linux Ubuntu 24.04, and the Edge AI SDK, enabling accelerated inference deployment for applications such as factory inspections, real-time video surveillance, GenAI/LLM, and medical imaging.

NVIDIA GeForce RTX 5090 and RTX 5080 Specifications Surface, Showing Larger SKU Segmentation

Thanks to the renowned NVIDIA hardware leaker kopite7kimi on X, we are getting information about the final versions of NVIDIA's first upcoming wave of GeForce RTX 50 series "Blackwell" graphics cards. The two leaked GPUs are the GeForce RTX 5090 and RTX 5080, which now feature a more significant gap between the xx80 and xx90 SKUs. For starters, we have the highest-end GeForce RTX 5090. NVIDIA has decided to use the GB202-300-A1 die and enable 21,760 FP32 CUDA cores on this top-end model. Accompanying the massive 170 SM GPU configuration, the RTX 5090 has 32 GB of GDDR7 memory on a 512-bit bus, with each GDDR7 die running at 28 Gbps. This translates to 1,792 GB/s of memory bandwidth. All of this is confined to a 600 W TGP.

When it comes to the GeForce RTX 5080, NVIDIA has decided to further separate its xx80 and xx90 SKUs. The RTX 5080 has 10,752 FP32 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. With GDDR7 running at 28 Gbps, the memory bandwidth is also halved, at 896 GB/s. This SKU uses a GB203-400-A1 die, which is designed to run within a 400 W TGP power envelope. For reference, the RTX 4090 has 68% more CUDA cores than the RTX 4080, while the rumored RTX 5090 has around 102% more CUDA cores than the rumored RTX 5080, meaning NVIDIA is separating its top SKUs even further. We are curious to see at what price points NVIDIA places its upcoming GPUs, so that we can compare the generational updates and the widened gap between the xx80 and xx90 models.

Nintendo Switch 2 Allegedly Not Powered by AMD APU Due to Poor Battery Life

Nintendo's next-generation Switch 2 handheld gaming console is nearing its release. As leaks about its specifications intensify, we are also getting information about its planning stages. According to a Moore's Law is Dead YouTube video, Nintendo didn't choose an AMD APU to power the Switch 2 due to poor battery life. In a bid to secure the best chip at a mere five watts of power, the Japanese company had two choices: an NVIDIA Tegra or an AMD APU. In preliminary testing and evaluation, the AMD APU reportedly wasn't power-efficient enough at a 5 W TDP, while the NVIDIA Tegra chip maintained sufficient battery life and performance at the target specifications.

Allegedly, the AMD APU was well suited to a 15 W design, but Nintendo didn't want to fit a bigger battery, so that the device remains lighter and cheaper. The final design will likely carry a battery with a 20 Wh capacity as the main power source for the NVIDIA Tegra T239 SoC. As a reminder, the Tegra T239 features an eight-core Arm Cortex-A78C cluster combined with a modified NVIDIA Ampere GPU supporting DLSS, and borrows some of the latest encoding/decoding blocks from Ada Lovelace, such as AV1. There are likely 1,536 CUDA cores paired with a 128-bit LPDDR5 memory interface delivering 102 GB/s of bandwidth. For final specifications, we have to wait for the official launch, but with rumors starting to intensify, we can expect to see it relatively soon.

Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers, with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, senior vice president and Chief Software Officer. Previously, we reported on AMD's transition toward becoming a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan, spread over a three-to-five-year timeframe, is to improve its software ecosystem while accelerating hardware development, so it can launch new products more frequently and react to changes in software demand. AMD found that opening new design centers in Serbia would be very advantageous to these expansion efforts.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is actually an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering from Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.
Here is the full interview:

AMD to Unify Gaming "RDNA" and Data Center "CDNA" into "UDNA": Singular GPU Architecture Similar to NVIDIA's CUDA

According to new information from Tom's Hardware, AMD has announced plans to unify its consumer-focused gaming RDNA and data center CDNA graphics architectures into a single, unified design called "UDNA." The announcement was made by AMD's Jack Huynh, Senior Vice President and General Manager of the Computing and Graphics Business Group, at IFA 2024 in Berlin. The goal of the new UDNA architecture is to provide a single target for developers, so that an optimized application can run on a consumer-grade GPU like the Radeon RX 7900 XTX as well as a high-end data center GPU like the Instinct MI300. This would create a unification similar to NVIDIA's CUDA, which lets CUDA-focused developers run applications on everything from laptops to data centers.
Jack Huynh: "So, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It's forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving."

NVIDIA Shifts Gears: Open-Source Linux GPU Drivers Take Center Stage

Just a few months after hiring Ben Skeggs, a lead maintainer of the open-source NVIDIA (Nouveau) GPU driver for the Linux kernel, NVIDIA has announced a complete transition to open-source GPU kernel modules in its upcoming R560 driver release for Linux. This decision comes two years after the company's initial foray into open-source territory with the R515 driver in May 2022. At that time, the open modules targeted data center compute GPUs, while GeForce and workstation GPU support remained in alpha. Now, after extensive development and optimization, NVIDIA reports that its open-source modules have achieved performance parity with, and in some cases surpassed, their closed-source counterparts. The transition brings a host of new capabilities, including heterogeneous memory management support, confidential computing features, and compatibility with the coherent memory architectures of NVIDIA's Grace platform.

The move to open source is expected to foster greater collaboration within the Linux ecosystem and potentially lead to faster bug fixes and feature improvements. However, not all GPUs will be compatible with the new open-source modules. While cutting-edge platforms like NVIDIA Grace Hopper and Blackwell will require the open-source drivers, older GPUs from the Maxwell, Pascal, or Volta architectures must stick with proprietary drivers. NVIDIA has developed a detection helper script to guide driver selection for users who are unsure about compatibility. The shift also brings changes to NVIDIA's installation processes: the default driver version for most installation methods will now be the open-source variant. This affects package managers with the CUDA meta package, run-file installations, and even the Windows Subsystem for Linux.

AMD is Becoming a Software Company. Here's the Plan

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable to share their vision for the future of the company and to get our feedback. On site were prominent AMD leaders, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making big changes to how it approaches technology, shifting its focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware, while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD pivots toward software. The company believes it now has the full stack of computing hardware—all the way from CPUs to AI accelerators, GPUs, FPGAs, data processing, and even server architecture. The only frontier left for AMD is software.

New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations. Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface that lets web developers deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling developers to train and run inference on complex AI models natively on Windows. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs. These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.
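As a concrete illustration of what the ORT-plus-DirectML path looks like in practice, the sketch below creates an inference session that prefers the DirectML execution provider and falls back to CPU; the model file name and input shape are placeholders, and the onnxruntime-directml package is assumed to be installed on a Windows RTX system:

```python
# Minimal sketch: run a local ONNX model through ONNX Runtime's DirectML
# execution provider (requires the onnxruntime-directml package on Windows).
# "model.onnx" and the 1x3x224x224 input are placeholders for illustration.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],  # DirectML first, CPU fallback
)

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy_input})
print("Active providers:", session.get_providers())
print("Output shapes:", [o.shape for o in outputs])
```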

NVIDIA Blackwell Platform Pushes the Boundaries of Scientific Computing

Quantum computing. Drug discovery. Fusion energy. Scientific computing and physics-based simulations are poised to make giant steps across domains that benefit humanity as advances in accelerated computing and AI drive the world's next big breakthroughs. NVIDIA unveiled at GTC in March the NVIDIA Blackwell platform, which promises generative AI on trillion-parameter large language models (LLMs) at up to 25x less cost and energy consumption than the NVIDIA Hopper architecture.

Blackwell has powerful implications for AI workloads, and its technology capabilities can also help deliver discoveries across all types of scientific computing applications, including traditional numerical simulation. By reducing energy costs, accelerated computing and AI drive sustainable computing, and many scientific computing applications already benefit: weather can be simulated at 200x lower cost and with 300x less energy, while digital-twin simulations cost 65x less and consume 58x less energy than traditional CPU-based systems.

NVIDIA Accelerates Quantum Computing Centers Worldwide With CUDA-Q Platform

NVIDIA today announced that it will accelerate quantum computing efforts at national supercomputing centers around the world with the open-source NVIDIA CUDA-Q platform. Supercomputing sites in Germany, Japan and Poland will use the platform to power the quantum processing units (QPUs) inside their NVIDIA-accelerated high-performance computing systems.

QPUs are the brains of quantum computers that use the behavior of particles like electrons or photons to calculate differently than traditional processors, with the potential to make certain types of calculations faster. Germany's Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich is installing a QPU built by IQM Quantum Computers as a complement to its JUPITER supercomputer, supercharged by the NVIDIA GH200 Grace Hopper Superchip. The ABCI-Q supercomputer, located at the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, is designed to advance the nation's quantum computing initiative. Powered by the NVIDIA Hopper architecture, the system will add a QPU from QuEra. Poland's Poznan Supercomputing and Networking Center (PSNC) has recently installed two photonic QPUs, built by ORCA Computing, connected to a new supercomputer partition accelerated by NVIDIA Hopper.
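For a flavor of how the platform is programmed, here is a minimal, hedged sketch using CUDA-Q's Python interface (the cudaq package) to sample a two-qubit Bell state; by default it runs on a built-in simulator, and the same kernel can be retargeted to a QPU backend where one is configured:

```python
# Minimal CUDA-Q sketch: prepare and sample a Bell state. Runs on CUDA-Q's
# default simulator; the kernel itself is backend-agnostic.
import cudaq

@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)     # allocate two qubits
    h(qubits[0])                  # put the first qubit into superposition
    x.ctrl(qubits[0], qubits[1])  # entangle via a controlled-X
    mz(qubits)                    # measure both qubits

counts = cudaq.sample(bell, shots_count=1000)
print(counts)  # expect roughly even counts of "00" and "11"
```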