News Posts matching #CUDA

NVIDIA NIM Microservices Now Available to Streamline Agentic Workflows on RTX AI PCs and Workstations

Press Release by

Tuesday, 09:59 Discuss (0 Comments)

Generative AI is unlocking new capabilities for PCs and workstations, including game assistants, enhanced content-creation and productivity tools and more. NVIDIA NIM microservices, available now, and AI Blueprints, in the coming weeks, accelerate AI development and improve its accessibility. Announced at the CES trade show in January, NVIDIA NIM provides prepackaged, state-of-the-art AI models optimized for the NVIDIA RTX platform, including the NVIDIA GeForce RTX 50 Series and, now, the new NVIDIA Blackwell RTX PRO GPUs. The microservices are easy to download and run. They span the top modalities for PC development and are compatible with top ecosystem applications and tools.

The experimental System Assistant feature of Project G-Assist was also released today. Project G-Assist showcases how AI assistants can enhance apps and games. The System Assistant allows users to run real-time diagnostics, get recommendations on performance optimizations, or control system software and peripherals - all via simple voice or text commands. Developers and enthusiasts can extend its capabilities with a simple plug-in architecture and new plug-in builder.

NVIDIA to Build Accelerated Quantum Computing Research Center

Press Release by

AleksandarK

Mar 19th, 2025 03:16 Discuss (0 Comments)

NVIDIA today announced it is building a Boston-based research center to provide cutting-edge technologies to advance quantum computing. The NVIDIA Accelerated Quantum Research Center, or NVAQC, will integrate leading quantum hardware with AI supercomputers, enabling what is known as accelerated quantum supercomputing. The NVAQC will help solve quantum computing's most challenging problems, ranging from qubit noise to transforming experimental quantum processors into practical devices.

Leading quantum computing innovators, including Quantinuum, Quantum Machines and QuEra Computing, will tap into the NVAQC to drive advancements through collaborations with researchers from leading universities, such as the Harvard Quantum Initiative in Science and Engineering (HQI) and the Engineering Quantum Systems (EQuS) group at the Massachusetts Institute of Technology (MIT).

NVIDIA Accelerates Science and Engineering With CUDA-X Libraries Powered by GH200 and GB200 Superchips

Press Release by

GFreeman

Mar 18th, 2025 13:51 Discuss (1 Comment)

Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips. Announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources - enabled by CUDA-X working with these latest superchip architectures - resulting in up to 11x speedups for computational engineering tools and 5x larger calculations compared with using traditional accelerated computing architectures.

This greatly accelerates and improves workflows in engineering simulation, design optimization and more, helping scientists and researchers reach groundbreaking results faster. NVIDIA released CUDA in 2006, opening up a world of applications to the power of accelerated computing. Since then, NVIDIA has built more than 900 domain-specific NVIDIA CUDA-X libraries and AI models, making it easier to adopt accelerated computing and driving incredible scientific breakthroughs. Now, CUDA-X brings accelerated computing to a broad new set of engineering disciplines, including astronomy, particle physics, quantum physics, automotive, aerospace and semiconductor design.

BOXX Workstations Upgraded With New NVIDIA RTX PRO 6000 Blackwell Workstation Edition GPUs 

Press Release by

GFreeman

Mar 18th, 2025 13:48 Discuss (0 Comments)

BOXX Technologies, the leading innovator of high-performance computers, rendering systems, and servers, announced that as a supplier of NVIDIA-Certified Systems, BOXX workstations will feature the new NVIDIA RTX PRO 6000 Blackwell Workstation Edition and NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition GPUs. Designed for creative professionals, these NVIDIA Blackwell architecture GPUs combine breakthrough AI inference, ray tracing, and neural rendering technology with major performance and memory improvements to drive demanding creative, design, and engineering workflows. BOXX will be among the first computer hardware manufacturers offering the new GPUs inside multiple workstation form factors.

"From our deskside APEXX workstations to our FLEXX and RAXX data center platforms, BOXX is taking our record-setting performance to new heights with NVIDIA RTX PRO 6000 Blackwell Workstation Edition GPUs," said BOXX CEO Kirk Schell. "Our systems equipped with these groundbreaking GPUs are purpose-built for creative professionals who demand the best, so whether it's architects, engineers, and content creators, or data scientists and large-scale enterprise deployments, BOXX accelerates mission-critical work while maintaining unparalleled performance, reliability, and support."

NVIDIA Details DLSS 4 Design: A Complete AI-Driven Rendering Technology

AleksandarK

Mar 14th, 2025 13:23 Discuss (58 Comments)

NVIDIA has published a research paper on DLSS version 4, its AI rendering technology for real-time graphics performance. The system integrates advancements in frame generation, ray reconstruction, and latency reduction. The flagship Multi-Frame Generation feature generates three additional frames for every native frame. DLSS 4 then delivers the best-looking frames to the user quickly enough that the result looks like natively rendered output. At the core of DLSS 4 is a shift from convolutional neural networks to transformer models. These new AI architectures excel at capturing spatial-temporal dependencies, improving ray-traced effect quality by 30-50% according to NVIDIA's benchmarks. The technology processes each AI-generated frame in just 1 ms on RTX 5090 GPUs, significantly faster than the 3.25 ms required by DLSS 3. For competitive gaming, the new Reflex Frame Warp feature reduces input latency by up to 75%, achieving 14 ms in THE FINALS and under 3 ms in VALORANT, according to NVIDIA's own benchmarks.
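As a rough illustration of the frame-generation arithmetic above, the following sketch estimates the displayed frame rate when extra frames are generated per rendered frame. The function name, the serial cost model, and the 60 FPS base rate are assumptions made for illustration and are not NVIDIA's published method; only the three-frames-per-native ratio and the 1 ms and 3.25 ms per-frame costs come from the article.

```python
def displayed_fps(native_fps: float, generated_per_native: int, gen_cost_ms: float) -> float:
    """Toy model: each rendered frame is followed by N generated frames, and the
    generation time is simply added to the render time (a simplifying assumption)."""
    render_ms = 1000.0 / native_fps
    cycle_ms = render_ms + generated_per_native * gen_cost_ms
    frames_per_cycle = 1 + generated_per_native
    return frames_per_cycle * 1000.0 / cycle_ms


base_fps = 60.0  # hypothetical native frame rate, not from the article

# DLSS 4 style: three generated frames at about 1 ms each
print(round(displayed_fps(base_fps, 3, 1.0), 1))
# DLSS 3 style: one generated frame at about 3.25 ms
print(round(displayed_fps(base_fps, 1, 3.25), 1))
```

Real pipelines overlap rendering and generation, so actual results will differ; the sketch only shows why a lower per-frame generation cost matters more as the number of generated frames grows.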

DLSS 4's implementation leverages Blackwell-specific architecture capabilities, including FP8 tensor cores and fused CUDA kernels. The optimized pipeline incorporates vertical layer fusion and memory optimizations that keep computational overhead manageable despite using transformer models, which are twice as large as previous CNN implementations. This efficiency enables real-time performance even with the substantially more complex AI processing. The unified AI pipeline reduces manual tuning requirements for ray-traced effects, allowing studios to implement advanced path tracing across diverse hardware configurations. The design also addresses gaming challenges like interpolating fast-moving UI elements and particle effects and reducing artifacts in high-motion scenes. NVIDIA's hardware flip metering and Blackwell's display engine integration ensure precise frame pacing of newly generated frames for smooth, high-refresh-rate gaming with accurate imagery.

NVIDIA Reportedly Prepares GeForce RTX 5060 and RTX 5060 Ti Unveil Tomorrow

AleksandarK

Mar 12th, 2025 06:32 Discuss (115 Comments)

NVIDIA is set to unveil its RTX 5060 series graphics cards tomorrow, according to VideoCardz information, which claims NVIDIA shared launch info with some media outlets today. The announcement will include two desktop models: the RTX 5060 and RTX 5060 Ti, confirming leaks from industry sources last week. The upcoming lineup will feature three variants: RTX 5060 Ti 16 GB, RTX 5060 Ti 8 GB, and RTX 5060. All three cards will utilize identical board designs and the same GPU, allowing manufacturers to produce visually similar Ti and non-Ti models. Power requirements are expected to range from 150 to 180 W. NVIDIA's RTX 5060 Ti will ship with 4608 CUDA cores, representing a modest 6% increase over the previous-generation RTX 4060 Ti. The most significant improvement comes from the implementation of GDDR7 memory technology, which could deliver over 50% higher bandwidth than its predecessor if NVIDIA maintains the expected 28 Gbps memory speed across all variants.

The standard RTX 5060 will feature 3840 CUDA cores paired with 8 GB of GDDR7 memory. This configuration delivers 25% more GPU cores than its predecessor and marks an upgrade in GPU tier from AD107 (XX7) to GB206 (XX6). The smaller GB207 GPU is reportedly reserved for the upcoming RTX 5050. VideoCardz's sources indicate the RTX 5060 series will hit the market in April. Tomorrow's announcement is strategically timed as an update for the Game Developers Conference (GDC), which begins next week. All models in the series will maintain the 128-bit memory bus of their predecessors while delivering significantly improved memory bandwidth—448 GB/s compared to the previous generation's 288 GB/s for the Ti model and 272 GB/s for the standard variant. The improved bandwidth stems from the introduction of GDDR7 memory.
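The bandwidth figures quoted above follow from bus width and per-pin data rate. A minimal sketch of that arithmetic, using only the 128-bit bus, the expected 28 Gbps speed, and the previous-generation figures from the article (the function names are illustrative):

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def uplift_percent(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100


rtx_5060_series = memory_bandwidth_gbs(128, 28)  # 128-bit bus, expected 28 Gbps GDDR7
print(rtx_5060_series)                                  # 448.0
print(round(uplift_percent(rtx_5060_series, 288), 1))   # vs. previous Ti model: 55.6
print(round(uplift_percent(rtx_5060_series, 272), 1))   # vs. previous standard model: 64.7
```

The result is consistent with the "over 50% higher bandwidth" claim, provided the 28 Gbps speed holds across all variants.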

NVIDIA GeForce RTX 50 Series Faces Compute Performance Issues Due to Dropped 32-bit Support

AleksandarK

Mar 3rd, 2025 11:52 Discuss (85 Comments)

PassMark Software has identified the root cause behind unexpectedly low compute performance in NVIDIA's new GeForce RTX 5090, RTX 5080, and RTX 5070 Ti GPUs. The culprit: NVIDIA has silently discontinued support for 32-bit OpenCL and CUDA in its "Blackwell" architecture, causing compatibility issues with existing benchmarking tools and applications. The issue manifested when PassMark's DirectCompute benchmark returned the error code "CL_OUT_OF_RESOURCES (-5)" on RTX 5000 series cards. After investigation, developers confirmed that while the benchmark's primary application has been 64-bit for years, several compute sub-benchmarks still utilize 32-bit code that previously functioned correctly on RTX 4000 and earlier GPUs. This architectural change wasn't clearly documented by NVIDIA, whose developer website continues to display 32-bit code samples and documentation despite the removal of actual support.

The impact extends beyond benchmarking software. Applications built on legacy CUDA infrastructure, including technologies like PhysX, will experience significant performance degradation as computational tasks fall back to CPU processing rather than utilizing the GPU's parallel architecture. While the RTX 40 series and prior hardware can still run these older applications on the GPU, the RTX 5000 series handles these tasks exclusively through the CPU, resulting in substantially lower performance. PassMark is currently working to port the affected OpenCL code to 64-bit, allowing proper testing of the new GPUs' compute capabilities. However, they warn that many existing applications containing 32-bit OpenCL components may never function properly on RTX 5000 series cards without source code modifications. The benchmark developer also notes this change doesn't fully explain poor DirectX 9 performance, suggesting additional architectural changes may affect legacy rendering pathways. PassMark updated its software today, but legacy benchmarks could still suffer. Below is an older benchmark run without the latest PassMark V11.1 build 1004 patches, showing just how much the newest generation suffers without proper software support.

NVIDIA's 32-Bit PhysX Waves Goodbye with GeForce RTX 50 Series Ending 32-Bit CUDA Software Support

AleksandarK

Feb 19th, 2025 03:06 Discuss (74 Comments)

The days of 32-bit software support in NVIDIA's drivers are coming to an end, and with them goes support for the once-iconic PhysX real-time physics engine. NVIDIA engineers on the GeForce forums have quietly acknowledged the lack of PhysX support: the latest GeForce RTX 50 series GPUs phase out support for 32-bit CUDA software, slowly moving the gaming world entirely to 64-bit software. While older NVIDIA GPUs from the Maxwell through Ada generations will maintain 32-bit CUDA support, this update breaks backward compatibility for physics acceleration in legacy PC games on new GPUs. Users running these titles on RTX 50 series cards may need to rely on CPU-based PhysX processing, which could result in suboptimal performance compared to previous GPU generations.

A Reddit user reported frame rates dropping below 60 FPS in Borderlands 2 while using basic game mechanics with a 9800X3D CPU and RTX 5090 GPU, all because 32-bit CUDA application support on the Blackwell architecture is deprecated. When another user booted up a 64-bit PhysX application, Batman: Arkham Knight, PhysX worked perfectly, as expected. It is just that a massive list of older games, which gamers would sometimes prefer to play, is now running a lot slower on the most powerful consumer GPU due to the phase-out of 32-bit CUDA app support.
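Whether a given Windows game is affected depends on whether its executable is 32-bit. A minimal sketch of how that can be checked by reading the PE header's machine field is shown below. The `fake_pe` helper is hypothetical and exists only to make the example self-contained. The check says nothing about whether the program actually uses CUDA or PhysX, only whether it is a 32-bit or 64-bit binary.

```python
import io
import struct

# Machine values from the PE/COFF file header
MACHINE_TYPES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x64)"}


def pe_bitness(f) -> str:
    """Return the bitness of a Windows executable opened in binary mode."""
    dos_header = f.read(64)
    if len(dos_header) < 64 or dos_header[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The 4-byte field at offset 0x3C points to the "PE" signature
    (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
    f.seek(pe_offset)
    nt_start = f.read(6)
    if len(nt_start) < 6 or nt_start[:4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte machine field follows the signature directly
    (machine,) = struct.unpack_from("<H", nt_start, 4)
    return MACHINE_TYPES.get(machine, f"other (machine 0x{machine:04X})")


def fake_pe(machine: int) -> io.BytesIO:
    """Hypothetical helper: build a minimal in-memory header for demonstration only."""
    dos_header = b"MZ" + bytes(58) + struct.pack("<I", 64)
    return io.BytesIO(dos_header + b"PE\x00\x00" + struct.pack("<H", machine))


print(pe_bitness(fake_pe(0x014C)))  # 32-bit (x86)
print(pe_bitness(fake_pe(0x8664)))  # 64-bit (x64)
```

For a real file, the same function can be called on a handle opened with `open(path, "rb")`, where `path` points at the game's executable.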

NVIDIA GeForce RTX 5070 Ti Allegedly Scores 16.6% Improvement Over RTX 4070 Ti SUPER in Synthetic Benchmarks

AleksandarK

Feb 17th, 2025 04:43 Discuss (58 Comments)

Thanks to some early 3DMark benchmarks obtained by VideoCardz, NVIDIA's upcoming GeForce RTX 5070 Ti GPU paints an interesting picture of performance gains over the predecessor. Testing conducted with AMD's Ryzen 7 9800X3D processor and 48 GB of DDR5-6000 memory has provided the first glimpse into the card's capabilities. The new GPU demonstrates a 16.6% performance improvement over its predecessor, the RTX 4070 Ti SUPER. However, benchmark data shows it is falling short of the more expensive RTX 5080 by 13.2%, raising questions about the price-to-performance ratio given the $250 price difference between the two cards. Priced at $749 MSRP, the RTX 5070 Ti could be even pricier in retail channels at launch, especially with limited availability. The card's positioning becomes particularly interesting compared to the RTX 5080's $999 price point, which commands a 33% premium for its additional performance capabilities.
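The price comparison above can be checked with a few lines of arithmetic. The sketch below uses the MSRPs from the article; reading "falling short by 13.2%" as the RTX 5070 Ti scoring 86.8% of the RTX 5080 is an assumption about how the figure was derived.

```python
def premium_percent(higher: float, lower: float) -> float:
    """Percentage by which `higher` exceeds `lower`."""
    return (higher - lower) / lower * 100


RTX_5070_TI_MSRP = 749
RTX_5080_MSRP = 999

print(RTX_5080_MSRP - RTX_5070_TI_MSRP)                              # 250
print(round(premium_percent(RTX_5080_MSRP, RTX_5070_TI_MSRP), 1))    # 33.4

# Assumption: the RTX 5070 Ti score is 86.8% of the RTX 5080 score.
relative_score_5070_ti = 1 - 0.132
score_per_dollar_ratio = (relative_score_5070_ti / RTX_5070_TI_MSRP) / (1.0 / RTX_5080_MSRP)
print(round(score_per_dollar_ratio, 2))  # benchmark score per dollar, 5070 Ti relative to 5080
```

Under that reading, the cheaper card delivers more synthetic benchmark score per dollar at MSRP, though retail pricing at launch may change the picture.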

As a reminder, the RTX 5070 Ti boasts 8,960 CUDA cores, 280 texture units, 70 RT cores for ray tracing, and 280 tensor cores for AI computations, all supported by 16 GB of GDDR7 memory running at 28 Gbps effective speed across a 256-bit bus interface, resulting in 896 GB/s of bandwidth. We have to wait for proper reviews for the final performance conclusion, as synthetic benchmarks tell only part of the story. Modern gaming demands consideration of advanced features such as ray tracing and upscaling technologies, which can significantly impact real-world performance. The true test will come from comprehensive gaming benchmarks across a variety of scenarios. The gaming community won't have to wait long for detailed analysis, as official reviews will reportedly be released in just a few days. Additional evaluations of non-MSRP versions should follow on February 20, the card's launch date.

NVIDIA GeForce RTX 5070 Ti Edges Out RTX 4080 in OpenCL Benchmark

GGforever

Feb 13th, 2025 13:03 Discuss (17 Comments)

A recently surfaced Geekbench OpenCL listing has revealed the performance improvements that the GeForce RTX 5070 Ti is likely to bring to the table, and the numbers sure look promising - that is, coming from the disappointment of the GeForce RTX 5080, which manages roughly 260,000 points in the benchmark, portraying a paltry 8% improvement over its predecessor. The GeForce RTX 5070 Ti, however, managed an impressive 248,000 points, putting it a substantial 20% ahead of the GeForce RTX 4070 Ti. Hilariously enough, the RTX 5080 is merely 4% ahead, making the situation even worse for the somewhat contentious GPU. NVIDIA has claimed similar performance improvements in its marketing material, which does seem quite plausible.

Of course, an OpenCL benchmark is hardly representative of real-world gaming performance. That being said, there is no denying that raw benchmarks will certainly help buyers temper expectations and make decisions. Previous leaks and speculations have hinted at a roughly 10% improvement over its predecessor in raster performance and up to 15% improvements in ray tracing performance, although the OpenCL listing does indicate the RTX 5070 Ti might be capable of a larger generational jump, neck-and-neck with NVIDIA's claims. For those in need of a refresher, the RTX 5070 Ti boasts 8960 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. Like its siblings, the RTX 5070 Ti is also rumored to face "extremely limited" supply at launch. With its official launch less than a week away, we won't have much waiting to do to find out for ourselves.

NVIDIA RTX 5080 Laptop Defeats Predecessor By 19% in Time Spy Benchmark

GGforever

Feb 7th, 2025 12:06 Discuss (10 Comments)

The NVIDIA RTX 50-series witnessed quite a contentious launch, to say the least. Hindered by abysmal availability, controversial generational improvement, and wacky marketing tactics by Team Green, it would be safe to say a lot of passionate gamers were left utterly disappointed. That said, while the desktop cards have been the talk of the town as of late, the RTX 50 Laptop counterparts are yet to make headlines. Occasional leaks do appear on the interwebs, the latest one of which seems to indicate the 3DMark Time Spy performance for the RTX 5080 Laptop GPU. And the results are - well, debatable.

We do know that the RTX 5080 Laptop GPU will feature 7680 CUDA cores, a shockingly modest increase over its predecessor. Considering that we did not get a node shrink this time around, the architectural improvements appear to be rather minimal, going by the tests conducted so far. Of course, the biggest boost in performance will likely be afforded by GDDR7 memory, utilizing a 256-bit bus, compared to its predecessor's GDDR6 memory on a 192-bit bus. In 3DMark's Time Spy DX12 test, which is somewhat of an outdated benchmark, the RTX 5080 Laptop managed roughly 21,900 points. The RTX 4080 Laptop, on average, rakes in around 18,200 points, putting the RTX 5080 Laptop ahead by almost 19%. The RTX 4090 Laptop is also left behind, by around 5%.

NVIDIA GB202 "Blackwell" Die Exposed, Shows the Massive 24,576 CUDA Core Configuration

AleksandarK

Jan 27th, 2025 06:36 Discuss (28 Comments)

A die-shot of NVIDIA's GB202, the silicon powering the RTX 5090, has surfaced online, providing detailed insights into the "Blackwell" architecture's physical layout. The annotated images, shared by hardware analyst Kurnal and provided by ASUS China general manager Tony Yu, compare the GB202 to its AD102 predecessor and outline key architectural components. The die's central region houses 128 MB of L2 cache (96 MB enabled on RTX 5090), surrounded by memory interfaces. Eight 64-bit memory controllers support the 512-bit GDDR7 interface, with physical interfaces positioned along the top, left, and right edges of the die. Twelve graphics processing clusters (GPCs) surround the central cache. Each GPC contains eight texture processing clusters (TPCs), with each GPC housing 16 streaming multiprocessors (SMs). The complete die configuration enables 24,576 CUDA cores, arranged as 128 cores per SM across 192 SMs. With RTX 5090 offering "only" 21,760 CUDA cores, this means that the full GB202 die is reserved for workstation GPUs.

The SM design includes four slices sharing 128 KB of L1 cache and four texture mapping units (TMUs). Individual SM slices contain dedicated register files, L0 instruction caches, warp schedulers, load-store units, and special function units. Central to the die's layout is a vertical strip containing the media processing components—NVENC and NVDEC units—running from top to bottom. The RTX 5090 implementation enables three of four available NVENC encoders and two of four NVDEC decoders. The die includes twelve raster engine/3D FF blocks for geometry processing. At the bottom edge sits the PCIe 5.0 x16 interface and display controller components. Despite its substantial size, the GB202 remains smaller than NVIDIA's previous GH100 and GV100 dies, which exceeded 814 mm². Each SM integrates specialized hardware, including new 5th-generation Tensor cores and 4th-generation RT cores, contributing to the die's total of 192 RT cores, 768 Tensor cores, and 768 texture units.
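The unit counts in the die description follow from simple multiplication. A short sketch of that arithmetic, using only figures quoted in the article (the variable names are illustrative):

```python
GPCS = 12                # graphics processing clusters on the full die
TPCS_PER_GPC = 8         # texture processing clusters per GPC
SMS_PER_GPC = 16         # streaming multiprocessors per GPC
CUDA_CORES_PER_SM = 128

total_sms = GPCS * SMS_PER_GPC
full_die_cores = total_sms * CUDA_CORES_PER_SM
sms_per_tpc = SMS_PER_GPC // TPCS_PER_GPC

# RTX 5090: 21,760 CUDA cores enabled out of the full die
rtx_5090_sms = 21_760 // CUDA_CORES_PER_SM

# Per-SM counts implied by the die totals (768 Tensor cores, 768 TMUs, 192 RT cores)
tensor_cores_per_sm = 768 // total_sms
tmus_per_sm = 768 // total_sms
rt_cores_per_sm = 192 // total_sms

print(total_sms, full_die_cores)   # 192 24576
print(sms_per_tpc)                 # 2
print(rtx_5090_sms)                # 170
print(tensor_cores_per_sm, tmus_per_sm, rt_cores_per_sm)  # 4 4 1
```

The RTX 5090 therefore runs with 170 of the 192 SMs enabled, which matches the four-TMU-per-SM layout described above.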

NVIDIA Likely Sending Maxwell, Pascal & Volta Architectures to CUDA Legacy Branch

T0@st

Jan 24th, 2025 11:49 Discuss (49 Comments)

Team Green's CUDA 12.8 release notes have revealed upcoming changes for three older GPU architectures—the document's "Deprecated and Dropped Features" section outlines forthcoming changes. A brief sentence outlines a less active future for affected families: "architecture support for Maxwell, Pascal, and Volta is considered feature-complete and will be frozen in an upcoming release." Further down, NVIDIA states that a small selection of operating systems have been dropped from support lists, including Microsoft Windows 10 21H2 and Debian 11.

Refocusing on matters of hardware—Michael Larabel, Phoronix's editor-in-chief, has kindly provided a bit of history and context. "Four years ago with the NVIDIA 470 series was the legacy branch for GeForce GTX 600 and 700 Kepler series and now as we embark on the NVIDIA 570 driver series, it looks like it could end up being the legacy branch for Maxwell, Pascal, and Volta generations of GPUs." Larabel and other industry watchdogs reckon that the incoming "Blackwell" generation is taking priority, with Team Green likely freeing up resources and concentrating less on taking care of decade+ old hardware. VideoCardz believes that gaming GPU support will continue—at least for Maxwell (e.g. GeForce GTX 900) and Pascal (GeForce GTX 10 series)—based on a playtesting of the toolkit's latest set of integrated drivers (version 571.96).

First NVIDIA GeForce RTX 5090 GPU with 32 GB GDDR7 Memory Leaks Ahead of CES Keynote

AleksandarK

Jan 6th, 2025 04:07 Discuss (82 Comments)

NVIDIA's unannounced GeForce RTX 5090 graphics card has leaked, confirming key specifications of the next-generation GPU. Thanks to exclusive information from VideoCardz, we can see the packaging of Inno3D's RTX 5090 iChill X3 model, which confirms that the graphics card will feature 32 GB of GDDR7 memory. The leaked materials show that Inno3D's variant will use a 3.5-slot cooling system, suggesting significant cooling requirements for the flagship card. According to earlier leaks, the RTX 5090 will be based on the GB202 GPU and include 21,760 CUDA cores. The card's memory system is a significant upgrade, with its 32 GB of GDDR7 memory running on a 512-bit memory bus at 28 Gbps, capable of delivering nearly 1.8 TB/s of bandwidth. This represents twice the memory capacity of the upcoming RTX 5080, which is expected to ship with 16 GB capacity but 30 Gbps GDDR7 modules.

Power consumption has increased significantly, with the RTX 5090's TDP rated at 575 W and TGP of 600 W, marking a 125-watt increase over the previous RTX 4090 in raw TDP. NVIDIA is scheduled to hold its CES keynote today at 6:30 pm PT, where the company is expected to announce several new graphics cards officially. The lineup should include the RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, and an RTX 5090D model specifically for the Chinese market. Early indications are that the RTX 5080 will be the first card to reach consumers, with a planned release date of January 21st. Release dates for other models, including the flagship RTX 5090, have not yet been confirmed. The RTX 5090 is currently the only card in the RTX 50 series planned to use the GB202 GPU. Pricing information and additional specifications are expected to be revealed during the upcoming announcement.

Nintendo Switch 2 PCB Leak Reveals an NVIDIA Tegra T239 Chip Optically Shrunk to 5nm

btarunr

Jan 2nd, 2025 03:51 Discuss (54 Comments)

Nintendo Switch 2 promises to be this year's big (well, small) gaming platform launch. It goes up against a growing ecosystem of handhelds based on x86-64 mobile processors running Windows. Its main play would have to be offering a similar or better gameplay experience with better battery life, given that all of its hardware is purpose-built for a handheld console and runs a highly optimized software stack. The SoC forms a big part of this. Nintendo turned to NVIDIA for the job, given its graphics IP leadership and its ability to integrate it with Arm CPU IP in a semi-custom chip. Someone with access to a Switch 2 prototype, likely an ISV, took the device apart, revealing the chip, a die-shrunk version of the Tegra T239 from 2023.

It's important to note that prototype consoles physically appear nothing like the final product. They are designed so that ISVs and game developers can validate them and, together with PC-based "official" emulation, set up the ability to develop or port games to the new platform. The Switch 2 looks very similar to the original Switch: it is a large tablet-like device with detachable controllers. The largest chip on the mainboard is the NVIDIA Tegra T239. Nintendo Prime shared more details about the chip.

NVIDIA Plans GeForce RTX 5080 "Blackwell" Availability on January 21, Right After CES Announcement

AleksandarK

Jan 2nd, 2025 03:44 Discuss (28 Comments)

A report from Hong Kong tech media outlet HKEPC indicates that NVIDIA's GeForce RTX 5080 graphics card will launch on January 21, 2025. The release follows a planned announcement event on January 6, where CEO Jensen Huang will present the new "Blackwell" architecture. Anticipated specifications based on prior rumors point to the RTX 5080 using the GB203-400-A1 chip, containing 10,752 CUDA cores across 84 SMs. The card maintains 16 GB of memory but upgrades to GDDR7 technology running at 30 Gbps, while other cards in the series are expected to use 28 Gbps memory. The graphics card is manufactured using TSMC's 4NP 4 nm node. This improvement in manufacturing technology, combined with architectural changes, accounts for most of the expected performance gains, as the raw CUDA core count only increased by 10% over the RTX 4080. NVIDIA is also introducing larger segmentation between its Blackwell SKUs, as the RTX 5090 has nearly double the CUDA cores and double the GDDR7 memory capacity.

NVIDIA is organizing a GeForce LAN event two days before the announcement, marking the return of this gathering after 13 years, so the timing is interesting. NVIDIA wants to capture gamers' hearts with 50 hours of non-stop gameplay. Meanwhile, AMD currently has no competing products announced in the high-end graphics segment, leaving NVIDIA without direct competition in this performance tier. This market situation could affect the final pricing of the RTX 5080, which will be revealed during the January keynote. While the January 21 date appears set for the RTX 5080, launch dates for other cards in the Blackwell family, including the RTX 5090 and RTX 5070 series, remain unconfirmed. NVIDIA typically releases different models in its GPU families on separate dates to manage production and distribution effectively.

NVIDIA GeForce RTX 5070 and RTX 5070 Ti Final Specifications Seemingly Confirmed

AleksandarK

Dec 25th, 2024 03:15 Discuss (153 Comments)

Thanks to kopite7kimi, we are able to finalize the leaked specifications of NVIDIA's upcoming GeForce RTX 5070 and RTX 5070 Ti graphics cards. Starting off with the RTX 5070 Ti, it will feature 8,960 CUDA cores and come equipped with 16 GB of GDDR7 memory on a 256-bit memory bus, offering 896 GB/s of bandwidth. The card is reportedly designed with a total board power (TBP) of 300 W. The Ti variant appears to use the PG147-SKU60 board design with a GB203-300-A1 GPU. The standard RTX 5070 is positioned as a more power-efficient option, with specifications pointing to 6,144 CUDA cores and 12 GB of GDDR7 memory on a 192-bit bus, with 672 GB/s of memory bandwidth. This model is expected to operate at a slightly lower 250 W TBP.

Interestingly, the non-Ti RTX 5070 card will be available in two board variants, PG146 and PG147, both utilizing the GB205-300-A1 GPU. While we don't know what the pricing structure looks like, we see that NVIDIA has chosen to differentiate its SKUs more sharply. The Ti variant not only gets an extra four GB of GDDR7 memory, but it also gets a whopping 46% increase in CUDA core count, going from 6,144 to 8,960 cores. While we wait for CES to see the initial wave of GeForce RTX 50 series cards, the GeForce RTX 5070 and RTX 5070 Ti are expected to arrive later, possibly after the RTX 5080 and RTX 5090 GPUs.
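To put the gap between the two SKUs in numbers, the sketch below compares the leaked specifications quoted above. The `LeakedSpec` class and `percent_more` helper are illustrative names, and the values are the leaked figures from the article, not confirmed specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakedSpec:
    name: str
    cuda_cores: int
    memory_gb: int
    bus_width_bits: int
    tbp_watts: int


def percent_more(a: float, b: float) -> float:
    """Percentage by which `a` exceeds `b`."""
    return (a / b - 1) * 100


rtx_5070_ti = LeakedSpec("RTX 5070 Ti", 8960, 16, 256, 300)
rtx_5070 = LeakedSpec("RTX 5070", 6144, 12, 192, 250)

for field in ("cuda_cores", "memory_gb", "bus_width_bits", "tbp_watts"):
    ti_value = getattr(rtx_5070_ti, field)
    base_value = getattr(rtx_5070, field)
    print(f"{field}: +{percent_more(ti_value, base_value):.1f}%")
```

The core count gap (about 45.8%) is noticeably larger than the gaps in memory capacity, bus width, or board power.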

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

AleksandarK

Dec 23rd, 2024 08:02 Discuss (33 Comments)

The battle of AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the gap software-wise with its competitor, NVIDIA. According to its latest report, SemiAnalysis, a research and consultancy firm, ran a five-month experiment using the Instinct MI300X for training and benchmark runs. And the findings were surprising: even with better hardware, AMD's software stack, including ROCm, has massively degraded AMD's performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

Acer Leaks GeForce RTX 5090 and RTX 5080 GPU, Memory Sizes Confirmed

GFreeman

Dec 18th, 2024 08:20 Discuss (28 Comments)

Acer has jumped the gun and listed its ACER Predator Orion 7000 systems with the upcoming NVIDIA RTX 50 series graphics cards, namely the GeForce RTX 5080 and the GeForce RTX 5090. In addition, the listing confirms that the GeForce RTX 5080 will come with 16 GB of GDDR7 memory, while the GeForce RTX 5090 will get 32 GB of GDDR7 memory.

The ACER Predator Orion 7000 gaming PC was announced back in September, together with Intel's Core Ultra 200 series, and it does not come as a surprise that this high-end pre-built system will now be getting NVIDIA's new GeForce RTX 50 series graphics cards. In case you missed previous rumors, the GeForce RTX 5080 is expected to use the GB203-400 GPU with 10,752 CUDA cores, and come with 16 GB of GDDR7 memory on a 256-bit memory interface. The GeForce RTX 5090, on the other hand, gets the GB202-300 GPU with 21,760 CUDA cores and packs 32 GB of GDDR7 memory.

NVIDIA GeForce RTX 5070 Ti Leak Tips More VRAM, Cores, and Power Draw

Updated by

Cpt.Jank

Dec 14th, 2024 17:29 Updated: Dec 16th, 2024 06:35 Discuss (161 Comments)

It's an open secret by now that NVIDIA's GeForce RTX 5000 series GPUs are on the way, with an early 2025 launch on the cards. Now, preliminary details about the RTX 5070 Ti have leaked, revealing an increase in both VRAM and TDP and suggesting that the new upper mid-range GPU will finally address the increased VRAM demand from modern games. According to the leak from Wccftech, the RTX 5070 Ti will have 16 GB of GDDR7 VRAM, up from 12 GB on the RTX 4070 Ti, as we previously speculated. In line with previous leaks, the new sources indicate that the 5070 Ti will use the cut-down GB203 chip, although the new leak points to a significantly higher TBP of 350 W. The new memory configuration will supposedly use a 256-bit memory bus and run at 28 Gbps for a total memory bandwidth of 896 GB/s, which is a significant boost over the RTX 4070 Ti.

Supposedly, the RTX 5070 Ti will also see a bump in total CUDA cores, from 7680 in the RTX 4070 Ti to 8960 in the RTX 5070 Ti. The new RTX 5070 Ti will also switch to the 12V-2x6 power connector, compared to the 16-pin connector from the 4070 Ti. NVIDIA is expected to announce the RTX 5000 series graphics cards at CES 2025 in early January, but the RTX 5070 Ti will supposedly be the third card in the 5000-series launch cycle. That said, leaks suggest that the 5070 Ti will still launch in Q1 2025, meaning we may see an indication of specs at CES 2025, although pricing is still unclear.

Update Dec 16th: Kopite7kimi, ubiquitous hardware leaker, has since responded to the RTX 5070 Ti leaks, stating that 350 W may be on the higher end for the RTX 5070 Ti: "...the latest data shows 285W. However, 350W is also one of the configs." This could mean that a TBP of 350 W is possible, although maybe only on certain graphics card models, if competition is strong, or in certain boost scenarios.

Nintendo Switch Successor: Backward Compatibility Confirmed for 2025 Launch

AleksandarK

Nov 6th, 2024 08:00 Discuss (24 Comments)

Nintendo has officially announced that its next-generation Switch console will feature backward compatibility, allowing players to use their existing game libraries on the new system. However, those eagerly awaiting the console's release may need to exercise patience as launch expectations have shifted to early 2025. On the official X account, Nintendo has announced: "At today's Corporate Management Policy Briefing, we announced that Nintendo Switch software will also be playable on the successor to Nintendo Switch. Nintendo Switch Online will be available on the successor to Nintendo Switch as well. Further information about the successor to Nintendo Switch, including its compatibility with Nintendo Switch, will be announced at a later date."

While the original Switch evolved from a 20 nm Tegra X1 to a more power-efficient 16 nm Tegra X1+ SoC (both featuring four Cortex-A57 and four Cortex-A53 cores with GM20B Maxwell GPUs), the Switch 2 is rumored to utilize a customized variant of NVIDIA's Jetson Orin SoC, now codenamed T239. The new chip represents a significant upgrade with its eight Cortex-A78AE cores, LPDDR5 memory, and Ampere GPU architecture with 1,536 CUDA cores, promising enhanced battery efficiency and DLSS capabilities for the handheld gaming market. With the holiday 2024 release window now seemingly off the table, the new console is anticipated to debut in the first half of 2025, marking nearly eight years since the original Switch's launch.

NVIDIA Releases GeForce 565.90 WHQL Game Ready Driver

GFreeman

Oct 1st, 2024 09:25 Discuss (9 Comments)

NVIDIA has released its latest GeForce graphics drivers, the GeForce 565.90 WHQL Game Ready drivers. As a new Game Ready driver, it provides optimizations and support, including NVIDIA DLSS 3, for new games including THRONE AND LIBERTY, MechWarrior 5: Clans, and Starship Troopers: Extermination. The new drivers also add support for CUDA 12.7 and enable RTX HDR multi-monitor support within the latest NVIDIA App beta update.

NVIDIA also fixed several issues, including texture flickering with Final Fantasy XV and a frozen white screen and crash with Dying Light 2 Stay Human. When it comes to general bugs, the new drivers fix corruption with Steam Link streaming when MSAA is globally enabled, as well as a slight monitor backlight flicker when FPS drops below 60.

DOWNLOAD: NVIDIA GeForce 565.90 WHQL Game Ready

Advantech Launches AIR-310, Ultra-Low-Profile Scalable AI Inference

Press Release by

Nomad76

Sep 26th, 2024 16:15 Discuss (0 Comments)

Advantech, a leading provider of edge computing solutions, introduces the AIR-310, a compact edge AI inference system featuring an MXM GPU card. Powered by 12th/13th/14th Gen Intel Core 65 W desktop processors, the AIR-310 delivers up to 12.99 TFLOPS of scalable AI performance via the NVIDIA Quadro 2000A GPU card in a 1.5U chassis (215 x 225 x 55 mm). Despite its compact size, it offers versatile connectivity with three LAN ports and four USB 3.0 ports, enabling seamless integration of sensors and cameras for vision AI applications.

The system includes smart fan management, operates in temperatures from 0 to 50°C (32 to 122°F), and is shock-resistant, capable of withstanding 3G vibration and 30G shock. Bundled with Intel Arc A370 and NVIDIA A2000 GPUs, it is certified to IEC 61000-6-2, IEC 61000-6-4, and CB/UL standards, ensuring stable 24/7 operation in harsh environments, including space-constrained or mobile equipment. The AIR-310 supports Windows 11, Linux Ubuntu 24.04, and the Edge AI SDK, enabling accelerated inference deployment for applications such as factory inspections, real-time video surveillance, GenAI/LLM, and medical imaging.

NVIDIA GeForce RTX 5090 and RTX 5080 Specifications Surface, Showing Larger SKU Segmentation

AleksandarK

Sep 26th, 2024 13:38 Discuss (185 Comments)

Thanks to the renowned NVIDIA hardware leaker kopite7kimi on X, we are getting information about the final versions of NVIDIA's first upcoming wave of GeForce RTX 50 series "Blackwell" graphics cards. The two leaked GPUs are the GeForce RTX 5090 and RTX 5080, which now feature a more significant gap between xx80 and xx90 SKUs. For starters, we have the highest-end GeForce RTX 5090. NVIDIA has decided to use the GB202-300-A1 die and enabled 21,760 FP32 CUDA cores on this top-end model. Accompanying the massive 170 SM GPU configuration, the RTX 5090 has 32 GB of GDDR7 memory on a 512-bit bus, with each GDDR7 die running at 28 Gbps. This translates to 1,792 GB/s of memory bandwidth. All of this is confined to a 600 W TGP.

When it comes to the GeForce RTX 5080, NVIDIA has decided to further separate its xx80 and xx90 SKUs. The RTX 5080 has 10,752 FP32 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. With GDDR7 running at 28 Gbps, the memory bandwidth is also halved, at 896 GB/s. This SKU uses a GB203-400-A1 die, which is designed to run within a 400 W TGP power envelope. For reference, the RTX 4090 has 68% more CUDA cores than the RTX 4080. The rumored RTX 5090 has around 102% more CUDA cores than the rumored RTX 5080, which means that NVIDIA is separating its top SKUs even more. We are curious to see at what price point NVIDIA places its upcoming GPUs so that we can compare generational updates and the widened gap between xx80 and xx90 models.
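
The core-count gap quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper `percent_more` is a hypothetical name, the RTX 40 series core counts (16,384 and 9,728) are NVIDIA's published specifications rather than figures from the leak, and the 128 FP32 cores per SM used for the SM count is an assumption based on recent GeForce architectures:

```python
def percent_more(larger: int, smaller: int) -> float:
    """Return how many percent more units `larger` has than `smaller`."""
    return (larger / smaller - 1) * 100


# Published RTX 40 series CUDA core counts (not part of the leak).
ada_gap = percent_more(16_384, 9_728)

# Leaked, unconfirmed RTX 50 series CUDA core counts.
blackwell_gap = percent_more(21_760, 10_752)

print(f"RTX 4090 vs RTX 4080: {ada_gap:.0f}% more CUDA cores")        # 68%
print(f"RTX 5090 vs RTX 5080: {blackwell_gap:.0f}% more CUDA cores")  # 102%

# Assumption: 128 FP32 CUDA cores per SM.
CORES_PER_SM = 128
print(f"RTX 5090 SM count: {21_760 // CORES_PER_SM}")  # 170
```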

Nintendo Switch 2 Allegedly Not Powered by AMD APU Due to Poor Battery Life

AleksandarK

Sep 24th, 2024 08:23 Discuss (83 Comments)

Nintendo's next-generation Switch 2 handheld gaming console is nearing its release. As leaks about its specifications intensify, we are getting information about its planning stages. According to a Moore's Law is Dead YouTube video, Nintendo didn't choose an AMD APU to power the Switch 2 due to poor battery life. In a bid to secure the best chip at a mere five watts of power, the Japanese company had two choices: NVIDIA Tegra or an AMD APU. In preliminary testing and evaluation, the AMD APU reportedly wasn't power-efficient at a 5 W TDP, while the NVIDIA Tegra chip maintained sufficient battery life and performance at target specifications.

Allegedly, the AMD APU was good for a 15 W design, but Nintendo didn't want to fit a bigger battery, so that the device remains lighter and cheaper. The final design will likely carry a battery with a 20 Wh capacity, which will be the main power source behind the NVIDIA Tegra T239 SoC. As a reminder, the Tegra T239 SoC features an eight-core Arm A78C cluster with modified NVIDIA Ampere cores in combination with DLSS, featuring some of the latest encoding/decoding elements from Ada Lovelace, like AV1. There are likely 1,536 CUDA cores paired with 128-bit LPDDR5 memory delivering 102 GB/s of bandwidth. For final specifications, we have to wait for the official launch, but with rumors starting to intensify, we can expect to see it relatively soon.
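
To put the rumored power targets in perspective, here is a rough, idealized runtime estimate in Python. It is a sketch under loud assumptions: the function `runtime_hours` is hypothetical, only the SoC's power draw is counted (display, memory, storage, and conversion losses are ignored), and the 20 Wh capacity is the rumored figure, so real battery life would be shorter:

```python
def runtime_hours(battery_wh: float, draw_watts: float) -> float:
    """Idealized runtime: battery energy divided by a constant power draw."""
    return battery_wh / draw_watts


BATTERY_WH = 20  # rumored capacity, unconfirmed

# SoC-only draw; everything else in the console is ignored here.
for label, watts in (("5 W target", 5), ("15 W design", 15)):
    print(f"{label}: {runtime_hours(BATTERY_WH, watts):.1f} h")
# prints "5 W target: 4.0 h" then "15 W design: 1.3 h"
```

Under these assumptions, a 15 W chip would drain the same battery three times faster, which is consistent with the reported preference for a 5 W part over a larger battery.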
