News Posts matching #CUDA


NVIDIA GeForce RTX 4070 Variant Could be Refreshed With AD103 GPU

Hardware tipster kopite7kimi has learned from insider sources that a variant of NVIDIA's GeForce RTX 4070 graphics card could be lined up with a different GPU - the larger AD103, instead of the currently utilized AD104-250-A1. The Ada Lovelace architecture is a staple across the RTX 40-series of graphics cards, but a fully unlocked AD103 is not yet attached to any product on the market - it would be a strange move for NVIDIA to refresh or expand the mid-range RTX 4070 lineup with a much larger GPU, albeit in a cut-down form. A trimmed variant of the AD103 is currently housed within NVIDIA's GeForce RTX 4080 graphics card - its AD103-300-A1 GPU has 9,728 CUDA cores, with Team Green's engineers choosing to disable roughly 5% of the full die's shader capacity.

The hardware boffins will need to do a lot of pruning if the larger GPU ends up in the rumored RTX 4070 refresh - the SKU's 5,888 CUDA core count would require a cut of roughly 42% relative to the full AD103's 10,240 cores. It is somewhat curious that the RTX 4070 Ti has not been mentioned by the tipster - you would think that the more powerful card (relative to the standard 4070) would be the logical and immediate candidate for this type of treatment. In theory, NVIDIA could be re-purposing dies that do not meet RTX 4080-level standards, salvaging rejected silicon for step-down card models.

NVIDIA RTX 5000 Ada Generation Workstation GPU Mentioned in Official Driver Documents

NVIDIA's rumored RTX 5000 Ada Generation GPU has been outed once again, according to VideoCardz - the cited source being a keen-eyed member of a laptop discussion forum. Team Green's newly released driver documentation mentions hardware ID "26B2" under an entry for a now-supported device: "NVIDIA RTX 5000 Ada Generation." Forum admin StefanG3D posted the discovery there in the small hours of Sunday morning (April 23).

As reported last month, the NVIDIA RTX 5000 Ada is destined to sit between existing sibling workstation GPUs - the AD102-based RTX 6000 and AD104-based RTX 4000 SFF. Hardware tipster kopite7kimi has learned enough to theorize that the NVIDIA RTX 5000 Ada Generation workstation graphics card will feature 15,360 CUDA cores and 32 GB of GDDR6 memory. The AD102 GPU is expected to sit at the heart of this unannounced card.

Square Enix Unearths Old Crime Puzzler - The Portopia Serial Murder Case, Remaster Features AI Interaction

At the turn of the 1980s, most PC adventure games were played using only the keyboard. In those days, adventure games didn't use action menus like more modern games, but simply presented the player with a command line where they could freely input text to decide the actions that characters would take and proceed through the story. Free text input systems like these allowed players to feel a great deal of freedom. However, they did come with one common source of frustration: players knowing what action they wanted to perform but being unable to do so because they could not find the right wording. This problem was caused by the limitations of PC performance and NLP technology of the time.

40 years have passed since then, and PC performance has drastically improved, as have the capabilities of NLP technology. Using "The Portopia Serial Murder Case" as a test case, we'd like to show you the capabilities of modern NLP and the impact it can have on adventure games, as well as deepen your understanding of NLP technologies.
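For readers who never wrestled with one of these parsers, a minimal sketch of the rigid verb matching described above shows where the frustration came from: anything outside the hard-coded word list is rejected, no matter how obvious the intent. The verbs and responses below are hypothetical, not code from the actual game.

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// A toy command parser in the style of early-1980s adventure games
// (illustrative only; hypothetical verbs and responses).
int main() {
    // Only exact verb spellings are recognized; "inspect", "examine",
    // and "look at" would all fail unless listed here explicitly.
    std::map<std::string, std::string> verbs = {
        {"look", "You look around the harbor."},
        {"ask",  "You question the witness."},
        {"go",   "You head to the next location."},
    };

    std::string line;
    std::cout << "> ";
    while (std::getline(std::cin, line)) {
        std::istringstream iss(line);
        std::string verb;
        iss >> verb;
        auto it = verbs.find(verb);
        if (it != verbs.end())
            std::cout << it->second << "\n";
        else
            // The classic dead end: the player knows what they want to
            // do, but not the exact word the parser expects.
            std::cout << "I don't understand \"" << verb << "\".\n";
        std::cout << "> ";
    }
    return 0;
}
```

A modern NLP layer, as Square Enix describes, would sit in front of this lookup and map free-form phrasings onto the game's fixed set of actions.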

NVIDIA's Tiny RTX 4000 Ada Lovelace Graphics Card is Now Available

NVIDIA has begun selling its compact RTX 4000 Ada Lovelace graphics card, offering GeForce RTX 3070-like performance at a mere 70 W power consumption, allowing it to fit in almost all desktop PCs. The low-profile, dual-slot board is priced higher than the RTX 4080 as it targets professional users, but it can still be used in a regular gaming computer. PNY's RTX 4000 Ada generation graphics card is the first to reach consumer shelves, currently available for $1,444 at ShopBLT, a retailer known for obtaining hardware before its competitors. The card comes with four Mini-DisplayPort connectors, so an additional mDP-DP or mDP-HDMI adapter must be factored into the cost.

The NVIDIA RTX 4000 SFF Ada Generation board features an AD104 GPU with 6,144 CUDA cores, 20 GB of GDDR6 ECC memory, and a 160-bit memory interface. With a fixed boost frequency of around 1560 MHz to keep overall board power down, the GPU is rated for just 70 Watts. To emphasize the efficiency, the card requires no external PCIe power connector; all the juice is fed through the PCIe slot. In this configuration the AD104 graphics processor delivers a peak FP32 throughput of 19.2 TFLOPS (6,144 cores × 2 ops per clock × ~1.56 GHz), comparable to the GeForce RTX 3070. The 20 GB of memory makes the card more valuable for professionals and AI researchers needing compact solutions. Although the card's raw performance is overshadowed by the recently launched GeForce RTX 4070, the RTX 4000 SFF Ada's professional drivers, ISV software certifications, and additional features make it a strong contender in the semi-professional market. Availability and pricing are expected to improve in the coming weeks as the card becomes more widely accessible.


AMD Brings ROCm to Consumer GPUs on Windows OS

AMD has published an exciting development for its Radeon Open Compute Ecosystem (ROCm) users today. ROCm is coming to the Windows operating system, and the company has extended support to consumer graphics cards instead of only professional-grade GPUs. This development milestone is essential for making AMD's GPU family more competitive with NVIDIA and its CUDA-accelerated GPUs. For those unaware, AMD ROCm is a software stack designed for GPU programming. Like NVIDIA's CUDA, ROCm targets AMD GPUs; it was historically limited to Linux-based OSes and to GFX9, CDNA, and professional-grade RDNA GPUs.

However, according to documents obtained by Tom's Hardware (which sit behind a login wall), AMD has brought ROCm support to the Radeon RX 6900 XT, the Radeon RX 6600, and the R9 Fury. What is interesting is not the inclusion of the RX 6900 XT and RX 6600, but the support for the R9 Fury, an eight-year-old graphics card. Curiously, of these three GPUs only the R9 Fury gets full ROCm support, while the RX 6900 XT gets HIP SDK support and the RX 6600 only HIP runtime support. To complicate matters further, the consumer-grade R9 Fury has full ROCm support only on Linux, not Windows. The reason for this odd support matrix has yet to be discovered. Still, it is a step in the right direction, and AMD will need to enable more functionality on Windows and on more consumer GPUs to compete with NVIDIA.
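To give a sense of what HIP SDK support on these cards enables, here is a minimal HIP vector-add sketch, assuming a working ROCm install and the hipcc compiler. Note how closely the code tracks its CUDA equivalent - that similarity is the point of HIP as a CUDA alternative.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

// Minimal HIP kernel: the syntax mirrors CUDA (__global__, blockIdx,
// threadIdx), which is what makes ROCm's HIP layer a near drop-in
// alternative for CUDA-style GPU programming.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));   // same shape as cudaMalloc
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);

    vector_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);

    printf("c[0] = %.1f\n", hc[0]);      // expect 3.0
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```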

Adlink launches portable GPU accelerator with NVIDIA RTX A500

ADLINK Technology Inc., a global leader in edge computing, today launched Pocket AI - the first ever ultra-portable GPU accelerator to offer exceptional power at a cost-effective price point. With hardware and software compatibility, Pocket AI is the perfect tool to boost performance and productivity. It provides plug-and-play scalability from development to deployment for AI developers, professional graphics users and embedded industrial applications.

Pocket AI is a simple, reliable route to impressive GPU acceleration at a fraction of the cost of a laptop with equivalent GPU power. Its many benefits include a perfect power/performance balance from the NVIDIA RTX A500 GPU; high functionality driven by NVIDIA CUDA-X and its accelerated libraries; quick, easy connectivity and power via the Thunderbolt 3 interface and USB PD; and compatibility supported by NVIDIA developer tools. For ultimate portability, the Pocket AI is compact and lightweight, measuring an estimated 106 x 72 x 25 mm and weighing around 250 grams.

NVIDIA Prepares H100 NVL GPUs With More Memory and SLI-Like Capability

NVIDIA has killed SLI on its graphics cards, removing the ability to connect two or more GPUs to harness their combined power for gaming and other workloads. However, SLI is making a comeback of sorts today in the form of a new H100 GPU model that sports higher memory capacity and higher performance. Called the H100 NVL, the GPU is a special edition based on the regular H100 PCIe version. What makes the H100 NVL version special is the boost in memory capacity, up from 80 GB in the standard model to 94 GB in the NVL edition SKU, for a total of 188 GB of HBM3 memory running on a 6144-bit bus per card. Being a special edition SKU, it is sold only in pairs: the two H100 NVL GPUs are connected by three NVLink bridges on top. Installation requires two PCIe slots, separated by dual-slot spacing.

The new H100 NVL also closes the performance gap between the H100 PCIe and H100 SXM versions, as the card features a boosted, configurable TDP of up to 400 Watts per card. The H100 NVL uses the same Tensor and CUDA core configuration as the SXM edition, except that it sits in a PCIe slot and is bridged to its partner card. Being sold in pairs, OEMs can outfit their certified systems with either two or four pairs each. As NVIDIA says, the need for this special edition SKU stems from the emergence of Large Language Models (LLMs) that require significant computational power to run. "Servers equipped with H100 NVL GPUs increase GPT-175B model performance up to 12X over NVIDIA DGX A100 systems while maintaining low latency in power-constrained data center environments," noted the company.
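NVIDIA has not detailed how software addresses the NVL pairing, but since the two cards present as two CUDA devices joined by NVLink, a plausible sketch is ordinary peer-to-peer setup through the standard CUDA runtime API. Nothing below is NVL-specific; these are stock runtime calls, shown under the assumption that the pair enumerates as devices 0 and 1.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

// Generic CUDA peer-to-peer setup between two GPUs (devices 0 and 1).
// On an H100 NVL pair, peer transfers would ride the NVLink bridges
// rather than PCIe; the API itself is the standard runtime API.
int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    printf("peer access 0->1: %d, 1->0: %d\n", can01, can10);

    if (can01 && can10) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // flags must be 0
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        // From here, a kernel on device 0 can dereference device-1
        // pointers directly, and cudaMemcpyPeer moves data GPU-to-GPU
        // without staging through host memory.
    }
    return 0;
}
```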

ASUS Announces NVIDIA-Certified Servers and ProArt Studiobook Pro 16 OLED at GTC

ASUS today announced its participation in NVIDIA GTC, a developer conference for the era of AI and the metaverse. ASUS will offer comprehensive NVIDIA-certified server solutions that support the latest NVIDIA L4 Tensor Core GPU—which accelerates real-time video AI and generative AI—as well as the NVIDIA BlueField-3 DPU, igniting unprecedented innovation for supercomputing infrastructure. ASUS will also launch the new ProArt Studiobook Pro 16 OLED laptop with the NVIDIA RTX 3000 Ada Generation Laptop GPU for mobile creative professionals.

Purpose-built GPU servers for generative AI
Generative AI applications enable businesses to develop better products and services, and deliver original content tailored to the unique needs of customers and audiences. ASUS ESC8000 and ESC4000 are fully certified NVIDIA servers that support up to eight NVIDIA L4 Tensor Core GPUs, which deliver universal acceleration and energy efficiency for AI with up to 2.7X more generative AI performance than the previous GPU generation. ASUS ESC and RS series servers are engineered for HPC workloads, with support for the NVIDIA BlueField-3 DPU to transform data center infrastructure, as well as NVIDIA AI Enterprise applications for streamlined AI workflows and deployment.

NVIDIA Redefines Workstations to Power New Era of AI, Design, Industrial Metaverse

NVIDIA today announced six new NVIDIA RTX Ada Lovelace architecture GPUs for laptops and desktops, which enable creators, engineers and data scientists to meet the demands of the new era of AI, design and the metaverse. Using the new NVIDIA RTX GPUs with NVIDIA Omniverse, a platform for building and operating metaverse applications, designers can simulate a concept before making it a reality, planners can visualize an entire factory before it is built and engineers can evaluate their designs in real time.

The NVIDIA RTX 5000, RTX 4000, RTX 3500, RTX 3000 and RTX 2000 Ada Generation laptop GPUs deliver breakthrough performance and up to 2x the efficiency of the previous generation to tackle the most demanding workflows. For the desktop, the NVIDIA RTX 4000 Small Form Factor (SFF) Ada Generation GPU features new RT Cores, Tensor Cores and CUDA cores with 20 GB of graphics memory to deliver incredible performance in a compact card.

NVIDIA Announces Microsoft, Tencent, Baidu Adopting CV-CUDA for Computer Vision AI

Microsoft, Tencent and Baidu are adopting NVIDIA CV-CUDA for computer vision AI. NVIDIA CEO Jensen Huang highlighted work in content understanding, visual search and deep learning Tuesday as he announced the beta release for NVIDIA's CV-CUDA—an open-source, GPU-accelerated library for computer vision at cloud scale. "Eighty percent of internet traffic is video, user-generated video content is driving significant growth and consuming massive amounts of power," said Huang in his keynote at NVIDIA's GTC technology conference. "We should accelerate all video processing and reclaim the power."

CV-CUDA promises to help companies across the world build and scale end-to-end, AI-based computer vision and image processing pipelines on GPUs. The majority of internet traffic is video and image data, driving incredible scale in applications such as content creation, visual search and recommendation, and mapping. These applications use a specialized, recurring set of computer vision and image-processing algorithms to process image and video data before and after they're processed by neural networks.
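CV-CUDA's actual API is not shown in the announcement, so as a stand-in, here is a plain CUDA sketch of one such recurring pre-processing step: converting interleaved 8-bit RGB into the normalized planar float layout a neural network expects. The kernel name and parameters are illustrative only; this shows the class of per-pixel work such libraries batch and fuse at cloud scale, not CV-CUDA's interface.

```cpp
#include <cuda_runtime.h>

// Illustrative per-pixel preprocessing kernel: convert 8-bit HWC RGB
// to normalized float CHW, a typical step before vision inference.
// Launch with a 2D grid, e.g.:
//   dim3 block(16, 16);
//   dim3 grid((width + 15) / 16, (height + 15) / 16);
__global__ void preprocess_hwc_to_chw(const unsigned char* src, float* dst,
                                      int width, int height,
                                      float mean, float inv_std) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int pixel = y * width + x;
    for (int c = 0; c < 3; ++c) {
        float v = src[pixel * 3 + c] / 255.0f;                   // HWC, interleaved
        dst[c * width * height + pixel] = (v - mean) * inv_std;  // CHW, planar
    }
}
```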

Alleged NVIDIA AD106 GPU Tested in 3DMark and AIDA64

Benchmarks and specifications of an alleged NVIDIA AD106 GPU have turned up on Chiphell, although the original poster has since removed all the details. Thanks to @harukaze5719 on Twitter, who saved and re-posted them, we still get an insight into what we might expect from NVIDIA's upcoming mid-range cards. All these details should be taken with a grain of salt, as the original source isn't exactly what we'd call trustworthy. Based on the data in the TPU GPU database, the GPU in question should be the GeForce RTX 4070 Mobile with much higher clock speeds, or an equivalent desktop part offering more CUDA cores than the RTX 4060 Ti. Whatever the specific AD106 GPU is, it's being compared to the GeForce RTX 2080 Super and the RTX 3070 Ti.

The GPU was tested in AIDA64 and 3DMark, and it beats the RTX 2080 Super in all of the tests while drawing some 55 W less power. Some of the wins are within the margin of testing error, for example the memory performance in AIDA64. That result is noteworthy because the AD106 GPU has only a 128-bit memory bus, half the width of the RTX 2080 Super's 256-bit bus; even with the AD106's much higher memory clocks, the RTX 2080 Super retains nearly 36 percent more overall memory bandwidth. Yet the AD106 GPU manages to beat it in all of the AIDA64 memory benchmarks.

PNY GeForce RTX 4070 Ti Specifications Leak, Similar to the Canceled RTX 4080 12 GB Edition

VideoCardz has obtained images and specifications of PNY's two upcoming GeForce RTX 4070 Ti models. According to the latest leak, these GPUs are equipped with 7,680 CUDA cores and 12 GB of GDDR6X memory. This configuration resembles the canceled GeForce RTX 4080 12 GB edition card, which suggests that NVIDIA will rebrand it under the RTX 4070 Ti naming scheme. PNY has prepared GeForce RTX 4070 Ti XLR8 VERTO and VERTO models, which differ in cooler design and factory overclocking. The XLR8 version carries the same cooler as the RTX 4080 XLR8 card, adapted for the RTX 4070 Ti GPU SKU. This design should naturally offer greater overclocking headroom than the regular VERTO SKU.

The leaked render suggests that the 16-pin 12VHPWR power connector remains on these cards, and that PNY has not swapped it for another solution. We expect to hear more about these cards on January 5, when NVIDIA plans to launch them.

NVIDIA GeForce RTX 4060 Ti to Feature Shorter PCB, 220 Watt TDP, and 16-Pin 12VHPWR Power Connector

While NVIDIA has launched the high-end GeForce RTX 4090 and RTX 4080 GPUs from its Ada Lovelace family, middle and lower-end products are brewing to satisfy the rest of the consumer market. Today, according to kopite7kimi, a well-known leaker, we have potential information about the configuration of the upcoming GeForce RTX 4060 Ti graphics card. Featuring 4,352 FP32 CUDA cores, the GPU is powered by an AD106-350-A1 die carrying 32 MB of L2 cache. It is paired with 8 GB of 18 Gbps GDDR6 memory, which should be enough for gaming at the 1440p resolution this card is aiming for.

The card's reference PG190 PCB is supposedly very short, making it ideal for the ITX-sized designs we could see from NVIDIA's AIB partners. Interestingly, despite a TDP of just 220 Watts, the reference card is powered by the infamous 16-pin 12VHPWR connector, capable of supplying 600 Watts. The reasoning behind this choice is unclear; however, it could be NVIDIA's push to standardize the connector across the entire Ada Lovelace product stack. While the card should not need the connector's full potential, it signals that the company may be moving to this connector type for all of its future designs.

NVIDIA GeForce RTX 4090 16 GB Laptop SKU Spotted in Next-Gen HP Omen 17 Laptop

According to the well-known hardware leaker @momomo_us, HP is preparing the launch of its next-generation Omen 17 gaming laptops. And with a new generation of chips coming to consumers, HP accidentally made some information about laptop SKUs public. Four models are listed, and they represent a combination of Intel's 13th-generation Raptor Lake mobile processors with NVIDIA's Ada Lovelace RTX 40 series graphics cards for the mobile/laptop sector. The four SKUs are: CM2007NQ/CM2005NQ with Core i7-13700HX & RTX 4060 8 GB; CM2001NQ with Core i7-13700HX & RTX 4070 8 GB; CK2007NQ/CK2004NQ with Core i7-13700HX & RTX 4080 12 GB; CK2001NQ with Core i7-13700HX & RTX 4090 16 GB.

The most exciting find here is the appearance of the xx90 series in the mobile/laptop form factor, which has not been the case before. The GeForce RTX 4090 laptop edition is supposedly equipped with 16 GB of VRAM, and the GPU SKU should be a cut-down version of AD102 GPU adjusted for power and clock constraints so it can run within a reasonable TDP. With NVIDIA seemingly giving its clients an RTX 4090 SKU option, we have to wait and see what the CUDA core counts are and how clocks scale in a more restricted laptop environment.

NVIDIA Could Launch Hopper H100 PCIe GPU with 120 GB Memory

NVIDIA's high-performance computing hardware stack is now topped by the Hopper H100 GPU. It features 16,896 or 14,592 CUDA cores, depending on whether it comes in the SXM5 or PCIe variant, with the former being more powerful. Both variants come with a 5120-bit memory interface, with the SXM5 version using HBM3 memory running at 3.0 Gbps and the PCIe version using HBM2E memory running at 2.0 Gbps. Both versions carry the same capacity, capped at 80 GB. However, that could soon change, with the latest rumor suggesting that NVIDIA could be preparing a PCIe version of the Hopper H100 GPU with 120 GB of an unspecified type of memory installed.

According to the Chinese website "s-ss.cc", the 120 GB variant of the H100 PCIe card will feature a fully unlocked GH100 chip. As the site suggests, this version will improve both memory capacity and performance over the regular H100 PCIe SKU. With HPC workloads increasing in size and complexity, larger memory allocations are needed for better performance. Recent advances in Large Language Models (LLMs) have AI workloads using hundreds of billions to trillions of parameters for training, most of which is done on GPUs like the NVIDIA H100. For reference, a 175-billion-parameter model stored in FP16 needs roughly 350 GB for its weights alone, so every extra gigabyte per GPU reduces the number of devices a model must be sharded across.

NVIDIA Introduces L40 Omniverse Graphics Card

During its GTC 2022 keynote, NVIDIA introduced its new generation of gaming graphics cards based on the novel Ada Lovelace architecture. Dubbed the NVIDIA GeForce RTX 40 series, it brings various updates: more CUDA cores, the new DLSS 3, 4th-generation Tensor cores, 3rd-generation ray tracing cores, and much more, which you can read about here. However, today we also got a new Ada Lovelace card intended for the data center. Called the L40, it updates NVIDIA's previous Ampere-based A40 design. While the NVIDIA website provides only sparse details, the new L40 GPU uses 48 GB of GDDR6 memory with ECC error correction, and with NVLink two cards can pool 96 GB of VRAM. The exact GPU SKU is unlisted, but we assume it uses AD102 with adjusted frequencies to lower the TDP and allow for passive cooling.

NVIDIA is calling this its Omniverse GPU, as it is part of a push to separate GPUs used for graphics from those used for AI/HPC workloads. The "L" models in the current product stack accelerate graphics, with display outputs installed on the GPU, while the "H" models (H100) accelerate HPC/AI installations where visual output is a secondary task. This furthers the bifurcation of the GPU market, with HPC/AI SKUs getting their own architecture (Hopper) while GPUs for graphics processing are built on Ada Lovelace.

NVIDIA Ada's 4th Gen Tensor Core, 3rd Gen RT Core, and Latest CUDA Core at a Glance

Yesterday, NVIDIA launched its GeForce RTX 40-series, based on the "Ada" graphics architecture. We're yet to receive a technical briefing about the architecture itself, and the various hardware components that make up the silicon; but NVIDIA on its website gave us a first look at what's in store with the key number-crunching components of "Ada," namely the Ada CUDA core, 4th generation Tensor core, and 3rd generation RT core. Besides generational IPC and clock speed improvements, the latest CUDA core benefits from SER (shader execution reordering), an SM or GPC-level feature that reorders execution waves/threads to optimally load each CUDA core and improve parallelism.

Despite the presence of specialized hardware such as the RT cores, the ray tracing pipeline still relies on CUDA cores and the CPU for a handful of tasks, and here NVIDIA claims that SER contributes up to a 3X ray tracing performance uplift (the performance contribution of the CUDA cores). With traditional raster graphics, SER contributes a meaty 25% performance uplift. With Ada, NVIDIA introduces its 4th generation of Tensor cores (after Volta, Turing, and Ampere). The Tensor cores deployed on Ada are functionally identical to the ones on the Hopper H100 Tensor Core HPC processor, featuring the new FP8 Transformer Engine, which delivers up to 5X the AI inference performance of the previous-generation Ampere Tensor core (which itself delivered a similar leap by leveraging sparsity).
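SER itself is exposed through NVIDIA's shader-level extensions rather than anything shown here, but the warp-divergence problem it attacks can be sketched in plain CUDA: when adjacent threads hit different materials, each branch executes serially for the whole warp. The shading functions below are hypothetical stand-ins, not NVIDIA's API.

```cpp
// Illustration of the divergence problem SER addresses (plain CUDA,
// hypothetical shading functions; SER is exposed via NVIDIA's
// shader-level extensions, not the CUDA runtime).
__device__ float shade_metal(int hit)   { return hit * 0.9f; }
__device__ float shade_glass(int hit)   { return hit * 0.5f; }
__device__ float shade_diffuse(int hit) { return hit * 0.2f; }

__global__ void shade_hits(const int* hit_material, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // Ray hits arrive in screen order, so adjacent threads often hold
    // different materials. Each branch below runs serially for the
    // whole warp when its threads diverge; SER's trick is to reorder
    // work so a warp shades mostly one material at a time.
    switch (hit_material[i]) {
        case 0:  out[i] = shade_metal(i);   break;
        case 1:  out[i] = shade_glass(i);   break;
        default: out[i] = shade_diffuse(i); break;
    }
}
```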

NVIDIA Jetson Orin Nano Sets New Standard for Entry-Level Edge AI and Robotics With 80x Performance Leap

NVIDIA today expanded the NVIDIA Jetson lineup with the launch of new Jetson Orin Nano system-on-modules that deliver up to 80x the performance over the prior generation, setting a new standard for entry-level edge AI and robotics. For the first time, the NVIDIA Jetson family spans six Orin-based production modules to support a full range of edge AI and robotics applications. This includes the Orin Nano—which delivers up to 40 trillion operations per second (TOPS) of AI performance in the smallest Jetson form factor—up to the AGX Orin, delivering 275 TOPS for advanced autonomous machines.

Jetson Orin features an NVIDIA Ampere architecture GPU, Arm-based CPUs, next-generation deep learning and vision accelerators, high-speed interfaces, fast memory bandwidth and multimodal sensor support. This performance and versatility empower more customers to commercialize products that once seemed impossible, from engineers deploying edge AI applications to Robotics Operating System (ROS) developers building next-generation intelligent machines.

NVIDIA GeForce RTX 4080 Comes in 12GB and 16GB Variants

NVIDIA's upcoming GeForce RTX 4080 "Ada," successor to the RTX 3080 "Ampere," reportedly comes in two distinct variants based on memory size, memory bus width, and possibly even core configuration. MEGAsizeGPU reports having seen two reference designs for the RTX 4080: one with 12 GB of memory and a 10-layer PCB, and the other with 16 GB of memory and a 12-layer PCB. Increasing the number of PCB layers enables a greater density of wiring around the ASIC. At debut, the flagship product from NVIDIA is expected to be the RTX 4090, with its 24 GB of memory and a 14-layer PCB. Apparently, the 12 GB and 16 GB variants of the RTX 4080 feature vastly different PCB designs.

We've known from past attempts at memory-based variants, such as the GTX 1060 (3 GB vs. 6 GB), or the more recent RTX 3080 (10 GB vs. 12 GB), that NVIDIA turns to other levers to differentiate variants, such as core-configuration (numbers of available CUDA cores), and the same is highly likely with the RTX 4080. The RTX 4080 12 GB, RTX 4080 16 GB, and the RTX 4090, could be NVIDIA's answers to AMD's RDNA3-based successors of the RX 6800, RX 6800 XT, and RX 6950 XT, respectively.

NVIDIA Hopper Features "SM-to-SM" Comms Within GPC That Minimize Cache Roundtrips and Boost Multi-Instance Performance

NVIDIA in its Hot Chips 34 presentation revealed a defining feature of its "Hopper" compute architecture that works to increase parallelism and help the H100 processor perform better in multi-instance environments. The hardware component hierarchy of "Hopper" is typical of NVIDIA architectures, with GPCs, SMs, and CUDA cores as the main levels. The company is introducing a new component it calls the "SM-to-SM Network": a high-bandwidth communications fabric inside each Graphics Processing Cluster (GPC) that facilitates direct communication among the SMs without round-trips to the cache or memory hierarchy. It plays a significant role in NVIDIA's overarching claim of a "6x throughput gain over the A100."

Direct SM-to-SM communication not only reduces latency, but also unburdens the L2 cache, letting NVIDIA's memory management free the cache of "cooler" (infrequently accessed) data. CUDA sees every GPU as a "grid," every GPC as a "cluster," every SM as a "thread block," and every lane of SIMD units as a "lane." Each lane has 64 KB of shared memory, which makes up 256 KB of shared local storage per SM, as there are four lanes. The GPCs interface with 50 MB of L2 cache, the last level of on-die cache before the 80 GB of HBM3 main memory.
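The cluster level of that hierarchy is programmable in CUDA 12 through thread block clusters and distributed shared memory, which ride the SM-to-SM fabric. Below is a minimal sketch, assuming CUDA 12 and an sm_90 (Hopper) compile target: two blocks in a cluster exchange values through each other's shared memory without touching L2 or HBM.

```cpp
#include <cooperative_groups.h>
namespace cg = cooperative_groups;

// Minimal sketch of Hopper's cluster programming model (CUDA 12,
// sm_90 target assumed): blocks in a cluster can read each other's
// shared memory directly over the SM-to-SM network, skipping L2.
// Launch with a grid that is a multiple of the cluster size, e.g.:
//   exchange<<<2, 32>>>(out);
__global__ void __cluster_dims__(2, 1, 1) exchange(int* out) {
    __shared__ int smem[1];
    cg::cluster_group cluster = cg::this_cluster();

    smem[0] = (int)cluster.block_rank();  // each block writes its rank
    cluster.sync();                       // cluster-wide barrier

    // Distributed shared memory: map the *other* block's shared array
    // into this block's address space and read it directly.
    unsigned int peer = cluster.block_rank() ^ 1;
    int* remote = cluster.map_shared_rank(smem, peer);
    if (threadIdx.x == 0)
        out[cluster.block_rank()] = remote[0];
    cluster.sync();  // keep smem alive until peers finish reading
}
```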

NVIDIA Jetson AGX Orin 32GB Production Modules Now Available

Bringing new AI and robotics applications and products to market, or supporting existing ones, can be challenging for developers and enterprises. The NVIDIA Jetson AGX Orin 32 GB production module—available now—is here to help. Nearly three dozen technology providers in the NVIDIA Partner Network worldwide are offering commercially available products powered by the new module, which provides up to a 6x performance leap over the previous generation.

With a wide range of offerings from Jetson partners, developers can build and deploy feature-packed Orin-powered systems sporting cameras, sensors, software and connectivity suited for edge AI, robotics, AIoT and embedded applications. Production-ready systems with options for peripherals enable customers to tackle challenges in industries from manufacturing, retail and construction to agriculture, logistics, healthcare, smart cities, last-mile delivery and more.

NVIDIA GeForce RTX 40 Series "AD104" Could Match RTX 3090 Ti Performance

NVIDIA's upcoming GeForce RTX 40-series Ada Lovelace graphics card lineup is slowly shaping up to be a significant performance uplift over the previous generation. Today, according to well-known hardware leaker kopite7kimi, a mid-range AD104 SKU could match the performance of the last-generation flagship GeForce RTX 3090 Ti. The full AD104 SKU is set to feature 7,680 FP32 CUDA cores, paired with 12 GB of 21 Gbps GDDR6X memory on a 192-bit bus. Coming with a large TGP of 400 Watts, it should match the performance of the GA102-350-A1 SKU found in the GeForce RTX 3090 Ti.

As for naming, this full AD104 SKU should end up as a GeForce RTX 4070 Ti model. Of course, we must wait and see what NVIDIA decides to do with the lineup and what the final models look like.

NVIDIA GeForce RTX 4090 Twice as Fast as RTX 3090, Features 16128 CUDA Cores and 450W TDP

NVIDIA's next-generation GeForce RTX 40 series of graphics cards, codenamed Ada Lovelace, is shaping up to be a powerful lineup. Allegedly, we can expect a mid-July launch of NVIDIA's newest gaming offerings, where customers can expect some impressive performance. According to reliable hardware leaker kopite7kimi, the NVIDIA GeForce RTX 4090 graphics card will feature the AD102-300 GPU SKU. This model is equipped with 126 Streaming Multiprocessors (SMs), which at 128 FP32 CUDA cores per SM brings the total to 16,128. Compared to the full AD102 GPU with 144 SMs, this leads us to think that an RTX 4090 Ti model could follow later as well.

Paired with 24 GB of 21 Gbps GDDR6X memory, the RTX 4090 graphics card has a TDP of 450 Watts. While this number may appear very power-hungry, bear in mind that the targeted performance improvement over the previous RTX 3090 is a twofold increase. Built on TSMC's new N4 node with a new architecture design, performance should scale accordingly, at the cost of higher TDPs. These claims are yet to be validated by real-world benchmarks from independent tech media, so please take all of this information with a grain of salt and wait for TechPowerUp reviews once the card arrives.

Moore Threads Unveils MTT S60 & MTT S2000 Graphics Cards with DirectX Support

Chinese company Moore Threads has unveiled its MTT GPU series just 18 months after the company's establishment in 2020. The MT Unified System Architecture (MUSA) is the first GPU architecture from a Chinese company to be developed fully domestically, and it includes support for DirectX, OpenCL, OpenGL, Vulkan, and CUDA. The company announced the MTT S60 and MTT S2000 single-slot desktop graphics cards for gaming and server applications at a recent event. The MTT S60 is manufactured on a 12 nm node and features 2,048 MUSA cores paired with 8 GB of LPGDDR4X memory, offering 6 TFLOPS of performance. The MTT S2000, also manufactured on a 12 nm node, doubles the MUSA core count to 4,096 and pairs it with 32 GB of undisclosed video memory, reaching 12 TFLOPS.

Moore Threads joins Intel in supporting AV1 encoding on a consumer GPU with MUSA cards featuring H.264, H.265, and AV1 encoding support in addition to H.264, H.265, AV1, VP8, and VP9 decoding. The company is also developing a physics engine dubbed Alphacore which is said to work with existing tools such as Unity, Unreal Engine, and Houdini to accelerate physics performance by 5 to 10 times. The only gaming performance shown was a simple demonstration of the MTT S60 running League of Legends at 1080p without any frame rate details.

AAEON Announces BOXER-8260AI and BOXER-8261 Powered by NVIDIA Jetson AGX Orin

With the announcement of the NVIDIA Jetson AGX Orin developer kit, AAEON is excited to utilize the many benefits that such a powerful system-on-module (SOM) can bring to its own product lines. With the same form factor and pin compatibility as the NVIDIA Jetson AGX Xavier, but with an improvement from 32 TOPS to 275 TOPS, the NVIDIA Jetson AGX Orin is set to make it easier than ever to develop faster, more sophisticated AI applications.

AAEON is therefore pleased to announce two upcoming products, available in Q4, which will feature the Jetson AGX Orin 32 GB and Jetson AGX Orin 64 GB as their respective processor modules: the BOXER-8260AI and BOXER-8261 AI@Edge Embedded BOX PCs. Both products will feature the NVIDIA JetPack 5.0 SDK, supporting the full Jetson software stack to help develop AI applications in areas such as high-end autonomous machinery. With two NVIDIA Deep Learning Accelerators (NVDLA) and 32 GB of 256-bit system memory, the BOXER-8260AI will provide the perfect device for vision-based AI applications. Moreover, its expansive I/O options include 12 RJ-45 ports for PoE, along with DB-9 ports for CANBus and six DIO.