News Posts matching #CUDA

First NVIDIA GeForce RTX 5090 GPU with 32 GB GDDR7 Memory Leaks Ahead of CES Keynote

NVIDIA's unannounced GeForce RTX 5090 graphics card has leaked, confirming key specifications of the next-generation GPU. Thanks to exclusive information from VideoCardz, we can see the packaging of Inno3D's RTX 5090 iChill X3 model, which confirms that the graphics card will feature 32 GB of GDDR7 memory. The leaked materials show that Inno3D's variant will use a 3.5-slot cooling system, suggesting significant cooling requirements for the flagship card. According to earlier leaks, the RTX 5090 will be based on the GB202 GPU and include 21,760 CUDA cores. The card's memory system is a significant upgrade, with its 32 GB of GDDR7 memory running on a 512-bit memory bus at 28 Gbps, capable of delivering nearly 1.8 TB/s of bandwidth. This represents twice the memory capacity of the upcoming RTX 5080, which is expected to ship with 16 GB of faster 30 Gbps GDDR7 modules.
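The quoted bandwidth follows directly from the bus width and per-pin data rate; a quick sketch of the arithmetic (function name is ours, for illustration):

```python
def memory_bandwidth_gbps(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

# RTX 5090: 512-bit bus at 28 Gbps
print(memory_bandwidth_gbps(512, 28))  # 1792.0 GB/s, i.e. ~1.8 TB/s
```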

Power consumption has increased significantly, with the RTX 5090's TDP rated at 575 W and TGP at 600 W, marking a 125-watt increase over the previous RTX 4090 in raw TDP. NVIDIA is scheduled to hold its CES keynote today at 6:30 PM PT, where the company is expected to officially announce several new graphics cards. The lineup should include the RTX 5090, RTX 5080, RTX 5070 Ti, RTX 5070, and an RTX 5090D model specifically for the Chinese market. Early indications are that the RTX 5080 will be the first card to reach consumers, with a planned release date of January 21st. Release dates for other models, including the flagship RTX 5090, have not yet been confirmed. The RTX 5090 is currently the only card in the RTX 50 series planned to use the GB202 GPU. Pricing information and additional specifications are expected to be revealed during the upcoming announcement.

Nintendo Switch 2 PCB Leak Reveals an NVIDIA Tegra T239 Chip Optically Shrunk to 5nm

Nintendo Switch 2 promises to be this year's big (well, small) gaming platform launch. It goes up against a growing ecosystem of handhelds based on x86-64 mobile processors running Windows, so its main play will have to be offering a similar or better gameplay experience with better battery life, given that all of its hardware is purpose-built for a handheld console and runs a highly optimized software stack; the SoC forms a big part of this. Nintendo turned to NVIDIA for the job, given its graphics IP leadership and its ability to integrate that IP with Arm CPU cores in a semi-custom chip. Someone with access to a Switch 2 prototype, likely an ISV, took the device apart, revealing the chip: a die-shrunk version of the Tegra T239 from 2023.

It's important to note that prototype consoles physically appear nothing like the final product; they're designed so ISVs and game developers can validate them and, together with PC-based "official" emulation, develop or port games to the new platform. The Switch 2 looks very similar to the original Switch: a large tablet-like device with detachable controllers. The largest chip on the mainboard is the NVIDIA Tegra T239. Nintendo Prime shared more details about the chip.

NVIDIA Plans GeForce RTX 5080 "Blackwell" Availability on January 21, Right After CES Announcement

A report from Hong Kong tech media HKEPC indicates that NVIDIA's GeForce RTX 5080 graphics card will launch on January 21, 2025. The release follows a planned announcement event on January 6, where CEO Jensen Huang will present the new "Blackwell" architecture. Anticipated specifications based on prior rumors point to the RTX 5080 using the GB203-400-A1 chip, containing 10,752 CUDA cores across 84 SMs. The card maintains 16 GB of memory but upgrades to GDDR7 technology running at 30 Gbps, while other cards in the series are expected to use 28 Gbps memory. The graphics card is manufactured on TSMC's 4NP 4 nm node. This improvement in manufacturing technology, combined with architectural changes, accounts for most of the expected performance gains, as the raw CUDA core count increases by only 10% over the RTX 4080. NVIDIA is also introducing larger segmentation between its Blackwell SKUs, as the RTX 5090 has nearly double the CUDA cores and double the GDDR7 memory capacity.

NVIDIA is organizing a GeForce LAN event two days before the announcement, marking the return of this gathering after 13 years, so the timing is interesting. NVIDIA wants to capture gamers' hearts with 50 hours of non-stop gameplay. Meanwhile, AMD currently has no competing products announced in the high-end graphics segment, leaving NVIDIA without direct competition in this performance tier. This market situation could affect the final pricing of the RTX 5080, which will be revealed during the January keynote. While the January 21 date appears set for the RTX 5080, launch dates for other cards in the Blackwell family, including the RTX 5090 and RTX 5070 series, remain unconfirmed. NVIDIA typically releases different models in its GPU families on separate dates to manage production and distribution effectively.

NVIDIA GeForce RTX 5070 and RTX 5070 Ti Final Specifications Seemingly Confirmed

Thanks to kopite7kimi, we are able to finalize the leaked specifications of NVIDIA's upcoming GeForce RTX 5070 and RTX 5070 Ti graphics cards.
Starting off with the RTX 5070 Ti, it will feature 8,960 CUDA cores and come equipped with 16 GB of GDDR7 memory on a 256-bit memory bus, offering 896 GB/s of bandwidth. The card is reportedly designed with a total board power (TBP) of 300 W. The Ti variant appears to use the PG147-SKU60 board design with a GB203-300-A1 GPU. The standard RTX 5070 is positioned as a more power-efficient option, with specifications pointing to 6,144 CUDA cores and 12 GB of GDDR7 memory on a 192-bit bus, good for 672 GB/s of memory bandwidth. This model is expected to operate at a slightly lower 250 W TBP.

Interestingly, the non-Ti RTX 5070 card will be available in two board variants, PG146 and PG147, both utilizing the GB205-300-A1 GPU. While we don't know what the pricing structure looks like, we see that NVIDIA has chosen to differentiate its SKUs more aggressively. The Ti variant not only gets an extra 4 GB of GDDR7 memory, but it also gets a whopping increase of nearly 46% in CUDA core count, going from 6,144 to 8,960 cores. While we wait for CES to see the initial wave of GeForce RTX 50 series cards, the GeForce RTX 5070 and RTX 5070 Ti are expected to arrive later, possibly after the RTX 5080 and RTX 5090 GPUs.
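The CUDA-core gap between the two SKUs can be checked with simple arithmetic (helper name is ours, for illustration):

```python
def pct_increase(old: int, new: int) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100

# RTX 5070 (6,144 cores) vs RTX 5070 Ti (8,960 cores)
print(round(pct_increase(6144, 8960), 1))  # 45.8, i.e. nearly 46%
```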

AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

The battle for AI acceleration in the data center is, as most readers are aware, insanely competitive, with NVIDIA offering a top-tier software stack. However, AMD has tried in recent years to capture a part of the revenue that hyperscalers and OEMs are willing to spend, with its Instinct MI300X accelerator lineup for AI and HPC. Despite having decent hardware, the company is not close to bridging the software gap with its competitor, NVIDIA. According to the latest report from SemiAnalysis, a research and consultancy firm, the firm ran a five-month experiment using the Instinct MI300X for training and benchmark runs. The findings were surprising: even with better on-paper hardware, AMD's software stack, including ROCm, massively degraded the MI300X's real-world performance.

"When comparing NVIDIA's GPUs to AMD's MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD," noted SemiAnalysis, breaking down arguments in the report further, adding that "AMD's software experience is riddled with bugs rendering out of the box training with AMD is impossible. We were hopeful that AMD could emerge as a strong competitor to NVIDIA in training workloads, but, as of today, this is unfortunately not the case. The CUDA moat has yet to be crossed by AMD due to AMD's weaker-than-expected software Quality Assurance (QA) culture and its challenging out-of-the-box experience."

Acer Leaks GeForce RTX 5090 and RTX 5080 GPU, Memory Sizes Confirmed

Acer has jumped the gun and listed its Predator Orion 7000 systems with the upcoming NVIDIA RTX 50 series graphics cards, namely the GeForce RTX 5080 and the GeForce RTX 5090. In addition, the listing confirms that the GeForce RTX 5080 will come with 16 GB of GDDR7 memory, while the GeForce RTX 5090 will get 32 GB of GDDR7 memory.

The Acer Predator Orion 7000 gaming PC was announced back in September, together with Intel's Core Ultra 200 series, and it does not come as a surprise that this high-end pre-built system will now be getting NVIDIA's new GeForce RTX 50 series graphics cards. In case you missed previous rumors, the GeForce RTX 5080 is expected to use the GB203-400 GPU with 10,752 CUDA cores and come with 16 GB of GDDR7 memory on a 256-bit memory interface. The GeForce RTX 5090, on the other hand, gets the GB202-300 GPU with 21,760 CUDA cores and packs 32 GB of GDDR7 memory.

NVIDIA GeForce RTX 5070 Ti Leak Tips More VRAM, Cores, and Power Draw

It's an open secret by now that NVIDIA's GeForce RTX 5000 series GPUs are on the way, with an early 2025 launch on the cards. Now, preliminary details about the RTX 5070 Ti have leaked, revealing an increase in both VRAM and TDP and suggesting that the new upper mid-range GPU will finally address the increased VRAM demand of modern games. According to the leak from Wccftech, the RTX 5070 Ti will have 16 GB of GDDR7 VRAM, up from 12 GB on the RTX 4070 Ti, as we previously speculated. In line with previous leaks, the new sources indicate that the 5070 Ti will use the cut-down GB203 chip, although the new leak points to a significantly higher TBP of 350 W. The new memory configuration will supposedly use a 256-bit memory bus running at 28 Gbps for a total memory bandwidth of 896 GB/s, a significant boost over the RTX 4070 Ti.

Supposedly, the RTX 5070 Ti will also see a bump in total CUDA cores, from 7,680 in the RTX 4070 Ti to 8,960 in the RTX 5070 Ti. The new RTX 5070 Ti will also switch to the revised 12V-2x6 power connector, compared to the original 12VHPWR 16-pin connector on the 4070 Ti. NVIDIA is expected to announce the RTX 5000 series graphics cards at CES 2025 in early January, but the RTX 5070 Ti will supposedly be the third card in the 5000-series launch cycle. That said, leaks suggest that the 5070 Ti will still launch in Q1 2025, meaning we may see an indication of specs at CES 2025, although pricing is still unclear.

Update Dec 16th: Kopite7kimi, the prolific hardware leaker, has since responded to the RTX 5070 Ti leaks, stating that 350 W may be on the higher end for the RTX 5070 Ti: "...the latest data shows 285W. However, 350W is also one of the configs." This suggests a 350 W TBP is possible, though perhaps only on certain graphics card models, if competition is strong, or in certain boost scenarios.

Nintendo Switch Successor: Backward Compatibility Confirmed for 2025 Launch

Nintendo has officially announced that its next-generation Switch console will feature backward compatibility, allowing players to use their existing game libraries on the new system. However, those eagerly awaiting the console's release may need to exercise patience as launch expectations have shifted to early 2025. On the official X account, Nintendo has announced: "At today's Corporate Management Policy Briefing, we announced that Nintendo Switch software will also be playable on the successor to Nintendo Switch. Nintendo Switch Online will be available on the successor to Nintendo Switch as well. Further information about the successor to Nintendo Switch, including its compatibility with Nintendo Switch, will be announced at a later date."

While the original Switch evolved from a 20 nm Tegra X1 to a more power-efficient 16 nm Tegra X1+ SoC (both featuring four Cortex-A57 and four Cortex-A53 cores with GM20B Maxwell GPUs), the Switch 2 is rumored to utilize a customized variant of NVIDIA's Jetson Orin SoC, now codenamed T239. The new chip represents a significant upgrade with its 12 Cortex-A78AE cores, LPDDR5 memory, and Ampere GPU architecture with 1,536 CUDA cores, promising enhanced battery efficiency and DLSS capabilities for the handheld gaming market. With the holiday 2024 release window now seemingly off the table, the new console is anticipated to debut in the first half of 2025, marking nearly eight years since the original Switch's launch.

NVIDIA Releases GeForce 565.90 WHQL Game Ready Driver

NVIDIA has released its latest GeForce graphics drivers, the GeForce 565.90 WHQL Game Ready drivers. As a new Game Ready driver, it provides optimizations and support, including NVIDIA DLSS 3, for new games including THRONE AND LIBERTY, MechWarrior 5: Clans, and Starship Troopers: Extermination. The new drivers also add support for CUDA 12.7 and enable RTX HDR multi-monitor support within the latest NVIDIA App beta update.

NVIDIA also fixed several issues, including texture flickering in Final Fantasy XV and a frozen white screen and crash in Dying Light 2 Stay Human. When it comes to general bugs, the new drivers fix corruption with Steam Link streaming when MSAA is globally enabled, as well as a slight monitor backlight flicker when FPS drops below 60.

DOWNLOAD: NVIDIA GeForce 565.90 WHQL Game Ready

Advantech Launches AIR-310, an Ultra-Low-Profile Scalable AI Inference System

Advantech, a leading provider of edge computing solutions, introduces the AIR-310, a compact edge AI inference system featuring an MXM GPU card. Powered by 12th/13th/14th Gen Intel Core 65 W desktop processors, the AIR-310 delivers up to 12.99 TFLOPS of scalable AI performance via the NVIDIA Quadro 2000A GPU card in a 1.5U chassis (215 x 225 x 55 mm). Despite its compact size, it offers versatile connectivity with three LAN ports and four USB 3.0 ports, enabling seamless integration of sensors and cameras for vision AI applications.

The system includes smart fan management, operates in temperatures from 0 to 50°C (32 to 122°F), and is shock-resistant, capable of withstanding 3G vibration and 30G shock. Bundled with Intel Arc A370 and NVIDIA A2000 GPUs, it is certified to IEC 61000-6-2, IEC 61000-6-4, and CB/UL standards, ensuring stable 24/7 operation in harsh environments, including space-constrained or mobile equipment. The AIR-310 supports Windows 11, Linux Ubuntu 24.04, and the Edge AI SDK, enabling accelerated inference deployment for applications such as factory inspections, real-time video surveillance, GenAI/LLM, and medical imaging.

NVIDIA GeForce RTX 5090 and RTX 5080 Specifications Surface, Showing Larger SKU Segmentation

Thanks to the renowned NVIDIA hardware leaker kopite7kimi on X, we are getting information about the final versions of NVIDIA's first upcoming wave of GeForce RTX 50 series "Blackwell" graphics cards. The two leaked GPUs are the GeForce RTX 5090 and RTX 5080, which now feature a more significant gap between xx80 and xx90 SKUs. For starters, we have the highest-end GeForce RTX 5090. NVIDIA has decided to use the GB202-300-A1 die and enabled 21,760 FP32 CUDA cores on this top-end model. Accompanying the massive 170 SM GPU configuration, the RTX 5090 has 32 GB of GDDR7 memory on a 512-bit bus, with each GDDR7 die running at 28 Gbps. This translates to 1,792 GB/s of memory bandwidth. All of this is confined to a 600 W TGP.

When it comes to the GeForce RTX 5080, NVIDIA has decided to further separate its xx80 and xx90 SKUs. The RTX 5080 has 10,752 FP32 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. With GDDR7 running at 28 Gbps, the memory bandwidth is also halved at 896 GB/s. This SKU uses a GB203-400-A1 die, which is designed to run within a 400 W TGP power envelope. For reference, the RTX 4090 has 68% more CUDA cores than the RTX 4080. The rumored RTX 5090 has around 102% more CUDA cores than the rumored RTX 5080, which means that NVIDIA is separating its top SKUs even more. We are curious to see at what price points NVIDIA places its upcoming GPUs, so that we can compare generational updates and the widened gap between xx80 and xx90 models.
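The segmentation percentages above can be verified from the core counts (helper name is ours, for illustration):

```python
def pct_more(smaller: int, larger: int) -> float:
    """How many percent more units the larger SKU has over the smaller one."""
    return (larger - smaller) / smaller * 100

# Ada generation: RTX 4090 (16,384 cores) vs RTX 4080 (9,728 cores)
print(round(pct_more(9728, 16384)))   # 68
# Rumored Blackwell: RTX 5090 (21,760 cores) vs RTX 5080 (10,752 cores)
print(round(pct_more(10752, 21760)))  # 102
```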

Nintendo Switch 2 Allegedly Not Powered by AMD APU Due to Poor Battery Life

Nintendo's next-generation Switch 2 handheld gaming console is nearing its release, and as leaks intensify about its specifications, we are also learning about its planning stages. According to a Moore's Law Is Dead YouTube video, Nintendo reportedly didn't choose an AMD APU to power the Switch 2 due to poor battery life. In a bid to secure the best chip at a mere five watts of power, the Japanese company had two choices: NVIDIA Tegra or an AMD APU. In preliminary testing and evaluation, the AMD APU reportedly wasn't power-efficient enough at a 5 W TDP, while the NVIDIA Tegra chip maintained sufficient battery life and performance at the target specifications.

Allegedly, the AMD APU was a good fit for a 15 W design, but Nintendo didn't want to fit a bigger battery, keeping the device lighter and cheaper. The final design will likely carry a battery with a 20 Wh capacity as the main power source for the NVIDIA Tegra T239 SoC. As a reminder, the Tegra T239 features an eight-core Arm Cortex-A78C cluster and a modified NVIDIA Ampere GPU with DLSS support, along with some of the latest encoding/decoding blocks from Ada Lovelace, such as AV1. There are likely 1,536 CUDA cores paired with 128-bit LPDDR5 memory delivering 102 GB/s of bandwidth. For final specifications, we have to wait for the official launch, but with rumors intensifying, we can expect it relatively soon.
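The 102 GB/s figure is consistent with a 128-bit bus of standard 6400 MT/s LPDDR5; note the 6400 MT/s rate is our assumption inferred from the numbers, not stated in the leak:

```python
def lpddr_bandwidth_gbs(bus_width_bits: int, transfer_rate_mtps: int) -> float:
    """Peak bandwidth in GB/s: bus width in bytes x mega-transfers per second / 1000."""
    return (bus_width_bits / 8) * transfer_rate_mtps / 1000

# Assumed: 128-bit bus, 6400 MT/s LPDDR5
print(lpddr_bandwidth_gbs(128, 6400))  # 102.4, matching the ~102 GB/s rumor
```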

Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, the company's senior vice president and Chief Software Officer. Previously, we reported on AMD's transition to become a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan, spread over a three-to-five-year timeframe, is to improve its software ecosystem, accelerate hardware development to launch new products more frequently, and react faster to changes in software demand. AMD found that opening new design centers in Serbia would be very advantageous to these expansion efforts.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is actually an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering from Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.
Here is the full interview:

AMD to Unify Gaming "RDNA" and Data Center "CDNA" into "UDNA": Singular GPU Architecture Similar to NVIDIA's CUDA

According to new information from Tom's Hardware, AMD has announced plans to unify its consumer-focused gaming RDNA and data center CDNA graphics architectures into a single, unified design called "UDNA." The announcement was made by AMD's Jack Huynh, Senior Vice President and General Manager of the Computing and Graphics Business Group, at IFA 2024 in Berlin. The goal of the new UDNA architecture is to provide a single focus point for developers, so that each optimized application can run on a consumer-grade GPU like the Radeon RX 7900 XTX as well as a high-end data center GPU like the Instinct MI300. This will create a unification similar to NVIDIA's CUDA, which enables CUDA-focused developers to run applications on everything ranging from laptops to data centers.
Jack Huynh: "So, part of a big change at AMD is today we have a CDNA architecture for our Instinct data center GPUs and RDNA for the consumer stuff. It's forked. Going forward, we will call it UDNA. There'll be one unified architecture, both Instinct and client [consumer]. We'll unify it so that it will be so much easier for developers versus today, where they have to choose and value is not improving."

NVIDIA Shifts Gears: Open-Source Linux GPU Drivers Take Center Stage

Just a few months after hiring Ben Skeggs, a lead maintainer of the open-source NVIDIA GPU driver for Linux kernel, NVIDIA has announced a complete transition to open-source GPU kernel modules in its upcoming R560 driver release for Linux. This decision comes two years after the company's initial foray into open-source territory with the R515 driver in May 2022. The tech giant began focusing on data center compute GPUs, while GeForce and Workstation GPU support remained in the alpha stages. Now, after extensive development and optimization, NVIDIA reports that its open-source modules have achieved performance parity with, and in some cases surpassed, their closed-source counterparts. This transition brings a host of new capabilities, including heterogeneous memory management support, confidential computing features, and compatibility with NVIDIA's Grace platform's coherent memory architectures.

The move to open-source is expected to foster greater collaboration within the Linux ecosystem and potentially lead to faster bug fixes and feature improvements. However, not all GPUs will be compatible with the new open-source modules. While cutting-edge platforms like NVIDIA Grace Hopper and Blackwell will require open-source drivers, older GPUs from the Maxwell, Pascal, or Volta architectures must stick with proprietary drivers. NVIDIA has developed a detection helper script to guide driver selection for users who are unsure about compatibility. The shift also brings changes to NVIDIA's installation processes. The default driver version for most installation methods will now be the open-source variant. This affects package managers with the CUDA meta package, run file installations and even Windows Subsystem for Linux.

AMD is Becoming a Software Company. Here's the Plan

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable, to share its vision for the future of the company and to get our feedback. On site were prominent AMD leadership, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making changes in a big way to how it approaches technology, shifting its focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.

New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations. Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and infer complex AI models on Windows natively. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs. These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.

NVIDIA Blackwell Platform Pushes the Boundaries of Scientific Computing

Quantum computing. Drug discovery. Fusion energy. Scientific computing and physics-based simulations are poised to make giant steps across domains that benefit humanity as advances in accelerated computing and AI drive the world's next big breakthroughs. NVIDIA unveiled at GTC in March the NVIDIA Blackwell platform, which promises generative AI on trillion-parameter large language models (LLMs) at up to 25x less cost and energy consumption than the NVIDIA Hopper architecture.

Blackwell has powerful implications for AI workloads, and its technology capabilities can also help to deliver discoveries across all types of scientific computing applications, including traditional numerical simulation. By reducing energy costs, accelerated computing and AI drive sustainable computing. Many scientific computing applications already benefit. Weather can be simulated at 200x lower cost and with 300x less energy, while digital twin simulations have 65x lower cost and 58x less energy consumption versus traditional CPU-based systems and others.

NVIDIA Accelerates Quantum Computing Centers Worldwide With CUDA-Q Platform

NVIDIA today announced that it will accelerate quantum computing efforts at national supercomputing centers around the world with the open-source NVIDIA CUDA-Q platform. Supercomputing sites in Germany, Japan and Poland will use the platform to power the quantum processing units (QPUs) inside their NVIDIA-accelerated high-performance computing systems.

QPUs are the brains of quantum computers that use the behavior of particles like electrons or photons to calculate differently than traditional processors, with the potential to make certain types of calculations faster. Germany's Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich is installing a QPU built by IQM Quantum Computers as a complement to its JUPITER supercomputer, supercharged by the NVIDIA GH200 Grace Hopper Superchip. The ABCI-Q supercomputer, located at the National Institute of Advanced Industrial Science and Technology (AIST) in Japan, is designed to advance the nation's quantum computing initiative. Powered by the NVIDIA Hopper architecture, the system will add a QPU from QuEra. Poland's Poznan Supercomputing and Networking Center (PSNC) has recently installed two photonic QPUs, built by ORCA Computing, connected to a new supercomputer partition accelerated by NVIDIA Hopper.

AIO Workstation Combines 128-Core Arm Processor and Four NVIDIA GPUs Totaling 28,416 CUDA Cores

All-in-one computers are often traditionally seen as lower-powered alternatives to traditional desktop workstations. However, a new offering from Alafia AI, a startup focused on medical imaging appliances, aims to shatter that perception. The company's upcoming Alafia Aivas SuperWorkstation packs serious hardware muscle, demonstrating that all-in-one systems can match the performance of their more modular counterparts. At the heart of the Aivas SuperWorkstation lies a 128-core Ampere Altra processor, running at 3.0 GHz clock speed. This CPU is complemented by not one but three NVIDIA L4 GPUs for compute, and a single NVIDIA RTX 4000 Ada GPU for video output, delivering a combined 28,416 CUDA cores for accelerated parallel computing tasks. The system doesn't skimp on other components, either. It features a 4K touch display with up to 360 nits of brightness, an extensive 2 TB of DDR4 RAM, and storage options up to an 8 TB solid-state drive. This combination of cutting-edge CPU, GPU, memory, and storage is squarely aimed at the demands of medical imaging and AI development workloads.
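The headline core count can be sanity-checked against NVIDIA's published per-card specifications (L4: 7,424 CUDA cores; RTX 4000 Ada: 6,144 CUDA cores):

```python
# Published CUDA core counts for the GPUs in the Aivas SuperWorkstation
L4_CORES = 7424            # NVIDIA L4 (compute)
RTX_4000_ADA_CORES = 6144  # NVIDIA RTX 4000 Ada (video output)

# Three L4 cards plus one RTX 4000 Ada
total = 3 * L4_CORES + RTX_4000_ADA_CORES
print(total)  # 28416, matching the quoted combined figure
```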

The all-in-one form factor packs this incredible hardware into a sleek, purposefully designed clinical research appliance. While initially targeting software developers, Alafia AI hopes that institutions that can optimize their applications for the Arm architecture will eventually deploy the Aivas SuperWorkstation for production medical imaging workloads. The company is aiming for application integration in Q3 2024 and full ecosystem device integration by Q4 2024. With this powerful new offering, Alafia AI is challenging long-held assumptions about the performance limitations of all-in-one systems. The Aivas SuperWorkstation demonstrates that the right hardware choices can transform these compact form factors into true powerhouse workstations. With the combined output of three NVIDIA L4 compute GPUs alongside the RTX 4000 Ada graphics card, the AIO is more powerful than some high-end desktop workstations.

Nvidia CEO Reiterates Solid Partnership with TSMC

One key takeaway from the ongoing GTC is that Nvidia's AI empire has taken shape with strong partnerships from TSMC and other Taiwanese makers, such as the major server ODMs.

According to the news report from the technology-focused media DIGITIMES Asia, during his keynote at GTC on March 18, Huang underscored his company's partnerships with TSMC, as well as the supply chain in Taiwan. Speaking to the press later, Huang said Nvidia will have a very strong demand for CoWoS, the advanced packaging services TSMC offers.

Jensen Huang Celebrates Rise of Portable AI Workstations

2024 will be the year generative AI gets personal, the CEOs of NVIDIA and HP said today in a fireside chat, unveiling new laptops that can build, test and run large language models. "This is a renaissance of the personal computer," said NVIDIA founder and CEO Jensen Huang at HP Amplify, a gathering in Las Vegas of about 1,500 resellers and distributors. "The work of creators, designers and data scientists is going to be revolutionized by these new workstations."

Greater Speed and Security
"AI is the biggest thing to come to the PC in decades," said HP's Enrique Lores, in the run-up to the announcement of what his company billed as "the industry's largest portfolio of AI PCs and workstations." Compared to running their AI work in the cloud, the new systems will provide increased speed and security while reducing costs and energy, Lores said in a keynote at the event. New HP ZBooks provide a portfolio of mobile AI workstations powered by a full range of NVIDIA RTX Ada Generation GPUs. Entry-level systems with the NVIDIA RTX 500 Ada Generation Laptop GPU let users run generative AI apps and tools wherever they go. High-end models pack the RTX 5000 Ada Generation GPU to deliver up to 682 TOPS, so they can create and run LLMs locally, using retrieval-augmented generation (RAG) to connect to their content for results that are both personalized and private.

NVIDIA and HP Supercharge Data Science and Generative AI on Workstations

NVIDIA and HP Inc. today announced that NVIDIA CUDA-X data processing libraries will be integrated with HP AI workstation solutions to turbocharge the data preparation and processing work that forms the foundation of generative AI development.

Built on the NVIDIA CUDA compute platform, CUDA-X libraries speed data processing for a broad range of data types, including tables, text, images and video. They include the NVIDIA RAPIDS cuDF library, which accelerates the work of the nearly 10 million data scientists using pandas software by up to 110x using an NVIDIA RTX 6000 Ada Generation GPU instead of a CPU-only system, without requiring any code changes.

NVIDIA Cracks Down on CUDA Translation Layers, Changes Licensing Terms

NVIDIA's Compute Unified Device Architecture (CUDA) has long been the de facto standard programming interface for developing GPU-accelerated software. Over the years, NVIDIA has built an entire ecosystem around CUDA, cementing its position as the leading GPU computing and AI manufacturer. However, rivals AMD and Intel have been trying to make inroads with their own open API offerings—ROCm from AMD and oneAPI from Intel. The idea was that developers could more easily run existing CUDA code on non-NVIDIA GPUs by providing open access through translation layers. Developers had created projects like ZLUDA to translate CUDA to ROCm, and Intel's CUDA to SYCL aimed to do the same for oneAPI. However, with the release of CUDA 11.5, NVIDIA appears to have cracked down on these translation efforts by modifying its terms of use, according to developer Longhorn on X.

"You may not reverse engineer, decompile or disassemble any portion of the output generated using Software elements for the purpose of translating such output artifacts to target a non-NVIDIA platform," says the CUDA 11.5 terms of service document. The changes don't seem to be technical in nature but rather licensing restrictions. The impact remains to be seen, depending on how much code still requires translation versus running natively on each vendor's API. While CUDA gave NVIDIA a unique selling point, its supremacy has diminished as more libraries work across hardware. Still, the move could slow the adoption of AMD and Intel offerings by making it harder for developers to port existing CUDA applications. As GPU-accelerated computing grows in fields like AI, the battle for developer mindshare between NVIDIA, AMD, and Intel is heating up.

NVIDIA Announces RTX 500 and 1000 Professional Ada Generation Laptop GPUs

With generative AI and hybrid work environments becoming the new standard, nearly every professional, whether a content creator, researcher or engineer, needs a powerful, AI-accelerated laptop to help users tackle their industry's toughest challenges - even on the go. The new NVIDIA RTX 500 and 1000 Ada Generation Laptop GPUs will be available in new, highly portable mobile workstations, expanding the NVIDIA Ada Lovelace architecture-based lineup, which includes the RTX 2000, 3000, 3500, 4000 and 5000 Ada Generation Laptop GPUs.

AI is rapidly being adopted to drive efficiencies across professional design and content creation workflows and everyday productivity applications, underscoring the importance of having powerful local AI acceleration and sufficient processing power in systems. The next generation of mobile workstations with Ada Generation GPUs, including the RTX 500 and 1000 GPUs, will include both a neural processing unit (NPU), a component of the CPU, and an NVIDIA RTX GPU, which includes Tensor Cores for AI processing. The NPU helps offload light AI tasks, while the GPU provides up to an additional 682 TOPS of AI performance for more demanding day-to-day AI workflows.