News Posts matching #GPU


NVIDIA Fine-Tunes Llama3.1 Model to Beat GPT-4o and Claude 3.5 Sonnet with Only 70 Billion Parameters

NVIDIA has officially released its Llama-3.1-Nemotron-70B-Instruct model. Based on Meta's Llama 3.1 70B, the Nemotron model is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses. NVIDIA uses structured fine-tuning data to steer the model and allow it to generate more helpful responses. With only 70 billion parameters, the model is punching far above its weight class. The company claims that the model beats the current top models from leading labs, such as OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet, which lead across AI benchmarks. In evaluations such as Arena Hard, the NVIDIA Llama-3.1-Nemotron-70B scores 85 points, while GPT-4o and Claude 3.5 Sonnet score 79.3 and 79.2, respectively. In other benchmarks like AlpacaEval and MT-Bench, NVIDIA also holds the top spot with scores of 57.6 and 8.98, respectively; Claude and GPT-4o reach 52.4 / 8.81 and 57.5 / 8.74, just below Nemotron.

This language model underwent training using reinforcement learning from human feedback (RLHF), specifically the REINFORCE algorithm. The process involved a reward model based on a large language model architecture and custom preference prompts designed to guide the model's behavior. Training started from a pre-existing instruction-tuned model: Llama-3.1-70B-Instruct served as the initial policy, optimized against the Llama-3.1-Nemotron-70B-Reward model on HelpSteer2-Preference prompts. Running the model locally requires either four 40 GB or two 80 GB VRAM GPUs, plus 150 GB of free disk space. We managed to take it for a spin on NVIDIA's website to say hello to TechPowerUp readers. The model also passes the infamous "strawberry" test, where it has to count the occurrences of a specific letter in a word. However, that test appears to have been part of the fine-tuning data, as the model fails the next test, shown in the image below.
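For readers with the hardware to match, here is a minimal sketch of running the model locally with Hugging Face Transformers, assuming the nvidia/Llama-3.1-Nemotron-70B-Instruct-HF checkpoint name and enough combined VRAM for the bfloat16 weights:

```python
# Minimal local-inference sketch; assumes the Hugging Face checkpoint name below
# and roughly 140 GB of combined VRAM for the bf16 weights (e.g., 2x 80 GB GPUs).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory versus fp32
    device_map="auto",           # shards layers across all visible GPUs
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```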

Google Shows Production NVIDIA "Blackwell" GB200 NVL System for Cloud

Last week, we got a preview of Microsoft Azure's production-ready NVIDIA "Blackwell" GB200 system, showing that only a third of the rack that goes into the data center actually holds the compute elements, while the other two-thirds house the cooling compartment needed to tame the immense heat output of tens of GB200 GPUs. Today, Google is showing off part of its own infrastructure ahead of the Google Cloud App Dev & Infrastructure Summit, a digital event taking place on October 30. Shown below are two racks standing side by side, connecting NVIDIA "Blackwell" GB200 NVL cards to the rest of the Google infrastructure. Google Cloud uses a different data center design in its facilities than Microsoft Azure does.

One rack holds power distribution units, networking switches, and cooling distribution units, all connected to the compute rack, which houses power supplies, GPUs, and CPU servers. The networking equipment connects to Google's "global" data center network, the company's own data center fabric. We are not sure which fabric connects these racks; for optimal performance NVIDIA recommends InfiniBand (from its Mellanox acquisition), but given that Google's infrastructure is set up differently, Ethernet switches may be present instead. Interestingly, Google's GB200 rack design differs from Azure's in that it uses additional rack space to distribute the coolant to its local heat exchangers, i.e., coolers. We are curious to see whether Google releases more information on this infrastructure, as the company is known as the infrastructure king for its ability to scale while keeping everything organized.

Supermicro's Liquid-Cooled SuperClusters for AI Data Centers Powered by NVIDIA GB200 NVL72 and NVIDIA HGX B200 Systems

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is accelerating the industry's transition to liquid-cooled data centers with the NVIDIA Blackwell platform, delivering a new paradigm of energy efficiency for the rapidly growing power demands of new AI infrastructure. Supermicro's industry-leading end-to-end liquid-cooling solutions are powered by the NVIDIA GB200 NVL72 platform for exascale computing in a single rack and have started sampling to select customers, with full-scale production slated for late Q4. In addition, the recently announced Supermicro X14 and H14 4U liquid-cooled systems and 10U air-cooled systems are production-ready for the NVIDIA HGX B200 8-GPU system.

"We're driving the future of sustainable AI computing, and our liquid-cooled AI solutions are rapidly being adopted by some of the most ambitious AI Infrastructure projects in the world with over 2000 liquid-cooled racks shipped since June 2024," said Charles Liang, president and CEO of Supermicro. "Supermicro's end-to-end liquid-cooling solution, with the NVIDIA Blackwell platform, unlocks the computational power, cost-effectiveness, and energy-efficiency of the next generation of GPUs, such as those that are part of the NVIDIA GB200 NVL72, an exascale computer contained in a single rack. Supermicro's extensive experience in deploying liquid-cooled AI infrastructure, along with comprehensive on-site services, management software, and global manufacturing capacity, provides customers a distinct advantage in transforming data centers with the most powerful and sustainable AI solutions."

Supermicro Adds New Petascale JBOF All-Flash Storage Solution Integrating NVIDIA BlueField-3 DPU for AI Data Pipeline Acceleration

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is launching a new storage system optimized for high-performance AI training, inference, and HPC workloads. This JBOF (Just a Bunch of Flash) system utilizes up to four NVIDIA BlueField-3 data processing units (DPUs) in a 2U form factor to run software-defined storage workloads. Each BlueField-3 DPU features 400 Gb Ethernet or InfiniBand networking and hardware acceleration for computation-heavy storage and networking workloads such as encryption, compression, and erasure coding, as well as AI storage expansion. The state-of-the-art dual-port JBOF architecture enables active-active clustering, ensuring high availability for scale-up mission-critical storage applications as well as scale-out storage such as object storage and parallel file systems.

"Supermicro's new high performance JBOF Storage System is designed using our Building Block approach which enables support for either E3.S or U.2 form-factor SSDs and the latest PCIe Gen 5 connectivity for the SSDs and the DPU networking and storage platform," said Charles Liang, president and CEO of Supermicro. "Supermicro's system design supports 24 or 36 SSD's enabling up to 1.105PB of raw capacity using 30.71 TB SSDs. Our balanced network and storage I/O design can saturate the full 400 Gb/s BlueField-3 line-rate realizing more than 250 GB/s bandwidth of the Gen 5 SSDs."

NVIDIA Contributes Blackwell Platform Design to Open Hardware Ecosystem, Accelerating AI Infrastructure Innovation

To drive the development of open, efficient and scalable data center technologies, NVIDIA today announced that it has contributed foundational elements of its NVIDIA Blackwell accelerated computing platform design to the Open Compute Project (OCP) and broadened NVIDIA Spectrum-X support for OCP standards.

At this year's OCP Global Summit, NVIDIA will be sharing key portions of the NVIDIA GB200 NVL72 system electro-mechanical design with the OCP community — including the rack architecture, compute and switch tray mechanicals, liquid-cooling and thermal environment specifications, and NVIDIA NVLink cable cartridge volumetrics — to support higher compute density and networking bandwidth.

NVIDIA "Blackwell" GPUs are Sold Out for 12 Months, Customers Ordering in 100K GPU Quantities

NVIDIA's "Blackwell" series of GPUs, including B100, B200, and GB200, are reportedly sold out for 12 months or an entire year. This directly means that if a new customer is willing to order a new Blackwell GPU now, there is a 12-month waitlist to get that GPU. Analyst from Morgan Stanley Joe Moore confirmed that in a meeting with NVIDIA and its investors, NVIDIA executives confirmed that the demand for "Blackwell" is so great that there is a 12-month backlog to fulfill first before shipping to anyone else. We expect that this includes customers like Amazon, META, Microsoft, Google, Oracle, and others, who are ordering GPUs in insane quantities to keep up with the demand from their customers.

The previous-generation "Hopper" GPUs were ordered in tens of thousands of units, while this "Blackwell" generation is being ordered in hundreds of thousands of units at a time. For NVIDIA, that is excellent news, as the demand is expected to continue. The only thing standing between customers and their GPUs is TSMC, which is manufacturing them as fast as possible to meet demand. NVIDIA is one of TSMC's largest customers, so its wafer allocation at TSMC's facilities is only expected to grow. We are now officially in the era of the million-GPU data center, and we can only wonder when, or whether, this massive growth will stop.

NVIDIA Might Consider Major Design Shift for Future B300 GPU Series

NVIDIA is reportedly considering a significant design change for its GPU products, shifting from the current on-board solution to an independent GPU socket design after GB200 shipments begin in Q4, according to reports from MoneyDJ and the Economic Daily News quoted by TrendForce. The move is not new to the industry; AMD already introduced a socketed design in 2023 with its MI300A series, offered in dedicated Supermicro servers. The B300 series, expected to become NVIDIA's mainstream product in the second half of 2025, is rumored to be the main beneficiary of this design change, which could improve yield rates, though it may come with some performance trade-offs.

According to the Economic Daily News, the socket design will simplify after-sales service and server board maintenance, allowing users to replace or upgrade GPUs quickly. The report also notes that, with the socket design, boards will hold up to four NVIDIA GPUs and a CPU, with each GPU getting its own dedicated socket. This will benefit Taiwanese manufacturers like Foxconn and LOTES, who will supply the various components and connectors. The move seems logical: with the current on-board design, once a GPU becomes faulty the entire motherboard needs to be replaced, leading to significant downtime and high operational and maintenance costs.

Lenovo Accelerates Business Transformation with New ThinkSystem Servers Engineered for Optimal AI and Powered by AMD

Today, Lenovo announced its industry-leading ThinkSystem infrastructure solutions powered by AMD EPYC 9005 Series processors, as well as AMD Instinct MI325X accelerators. Backed by 225 of AMD's world-record performance benchmarks, the Lenovo ThinkSystem servers deliver an unparalleled combination of AMD technology-based performance and efficiency to tackle today's most demanding edge-to-cloud workloads, including AI training, inferencing and modeling.

"Lenovo is helping organizations of all sizes and across various industries achieve AI-powered business transformations," said Vlad Rozanovich, Senior Vice President, Lenovo Infrastructure Solutions Group. "Not only do we deliver unmatched performance, we offer the right mix of solutions to change the economics of AI and give customers faster time-to-value and improved total value of ownership."

ASRock Rack Unveils New Server Platforms Supporting AMD EPYC 9005 Series Processors and AMD Instinct MI325X Accelerators at AMD Advancing AI 2024

ASRock Rack Inc., a leading innovative server company, announced upgrades to its extensive lineup to support AMD EPYC 9005 Series processors. Among these updates is the introduction of the new 6U8M-TURIN2 GPU server. This advanced platform features AMD Instinct MI325X accelerators, specifically optimized for intensive enterprise AI applications, and will be showcased at AMD Advancing AI 2024.

ASRock Rack Introduces GPU Servers Powered by AMD EPYC 9005 Series Processors
AMD today revealed the 5th Generation AMD EPYC processors, offering a wide range of core counts (up to 192 cores), frequencies (up to 5 GHz), and expansive cache capacities. Select high-frequency processors, such as the AMD EPYC 9575F, are optimized for use as host CPUs in GPU-enabled systems. Additionally, the just-launched AMD Instinct MI325X accelerators feature substantial HBM3E memory and 6 TB/s of memory bandwidth, enabling quick access to and efficient handling of large datasets and complex computations.

Supermicro Introduces New Servers and GPU Accelerated Systems with AMD EPYC 9005 Series CPUs and AMD Instinct MI325X GPUs

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, announces the launch of a new series of servers, GPU-accelerated systems, and storage servers featuring the AMD EPYC 9005 Series processors and AMD Instinct MI325X GPUs. The new H14 product line represents one of the most extensive server families in the industry, including Supermicro's Hyper systems, the Twin multi-node servers, and AI inferencing GPU systems, all available with air or liquid cooling options. The new "Zen 5" processor core architecture implements AVX-512 vector instructions with a full 512-bit data path for CPU-based AI inference and provides 17% better instructions per cycle (IPC) than the previous, 4th generation EPYC processor, enabling more performance per core.

Supermicro's new H14 family uses the latest 5th Gen AMD EPYC processors, which offer up to 192 cores per CPU at up to 500 W TDP (thermal design power). Supermicro has designed new H14 systems, including the Hyper and FlexTwin systems, that can accommodate these higher thermal requirements. The H14 family also includes three systems for AI training and inference workloads supporting up to 10 GPUs, all featuring the AMD EPYC 9005 Series CPU as the host processor, two of which also support the AMD Instinct MI325X GPU.

AMD Launches Instinct MI325X Accelerator for AI Workloads: 256 GB HBM3E Memory and 2.6 PetaFLOPS FP8 Compute

During its "Advancing AI" conference today, AMD has updated its AI accelerator portfolio with the Instinct MI325X accelerator, designed to succeed its MI300X predecessor. Built on the CDNA 3 architecture, Instinct MI325X brings a suite of improvements over the old SKU. Now, the MI325X features 256 GB of HBM3E memory running at 6 TB/s bandwidth. The capacity memory alone is a 1.8x improvement over the old MI300 SKU, which features 192 GB of regular HBM3 memory. Providing more memory capacity is crucial as upcoming AI workloads are training models with parameter counts measured in trillions, as opposed to billions with current models we have today. When it comes to compute resources, the Instinct MI325X provides 1.3 PetaFLOPS at FP16 and 2.6 PetaFLOPS at FP8 training and inference. This represents a 1.3x improvement over the Instinct MI300.

A chip alone is worthless without a good platform, and AMD made the Instinct MI325X OAM modules a drop-in replacement for the current MI300X platform, as the two are pin-compatible. A system packing eight MI325X accelerators carries 2 TB of HBM3E memory with 48 TB/s of aggregate memory bandwidth, and achieves 10.4 PetaFLOPS of FP16 and 20.8 PetaFLOPS of FP8 compute performance. AMD uses NVIDIA's H200 HGX as its reference for performance competitiveness, claiming that the MI325X platform outperforms the H200 HGX system by 1.3x across the board in memory bandwidth and FP16/FP8 compute performance, and by 1.8x in memory capacity.
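The quoted eight-GPU platform figures are straight multiples of the per-accelerator specs above; a quick check:

```python
# Sanity check: AMD's eight-GPU MI325X platform numbers are straight multiples
# of the per-accelerator specs (vendor numbers, not measured).
gpus = 8
hbm3e_gb, bw_tbs = 256, 6.0          # per MI325X
fp16_pflops, fp8_pflops = 1.3, 2.6   # per MI325X

print(f"Memory:    {gpus * hbm3e_gb / 1024:.0f} TB")   # 2 TB
print(f"Bandwidth: {gpus * bw_tbs:.0f} TB/s")          # 48 TB/s
print(f"FP16:      {gpus * fp16_pflops:.1f} PFLOPS")   # 10.4
print(f"FP8:       {gpus * fp8_pflops:.1f} PFLOPS")    # 20.8
```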

Intel Officially Launches Core Ultra 200S Series Desktop Processors

Today, Intel launched the new Intel Core Ultra 200S series processor family that will scale AI PC capabilities to desktop platforms and usher in the first enthusiast desktop AI PCs. Led by the Intel Core Ultra 9 processor 285K, the latest generation of enthusiast desktop processors includes five unlocked desktop processors equipped with up to 8 next-gen Performance-cores (P-cores), the fastest cores available for desktop PCs, and up to 16 next-gen Efficient-cores (E-cores), which together deliver up to 14% more performance in multi-threaded workloads than the previous generation. The new family comprises the first NPU-enabled desktop processors for enthusiasts and comes with a built-in Xe GPU with state-of-the-art media support.

"The new Intel Core Ultra 200S series processors deliver on our goals to significantly cut power usage while retaining outstanding gaming performance and delivering leadership compute. The result is a cooler and quieter user experience elevated by new AI gaming and creation capabilities enabled by the NPU, and leadership media performance that leverages our growing graphics portfolio." - Robert Hallock, vice president and general manager of AI and Technical Marketing, Client Computing Group.

Our preview of the Intel Core Ultra 2-series Arrow Lake-S desktop processor family is also live

Micron Updates Corporate Logo with "Ahead of The Curve" Design

Today, Micron updated its corporate logo with new symbolism. The redesign comes as Micron celebrates more than four decades of technological advancement in the semiconductor industry. The new logo features a distinctive silicon color, paying homage to the wafers at the core of Micron's products, while its curved lettering represents the company's ability to stay ahead of industry trends and adapt to rapid technological change. The design also incorporates vibrant gradient colors inspired by light reflections on the wafers at the heart of Micron's memory and storage products.

This rebranding effort coincides with Micron's expanding role in AI, where memory and storage innovations are increasingly crucial. The company has positioned itself as more than a commodity memory supplier, now offering leadership solutions for AI data centers, high-performance computing, and AI-enabled devices. Micron has come a long way from its original 64K DRAM in 1981 to today's HBM3E DRAM, and it now offers a range of HBM products, graphics memory powering consumer GPUs, CXL memory modules, and DRAM components and modules.

Astera Labs Introduces New Portfolio of Fabric Switches Purpose-Built for AI Infrastructure at Cloud-Scale

Astera Labs, Inc., a global leader in semiconductor-based connectivity solutions for AI and cloud infrastructure, today announced a new portfolio of fabric switches, including the industry's first PCIe 6 switch, built from the ground up for demanding AI workloads in accelerated computing platforms deployed at cloud scale. The Scorpio Smart Fabric Switch portfolio is optimized for AI dataflows to deliver maximum predictable performance per watt, high reliability, easy cloud-scale deployment, reduced time-to-market, and lower total cost of ownership.

The Scorpio Smart Fabric Switch portfolio features two application-specific product lines with a multi-generational roadmap:
  • Scorpio P-Series for GPU-to-CPU/NIC/SSD PCIe 6 connectivity: architected to support mixed-traffic head-node connectivity across a diverse ecosystem of PCIe hosts and endpoints.
  • Scorpio X-Series for back-end GPU clustering: architected to deliver the highest back-end GPU-to-GPU bandwidth with platform-specific customization.

MediaTek Announces Dimensity 9400 Flagship SoC with All Big Core Design

MediaTek today launched the Dimensity 9400, the company's new flagship smartphone chipset optimized for edge-AI applications, immersive gaming, incredible photography, and more. The Dimensity 9400, the fourth and latest in MediaTek's flagship mobile SoC lineup, offers a massive boost in performance with its second-generation All Big Core design built on Arm's v9.2 CPU architecture, combined with the most advanced GPU and NPU for extreme performance in a super power-efficient design.

The Dimensity 9400 adopts MediaTek's second-gen All Big Core design, integrating one Arm Cortex-X925 core running at over 3.62 GHz, combined with 3x Cortex-X4 and 4x Cortex-A720 cores. This design offers 35% faster single-core and 28% faster multi-core performance compared to MediaTek's previous-generation flagship chipset, the Dimensity 9300. Built on TSMC's second-generation 3 nm process, the Dimensity 9400 is up to 40% more power-efficient than its predecessor, allowing users to enjoy longer battery life.

NVIDIA "Blackwell" GB200 Server Dedicates Two-Thirds of Space to Cooling at Microsoft Azure

Late on Tuesday, Microsoft Azure shared an interesting picture on social media platform X, showcasing the pinnacle of GPU-accelerated servers: NVIDIA "Blackwell" GB200-powered AI systems. Microsoft is one of NVIDIA's largest customers, and the company often receives products first to integrate into its cloud and company infrastructure; NVIDIA even takes feedback from companies like Microsoft when designing future products, such as the now-canceled NVL36x2 system. The picture below shows a massive cluster in which roughly one-third of the rack holds the compute elements, while a gigantic two-thirds is dedicated to closed-loop liquid cooling.

The entire system is connected using InfiniBand networking, a standard for GPU-accelerated systems thanks to its low-latency packet transfer. While details of the system are scarce, we can see that the integrated closed-loop liquid cooling allows the GPU racks to use a 1U form factor for increased density. Given that these systems will go into the wider Microsoft Azure data centers, they need to be easy to maintain and cool. There are limits to the power and heat output that Microsoft's data centers can handle, so these systems are built to fit the internal specifications Microsoft designs. There are more compute-dense systems, of course, like NVIDIA's NVL72, but hyperscalers usually opt for custom solutions that fit their data center specifications. Finally, Microsoft noted that we can expect more details at the Microsoft Ignite conference in November, where we will learn more about its GB200-powered AI systems.

Supermicro Currently Shipping Over 100,000 GPUs Per Quarter in its Complete Rack Scale Liquid Cooled Servers

Supermicro, Inc., a Total IT Solution Provider for Cloud, AI/ML, Storage, and 5G/Edge, is announcing a complete liquid-cooling solution that includes powerful Coolant Distribution Units (CDUs), cold plates, Coolant Distribution Manifolds (CDMs), cooling towers, and end-to-end management software. This complete solution reduces ongoing power costs as well as Day 0 hardware acquisition and data center cooling infrastructure costs. The entire end-to-end, data-center-scale liquid-cooling solution is available directly from Supermicro.

"Supermicro continues to innovate, delivering full data center plug-and-play rack scale liquid cooling solutions," said Charles Liang, CEO and president of Supermicro. "Our complete liquid cooling solutions, including SuperCloud Composer for the entire life-cycle management of all components, are now cooling massive, state-of-the-art AI factories, reducing costs and improving performance. The combination of Supermicro deployment experience and delivering innovative technology is resulting in data center operators coming to Supermicro to meet their technical and financial goals for both the construction of greenfield sites and the modernization of existing data centers. Since Supermicro supplies all the components, the time to deployment and online are measured in weeks, not months."

Fujitsu and Supermicro Collaborate to Develop Green Arm-Based AI Computing Technology and Liquid-cooled Datacenter Solutions

Fujitsu Limited and Supermicro, Inc. (NASDAQ: SMCI) today announced a long-term strategic engagement in technology and business to develop and market a platform built around Fujitsu's future Arm-based "FUJITSU-MONAKA" processor, which is designed for high performance and energy efficiency and targeted for release in 2027. The two companies will also collaborate on developing liquid-cooled systems for HPC, generative AI, and next-generation green data centers.

"Supermicro is excited to collaborate with Fujitsu to deliver state-of-the-art servers and solutions that are high performance, power efficient, and cost-optimized," said Charles Liang, president and CEO of Supermicro. "These systems will be optimized to support a broad range of workloads in AI, HPC, cloud and edge environments. The two companies will focus on green IT designs with energy-saving architectures, such as liquid cooling rack scale PnP, to minimize technology's environmental impact."

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for better inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs delivering 2,592 Arm Neoverse V2 cores and 17 TB of LPDDR5X memory with 18.4 TB/s of aggregate bandwidth, alongside 72 Blackwell GB200 SXM GPUs with a massive 13.5 TB of combined HBM3e running at 576 TB/s of aggregate bandwidth.
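Those aggregate figures decompose cleanly into per-device numbers, which makes for a handy sanity check:

```python
# Decomposing NVIDIA's quoted NVL72 aggregates into per-device figures.
grace_cpus, cores_per_cpu = 36, 72   # each Grace CPU packs 72 Neoverse V2 cores
blackwell_gpus = 72

print(f"CPU cores:        {grace_cpus * cores_per_cpu}")        # 2,592
print(f"HBM3e per GPU:    {13.5e3 / blackwell_gpus:.1f} GB")    # 187.5 GB
print(f"HBM3e BW per GPU: {576 / blackwell_gpus:.0f} TB/s")     # 8 TB/s
```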

However, this shift presents significant challenges. The NVL72's power consumption of around 120 kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in a dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs as it positions itself as a leader in high-performance computing solutions.

Thermal Grizzly Unveils the New WireView Pro GPU

The WireView Pro GPU was developed in collaboration with Jon "elmor" Sandström, a renowned hardware R&D engineer, extreme overclocker, and founder of Elmor Labs, to introduce new functionalities. To better protect the graphics card from potential damage, the Pro version of the WireView includes sensor pin detection that recognizes whether the 12V-2x6 power connector is correctly plugged into the power supply.

Another new feature is the temperature sensors on the PCB of the WireView Pro GPU, which measure the temperature at the power connectors. Users can set a threshold via the WireView Pro GPU, which triggers an acoustic alarm when exceeded. Additionally, an alarm can be set to trigger when a defined current level is exceeded. The WireView Pro GPU also includes two additional temperature sensors that can be connected to monitor, for example, the temperature of the graphics card's memory or voltage regulators.

iPhone 16 Pro Max Testing Reveals A18 Pro Still Limited in Raster Performance Despite Improved Ray Tracing

Apple recently launched the iPhone 16 Pro and Pro Max with the company's new A18 Pro SoC, and in its presentation, Apple claimed the new SoC offered up to 20% faster gaming performance than the previous generation. While this may be true in certain scenarios, recent testing in Alien Isolation has revealed that the A18 Pro's GPU still has some shortcomings when it comes to gaming.

According to the tests run by MrMacRightPlus, the Apple iPhone 16 Pro Max is barely able to maintain 30 FPS in Alien Isolation when running at its native 2868×1320 pixel resolution. While Alien Isolation is a AAA title that was ported to the iPhone, it is still a 10-year-old game, meaning it should be fairly easy to run. Lowering the in-game resolution, however, results in a substantial improvement to the A18 Pro's performance, with the game reaching 60 FPS after the change. This 30 FPS limitation may not all be down to a lack of performance from the A18 Pro SoC, though.

NVIDIA GeForce RTX 5090 and RTX 5080 Specifications Surface, Showing Larger SKU Segmentation

Thanks to renowned NVIDIA hardware leaker kopite7kimi on X, we are getting information about the final configurations of NVIDIA's first wave of GeForce RTX 50 series "Blackwell" graphics cards. The two leaked GPUs are the GeForce RTX 5090 and RTX 5080, which now show a more significant gap between the xx80 and xx90 SKUs. Starting with the highest-end GeForce RTX 5090: NVIDIA has decided to use the GB202-300-A1 die with 21,760 FP32 CUDA cores enabled. Accompanying the massive 170 SM GPU configuration, the RTX 5090 carries 32 GB of GDDR7 memory on a 512-bit bus, with each GDDR7 die running at 28 Gbps, which translates to 1,792 GB/s of memory bandwidth. All of this is confined to a 600 W TGP.

With the GeForce RTX 5080, NVIDIA has decided to further separate its xx80 and xx90 SKUs. The RTX 5080 has 10,752 FP32 CUDA cores paired with 16 GB of GDDR7 memory on a 256-bit bus. With GDDR7 running at 28 Gbps, the memory bandwidth is also halved, at 896 GB/s. This SKU uses a GB203-400-A1 die designed to run within a 400 W TGP power envelope. For reference, the RTX 4090 has 68% more CUDA cores than the RTX 4080, while the rumored RTX 5090 has around 102% more CUDA cores than the rumored RTX 5080, meaning NVIDIA is separating its top SKUs even further. We are curious to see at what price points NVIDIA places its upcoming GPUs so that we can compare the generational updates and the widened gap between the xx80 and xx90 models.
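As an aside, both bandwidth figures follow directly from the leaked bus widths and the 28 Gbps per-pin data rate:

```python
# GDDR7 bandwidth = bus width (bits) x per-pin data rate (Gbps) / 8 bits-per-byte.
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits * data_rate_gbps / 8

print(memory_bandwidth_gbs(512, 28))  # RTX 5090: 1792.0 GB/s
print(memory_bandwidth_gbs(256, 28))  # RTX 5080: 896.0 GB/s
```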

Intel Launches Gaudi 3 AI Accelerator and P-Core Xeon 6 CPU

As AI continues to revolutionize industries, enterprises are increasingly in need of infrastructure that is both cost-effective and available for rapid development and deployment. To meet this demand head-on, Intel today launched Xeon 6 with Performance-cores (P-cores) and Gaudi 3 AI accelerators, bolstering the company's commitment to deliver powerful AI systems with optimal performance per watt and lower total cost of ownership (TCO).

"Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software and developer tools," said Justin Hotard, Intel executive vice president and general manager of the Data Center and Artificial Intelligence Group. "With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency and security."

Microsoft DirectX 12 Shifts to SPIR-V as Default Interchange Format

Microsoft's Direct3D and HLSL teams have unveiled plans to integrate SPIR-V support into DirectX 12 with the upcoming release of Shader Model 7. This significant transition marks a new era in GPU programmability, as it aims to unify the intermediate representation for graphical-shader stages and compute kernels. SPIR-V, an open standard intermediate representation for graphics and compute shaders, will replace the proprietary DirectX Intermediate Language (DXIL) as the shader interchange format for DirectX 12. The adoption of SPIR-V is expected to ease development processes across multiple GPU runtime environments. By embracing this open standard, Microsoft aims to enhance HLSL's position as the premier language for compiling graphics and compute shaders across various devices and APIs. This transition is part of a multi-year development process, during which Microsoft will work closely with The Khronos Group and the LLVM Project. The company has joined Khronos' SPIR and Vulkan working groups to ensure smooth collaboration and rapid feature adoption.

While the transition will take several years, Microsoft is providing early notice so developers and partners can plan accordingly. The company will offer translation tools between SPIR-V and DXIL to facilitate a gradual transition for both application and driver developers. For those unfamiliar with graphics development: graphics APIs ship with a virtual instruction set architecture (ISA) that abstracts standard hardware features at a higher level. Because GPUs don't share a common ISA the way CPUs do (x86, Arm, RISC-V), this virtual ISA papers over GPU architecture differences and allows APIs like DirectX and Vulkan to run on all of them. Instead of spreading support across several formats like DXIL, Microsoft is embracing the open SPIR-V standard, which should become the de facto interchange format for API developers, letting them focus on new features instead of constantly replicating each other's work. While DXIL is used mainly in gaming, SPIR-V has also been adopted in high-performance computing through OpenCL and SYCL, and it already has a gaming presence through the Vulkan API; we now expect to see SPIR-V in DirectX 12 games as well.
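Developers curious about HLSL-to-SPIR-V compilation don't have to wait: the open-source DXC compiler already has a SPIR-V backend for its Vulkan path. A minimal sketch, assuming dxc is on the PATH and a shader.hlsl with a main entry point exists:

```python
# Compiling an HLSL pixel shader to SPIR-V with DXC's existing Vulkan backend.
import subprocess

subprocess.run(
    [
        "dxc",
        "-T", "ps_6_0",       # target profile: pixel shader, Shader Model 6.0
        "-E", "main",         # entry-point function in the HLSL source
        "-spirv",             # emit SPIR-V instead of the default DXIL
        "-Fo", "shader.spv",  # output file
        "shader.hlsl",
    ],
    check=True,  # raise if compilation fails
)
```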

Intel Releases Arc GPU Graphics Drivers 101.6077 Beta

Today Intel released its latest Arc GPU graphics drivers, with version 101.6077 beta hitting our servers. This latest version brings game-ready support for titles like Dead Rising Deluxe Remaster, Final Fantasy XVI, Frostpunk 2, and God of War Ragnarok. Besides adding support for the latest titles, the 101.6077 beta drivers also fix issues like corruption on certain reflective surfaces during gameplay in The Last of Us Part I (DX12) and corrupted lines on certain textures during gameplay in Age of Empires IV (DX12). A few issues still persist, such as Diablo IV (DX12) intermittently crashing when toggling ray tracing settings during gameplay, and Doom Eternal (VK) exhibiting intermittent flickering corruption in the game menu and during gameplay. A few more notes are included in the changelog and known issues list, which can be seen below.

DOWNLOAD: Intel Arc GPU Graphics Drivers 101.6077 Beta.