News Posts matching #acceleration

Extropic Intends to Accelerate AI through Thermodynamic Computing

Extropic, a pioneer in physics-based computing, this week emerged from stealth mode and announced the release of its Litepaper, which outlines the company's revolutionary approach to AI acceleration through thermodynamic computing. Founded in 2022 by Guillaume Verdon, Extropic has been developing novel chips and algorithms that leverage the natural properties of out-of-equilibrium thermodynamic systems to perform probabilistic computations for generative AI applications in a highly efficient manner. The Litepaper delves into Extropic's groundbreaking computational paradigm, which aims to address the limitations of current digital hardware in handling the complex probability distributions required for generative AI.

Today's algorithms spend around 25% of their time moving numbers around in memory, limiting the speedup achievable by accelerating specific operations. In contrast, Extropic's chips natively accelerate a broad class of probabilistic algorithms by running them physically as a rapid and energy-efficient, physics-based process in their entirety, unlocking a new regime of AI acceleration well beyond what was previously thought achievable. In coming out of stealth, the company has announced the fabrication of a superconducting prototype processor and developments surrounding room-temperature semiconductor-based devices for the broader market, with the goal of revolutionizing the field of AI acceleration and enabling new possibilities in generative AI.
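The ceiling implied by that memory-movement figure can be made concrete with Amdahl's law. The sketch below is an illustration only and is not taken from Extropic's Litepaper: it assumes the quoted 25% of runtime spent moving data cannot be accelerated at all, and the function names `amdahl_speedup` and `amdahl_limit` are hypothetical.

```python
def amdahl_speedup(fixed_fraction: float, accel_factor: float) -> float:
    """Overall speedup when only the non-fixed part of the runtime is accelerated.

    fixed_fraction: share of runtime that is NOT accelerated (assumed here to be
                    the time spent moving numbers around in memory).
    accel_factor:   how many times faster the remaining work runs.
    """
    if not 0.0 < fixed_fraction <= 1.0:
        raise ValueError("fixed_fraction must be in (0, 1]")
    if accel_factor <= 0.0:
        raise ValueError("accel_factor must be positive")
    return 1.0 / (fixed_fraction + (1.0 - fixed_fraction) / accel_factor)


def amdahl_limit(fixed_fraction: float) -> float:
    """Upper bound on the overall speedup as accel_factor grows without limit."""
    if not 0.0 < fixed_fraction <= 1.0:
        raise ValueError("fixed_fraction must be in (0, 1]")
    return 1.0 / fixed_fraction


if __name__ == "__main__":
    memory_share = 0.25  # figure quoted in the article
    for factor in (2, 10, 100, 1000):
        overall = amdahl_speedup(memory_share, factor)
        print(f"{factor:>5}x on compute -> {overall:.2f}x overall")
    print(f"ceiling: {amdahl_limit(memory_share):.1f}x")
```

Under this assumption, no amount of acceleration of individual operations pushes the overall speedup past 4x, which is the limitation the paragraph above describes.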

Mesa CPU-based Vulkan Driver Gets Ray Tracing Support - Quake II Performance Hits 1 FPS

Konstantin Seurer, a Mesa developer, has spent the past couple of months working on CPU-based Vulkan ray tracing. Naturally, some folks will express scepticism about this project's practicality. Seurer has already set expectations with a brief message: "don't ask about performance." His GitLab merge request page attracted Michael Larabel's attention, and the Phoronix founder and principal author was suitably impressed with Seurer's coding wizardry. Larabel wrote that Seurer "managed to implement support for VK_KHR_acceleration_structure, VK_KHR_deferred_host_operations, and VK_KHR_ray_query for Lavapipe. This Lavapipe Vulkan ray tracing support is based in part on porting code from the emulated ray tracing worked on for RADV with older Radeon GPUs." A lone screenshot provided evidence of Quake II running at 1 FPS with Vulkan ray tracing enabled. This "atrocious" performance was achieved thanks to a Mesa Lavapipe driver "implementing the Vulkan API for CPU-based execution."

VideoCardz has highlighted an older example of CPU-based rendering techniques: "this is not the first time we heard about ray tracing on the CPU in Quake. In 2008, Intel demonstrated Enemy Territory: Quake Wars running at 720p resolution at 14 to 29 FPS on 16 core and 20-35 FPS at 24 core CPUs (quad-socket). The basic implementation of ray tracing in 2008 is not comparable to complex ray tracing techniques designed for GPUs, thus the performance on modern system is actually much lower. Beyond that, that game was specifically designed for the Intel architecture and used a specific API to achieve that. Sadly, the original ET demo is no longer available, it would be interesting to see how it performs today." CPU-based Vulkan ray tracing is expected to hit public distribution channels with the rollout of Mesa 24.1. Several members of the Phoronix community reckon that modern AMD Threadripper PRO processors have the potential to post double-digit in-game frame rates.

Qualcomm AI Hub Introduced at MWC 2024

Qualcomm Technologies, Inc. unveiled its latest advancements in artificial intelligence (AI) at Mobile World Congress (MWC) Barcelona. From the new Qualcomm AI Hub, to cutting-edge research breakthroughs and a display of commercial AI-enabled devices, Qualcomm Technologies is empowering developers and revolutionizing user experiences across a wide range of devices powered by Snapdragon and Qualcomm platforms.

"With Snapdragon 8 Gen 3 for smartphones and Snapdragon X Elite for PCs, we sparked commercialization of on-device AI at scale. Now with the Qualcomm AI Hub, we will empower developers to fully harness the potential of these cutting-edge technologies and create captivating AI-enabled apps," said Durga Malladi, senior vice president and general manager, technology planning and edge solutions, Qualcomm Technologies, Inc. "The Qualcomm AI Hub provides developers with a comprehensive AI model library to quickly and easily integrate pre-optimized AI models into their applications, leading to faster, more reliable and private user experiences."

Windows 11 DirectML Preview Supports Intel Core Ultra NPUs

Chad Pralle, Principal Technical Program Manager at Microsoft's Windows AI NPU division, has introduced the DirectML 1.13.1 and ONNX Runtime 1.17 APIs. This appears to be a collaborative effort: Samsung was roped in to some degree, according to Microsoft's announcement and a recent Team Blue blog entry. Pralle and his team are suitably proud of this joint effort that involved open source models: "we are excited to announce developer preview support for NPU acceleration in DirectML, the machine learning platform API for Windows. This developer preview enables support for a subset of models on new Windows 11 devices with Intel Core Ultra processors with Intel AI boost."

Further on in Microsoft's introductory piece, Samsung Electronics is announced as a key launch partner—Hwang-Yoon Shim, VP and Head of New Computing H/W R&D Group stated that: "NPUs are emerging as a critical resource for broadly delivering efficient machine learning experiences to users, and Windows DirectML is one of the most efficient ways for Samsung's developers to make those experiences for Windows." Microsoft notes that NPU support in DirectML is still "a work in progress," but Pralle and his colleagues are eager to receive user feedback from the testing community. It is currently "only compatible with a subset of machine learning models, some models may not run at all or may have high latency or low accuracy." They hope to implement improvements in the near future. The release is limited to modern Team Blue hardware, so NPU-onboard AMD devices are excluded at this point in time, naturally.

Two New Marvell OCTEON 10 Processors Bring Server-Class Performance to Networking Devices

Marvell Technology, a leader in data infrastructure semiconductor solutions, is enabling networking equipment and firewall manufacturers to achieve breakthrough levels of performance and efficiency with two new OCTEON 10 data processing units (DPUs), the OCTEON 10 CN102 and OCTEON 10 CN103. The 5 nm OCTEON CN102 and CN103, broadly available to OEMs for product design and pilot production, are optimized for data and control plane applications in routers, firewalls, 5G small cells, SD-WAN appliances, and control plane applications in top-of-rack switches and line card controllers. Several of the world's largest networking equipment manufacturers have already incorporated the OCTEON 10 CN102 into a number of product designs.

Containing up to eight Arm Neoverse N2 cores, OCTEON 10 CN102 and CN103 deliver 3x the performance of Marvell's current DPU solutions for devices while reducing power consumption by 50% to 25 W. Achieving SPEC CPU (2017) integer rate (SPECint) scores of 36.5, OCTEON 10 CN102 and CN103 are able to deliver nearly 1.5 SPECint points per Watt. The chips can serve as an offload DPU for host processors or as the primary processor in devices; advanced performance per watt also enables OEMs to design fanless systems to simplify systems and further reduce cost, maintenance and power consumption.
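The efficiency figure quoted above follows directly from the two numbers in the announcement. The short sketch below reproduces that arithmetic; the function names are hypothetical, and the prior-generation power figure is only an inference from the "50% to 25 W" wording, not a number Marvell states.

```python
def points_per_watt(score: float, watts: float) -> float:
    """Benchmark score divided by power draw."""
    if watts <= 0.0:
        raise ValueError("watts must be positive")
    return score / watts


def power_before_cut(watts_after: float, reduction: float) -> float:
    """Power draw before a fractional reduction (0.5 means a 50% cut)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return watts_after / (1.0 - reduction)


if __name__ == "__main__":
    specint_rate = 36.5  # SPECint score quoted in the announcement
    power_w = 25.0       # power figure quoted in the announcement

    print(f"{points_per_watt(specint_rate, power_w):.2f} SPECint points per watt")
    # Inferred, not stated: what a 50% cut down to 25 W implies for the older parts.
    print(f"{power_before_cut(power_w, 0.50):.0f} W implied for the previous generation")
```

The ratio works out to 1.46, which matches the "nearly 1.5" in the text.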

Intel "Emerald Rapids" Die Configuration Leaks, More Details Appear

Thanks to the leaked slides obtained by @InstLatX64, we have more details and some performance estimates about Intel's upcoming 5th Generation Xeon "Emerald Rapids" CPUs, which boast a significant performance leap over their predecessors. Leading the Emerald Rapids family is the top-end SKU, the Xeon 8592+, which features 64 cores and 128 threads, backed by a massive 320 MB L3 cache pool. The upcoming lineup shifts from a 4-tile to a 2-tile design to minimize latency and improve performance. The design utilizes the P-Core architecture under the Raptor Cove ISA and promises up to 40% faster performance than the current 4th Generation "Sapphire Rapids" CPUs in AI applications that use Intel's AMX engine. Each chiplet has 35 cores, three of which are disabled, and each tile has two DDR5-5600 MT/s memory controllers, each operating two memory channels, which adds up to an eight-channel design across both tiles. There are three PCIe controllers per die, making it six in total.

Newer protocols and AI accelerators also back the upcoming lineup. Now, the Emerald Rapids family supports the Compute Express Link (CXL) Types 1/2/3 in addition to up to 80 PCIe Gen 5 lanes and enhanced Intel Ultra Path Interconnect (UPI). There are four UPI controllers spread over two dies. Moreover, features like the four on-die Intel Accelerator Engines, optimized power mode, and up to 17% improvement in general-purpose workloads make it seem like a big step up from the current generation. Much of this technology is found on the existing Sapphire Rapids SKUs, with the new generation enhancing the AI processing capability further. The 5th Generation Emerald Rapids designs are supposed to be official on December 14th, just a few days away.
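The per-tile figures in the leak can be tallied into package totals as a quick sanity check. This is a minimal sketch: the `Tile` class and `package_totals` helper are hypothetical names, and two threads per core is an assumption inferred from the 64-core, 128-thread figure rather than something the slides are quoted as stating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Per-tile resources as described in the leaked slides."""
    physical_cores: int = 35
    disabled_cores: int = 3
    memory_controllers: int = 2
    channels_per_controller: int = 2
    pcie_controllers: int = 3

    @property
    def active_cores(self) -> int:
        return self.physical_cores - self.disabled_cores

    @property
    def memory_channels(self) -> int:
        return self.memory_controllers * self.channels_per_controller


def package_totals(tile: Tile, tiles: int = 2, threads_per_core: int = 2) -> dict:
    """Sum per-tile resources over the whole package.

    threads_per_core=2 is an assumption, not a figure from the slides.
    """
    cores = tile.active_cores * tiles
    return {
        "cores": cores,
        "threads": cores * threads_per_core,
        "memory_channels": tile.memory_channels * tiles,
        "pcie_controllers": tile.pcie_controllers * tiles,
    }


if __name__ == "__main__":
    for name, value in package_totals(Tile()).items():
        print(f"{name}: {value}")
```

With the defaults, the totals agree with the article: 64 cores, 128 threads, eight memory channels, and six PCIe controllers.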

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a duo of custom-designed silicon made to accelerate AI and excel in cloud workloads. First of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that features a 64-bit Armv9 instruction set, implemented in a cloud-native design that is set to become a part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims a performance target of up to 40% higher than the current generation of Arm servers running on Azure cloud. The SoC uses Arm's Neoverse CSS platform customized for Microsoft, with presumably Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's project Athena now has the name of Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8-bit, for maximum performance. The chip is currently being tested on GPT 3.5 Turbo, and we have yet to see performance figures and comparisons with competing hardware such as NVIDIA's H100/H200 and AMD's MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator and uses a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.

CyberLink and Intel Work Together to Lead the Gen-AI Era, Enhancing the AI Content Creation Experience

CyberLink, a leader in digital creative editing software and artificial intelligence (AI), attended Intel Innovation Taipei 2023. As a long-standing Intel independent software vendor (ISV) partner, CyberLink demonstrated how its latest generative AI technology can be used to easily create photo and video content with tools such as AI Business Outfits, AI Product Background, and AI Video to Anime. During the forum, CyberLink Chairman and CEO Jau Huang shared how Intel's upcoming AI PC is expected to benefit content creators by bringing generative AI from cloud computing to personal computers. This shift should not only reduce the cost of AI computing but also eliminate users' privacy concerns, fostering an entirely new AI content creation experience in which it is even easier to unleash creativity with generative AI.

Intel Innovation Taipei was kicked off by Intel CEO Pat Gelsinger. The event highlighted four major themes: artificial intelligence, edge to cloud, next-generation systems and platforms, and advanced technologies, as well as the latest results of cooperation with Taiwan ecosystem partners, including the latest AI PCs.

Supermicro Expands AI Solutions with the Upcoming NVIDIA HGX H200 and MGX Grace Hopper Platforms Featuring HBM3e Memory

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is expanding its AI reach with the upcoming support for the new NVIDIA HGX H200 built with H200 Tensor Core GPUs. Supermicro's industry leading AI platforms, including 8U and 4U Universal GPU Systems, are drop-in ready for the HGX H200 in 8-GPU and 4-GPU configurations, with nearly 2x the capacity and 1.4x higher bandwidth from HBM3e memory compared to the NVIDIA H100 Tensor Core GPU. In addition, the broadest portfolio of Supermicro NVIDIA MGX systems supports the upcoming NVIDIA Grace Hopper Superchip with HBM3e memory. With unprecedented performance, scalability, and reliability, Supermicro's rack scale AI solutions accelerate the performance of computationally intensive generative AI, large language model (LLM) training, and HPC applications while meeting the evolving demands of growing model sizes. Using the building block architecture, Supermicro can quickly bring new technology to market, enabling customers to become more productive sooner.

Supermicro is also introducing the industry's highest density server with NVIDIA HGX H100 8-GPU systems in a liquid cooled 4U system, utilizing the latest Supermicro liquid cooling solution. The industry's most compact high performance GPU server enables data center operators to reduce footprints and energy costs while offering the highest performance AI training capacity available in a single rack. With the highest density GPU systems, organizations can reduce their TCO by leveraging cutting-edge liquid cooling solutions.

Intel Launches Industry's First AI PC Acceleration Program

Building on the AI PC use cases shared at Innovation 2023, Intel today launched the AI PC Acceleration Program, a global innovation initiative designed to accelerate the pace of AI development across the PC industry.

The program aims to connect independent hardware vendors (IHVs) and independent software vendors (ISVs) with Intel resources that include AI toolchains, co-engineering, hardware, design resources, technical expertise and co-marketing opportunities. These resources will help the ecosystem take full advantage of Intel Core Ultra processor technologies and corresponding hardware to maximize AI and machine learning (ML) application performance, accelerate new use cases and connect the wider PC industry to the solutions emerging in the AI PC ecosystem. More information is available on the AI PC Acceleration Program website.

Striking Performance: LLMs up to 4x Faster on GeForce RTX With TensorRT-LLM

Generative AI is one of the most important trends in the history of personal computing, bringing advancements to gaming, creativity, video, productivity, development and more. And GeForce RTX and NVIDIA RTX GPUs, which are packed with dedicated AI processors called Tensor Cores, are bringing the power of generative AI natively to more than 100 million Windows PCs and workstations.

Today, generative AI on PC is getting up to 4x faster via TensorRT-LLM for Windows, an open-source library that accelerates inference performance for the latest AI large language models, like Llama 2 and Code Llama. This follows the announcement of TensorRT-LLM for data centers last month. NVIDIA has also released tools to help developers accelerate their LLMs, including scripts that optimize custom models with TensorRT-LLM, TensorRT-optimized open-source models and a developer reference project that showcases both the speed and quality of LLM responses.

d-Matrix Announces $110 Million in Funding for Corsair Inference Compute Platform

d-Matrix, the leader in high-efficiency generative AI compute for data centers, has closed $110 million in a Series-B funding round led by Singapore-based global investment firm Temasek. The goal of the fundraise is to enable d-Matrix to begin commercializing Corsair, the world's first Digital-In Memory Compute (DIMC), chiplet-based inference compute platform, after the successful launches of its prior Nighthawk, Jayhawk-I and Jayhawk II chiplets.

d-Matrix's recent silicon announcement, Jayhawk II, is the latest example of how the company is working to fundamentally change the physics of memory-bound compute workloads common in generative AI and large language model (LLM) applications. With the explosion of this revolutionary technology over the past nine months, there has never been a greater need to overcome the memory bottleneck and current technology approaches that limit performance and drive up AI compute costs.

MaxLinear Announces Production Availability of Panther III Storage Accelerator OCP Adapter Card

MaxLinear, Inc., a leader in data storage accelerator solutions, today announced the production-release of the OCP 3.0 storage accelerator adapter card for Panther III. The ultra-low latency accelerator is designed to quicken key storage workloads, including database acceleration, storage offload, encryption, compression, and deduplication enablement for maximum data reduction. The Panther III OCP card is ideal for use in modern data centers, including public to edge clouds, enterprise data centers, and telecommunications infrastructure, allowing users to access, process, and transfer data up to 12 times faster than without a storage accelerator. The OCP version of the card is available immediately with a PCIe version available in Q3 2023.

"In an era where the amount of data generated exceeds new storage installations by multiple fold, Panther III helps reduce the massive storage gap while improving TCO per bit stored," said Dylan Patel, Chief Analyst at SemiAnalysis.

Tour de France Bike Designs Developed with NVIDIA RTX GPU Technologies

NVIDIA RTX is spinning new cycles for designs. Trek Bicycle is using GPUs to bring design concepts to life. The Wisconsin-based company, one of the largest bicycle manufacturers in the world, aims to create bikes with the highest-quality craftsmanship. With its new partner Lidl, an international retailer chain, Trek Bicycle also owns a cycling team, now called Lidl-Trek. The team is competing in the annual Tour de France stage race on Trek Bicycle's flagship lineup, which includes the Emonda, Madone and Speed Concept. Many of the team's accessories and equipment, such as the wheels and road race helmets, were also designed at Trek.

Bicycle design involves complex physics—and a key challenge is balancing aerodynamic efficiency with comfort and ride quality. To address this, the team at Trek is using NVIDIA A100 Tensor Core GPUs to run high-fidelity computational fluid dynamics (CFD) simulations, setting new benchmarks for aerodynamics in a bicycle that's also comfortable to ride and handles smoothly. The designers and engineers are further enhancing their workflows using NVIDIA RTX technology in Dell Precision workstations, including the NVIDIA RTX A5500 GPU, as well as a Dell Precision 7920 running dual RTX A6000 GPUs.

Adlink's Next-Gen IPC Strives to Revolutionize Industry Use Cases at the Edge

ADLINK Technology Inc., a global leader in edge computing, and a Titanium member of the Intel Partner Alliance, is proud to announce the launch of its latest MVP Series fanless modular computers—the MVP-5200 Compact Modular Industrial Computers and MVP-6200 Expandable Modular Industrial Computers—powered by 12/13th Gen Intel Core i9/i7/i5/i3 and Celeron processors. Featuring Intel R680E chipset and supporting up to 65 W, the computers can also incorporate GPU cards in a rugged package suitable for AI inferencing at the Edge, and can be used for but not limited to smart manufacturing, semiconductor equipment, and warehouse applications.

The MVP-5200/MVP-6200 series, though expandable, remains compact, with support for up to 4 PCI/PCIe slots that allow for performance acceleration through GPUs, accelerators, and other expansion cards. Comprehensive modularized options and the ease of configuration can effectively reduce lead times for customers' diverse requirements. In addition, ADLINK also offers a broad range of pre-validated expansion cards, such as GPU, motion, vision, and I/O embedded cards, all of which can be easily deployed for your industrial applications.

Valve Releases Major Steam Desktop Client Update

Hello! We're excited to announce that we've just shipped a new version of the Steam Client to everyone. This update includes all the new Steam Desktop features that have been tested and fine-tuned in the beta branch. Before we get into the details, we want to thank our beta testers really quick - we couldn't have shipped without all of your invaluable feedback and bug reports!

New framework, new foundation
The most impactful changes in this update aren't immediately visible; much of the work went into changing how we share code across the Steam Desktop Client, Big Picture mode, and Steam Deck. These changes also mean quicker implementation and iteration of new features. For example, many of the features in this update (like Notes in the overlay) are simultaneously shipping on Steam Deck because of the shared codebase.

Steam On Linux Restores Hardware Acceleration by Default for NVIDIA GPUs

A previous attempt to enable NVIDIA GPU video hardware acceleration by default within Steam running on Linux platforms was thwarted by numerous bugs and faults - adopters of the mid-May Steam Client Beta update reported their experiences of various crashes encountered in Valve's user interface. The embattled software engineering team has since investigated this matter and released a new update (yesterday).

The June 6th Steam Client Beta patch notes list a number of general improvements along with Linux-specific adjustments: a fix for "a crash when Steam windows were closed with hardware (HW) acceleration enabled on NVIDIA GPUs" and the re-enabling of "HW acceleration by default for NVIDIA GPUs." Early reports indicate that Linux gamers are having a smoother time after installing yesterday's update.
May 1st, 2024 05:35 EDT