News Posts matching #64-bit

Imagination's new Catapult CPU is Driving RISC-V Device Adoption

Imagination Technologies today unveils the next product in the Catapult CPU IP range, the Imagination APXM-6200 CPU: a RISC-V application processor with compelling performance density, seamless security and the artificial intelligence capabilities needed to support the compute and intuitive user experience needs for next generation consumer and industrial devices.

"The number of RISC-V based devices is skyrocketing with over 16Bn units forecast by 2030, and the consumer market is behind much of this growth," says Rich Wawrzyniak, Principal Analyst at SHD Group. "One fifth of all consumer devices will have a RISC-V based CPU by the end of this decade. Imagination is set to be a force in RISC-V with a strategy that prioritises quality and ease of adoption. Products like APXM-6200 are exactly what will help RISC-V achieve the promised success."

ScaleFlux To Integrate Arm Cortex-R82 Processors in Its Next-Generation Enterprise SSD Controllers

ScaleFlux, a leader in deploying computational storage at scale, today announced its commitment to integrating the Arm Cortex-R82 processor in its forthcoming line of enterprise Solid State Drive (SSD) controllers. The Cortex-R82 is the highest performance real-time processor from Arm and the first to implement the 64-bit Armv8-R AArch64 architecture, representing a significant advancement in processing power and efficiency for enterprise storage solutions.

ScaleFlux's adoption of the Cortex-R82 is a strategic move to leverage the processor's high performance and energy efficiency. This collaboration underscores ScaleFlux's dedication to delivering cutting-edge technology in its SSD controllers, enhancing data processing capabilities and efficiency for data center and AI infrastructure worldwide.

Latest HWInfo64 Beta Arrives with OSD, Drops Windows XP Support

HWiNFO v7.73-5370 Beta was released yesterday, and the newly updated version includes a fully integrated On-Screen Display (OSD) feature. In the past, users have had to rely on external tools, for example RivaTuner Statistics Server (RTSS), to get vital information displayed on their monitor(s) of choice. Martin, HWiNFO's main author, revealed that this new addition is based on a Team Blue toolset. His February 13 official forum post stated: "(this) feature is based on Intel PresentMon and allows showing any value as a text or graph (with multiple values). Position, text font, size, weight and colors can be individually defined. It should work with any engine like DirectX 11, 12, OpenGL, Vulkan. The OSD is automatically placed over the most graphics intensive application currently running but it can also be manually targeted." Five days later he followed up with further information: the new OSD is "available in HWiNFO64 only," therefore making it incompatible with Windows XP. Similarly, MSI's Afterburner 4.6.6 Beta landed just over a week ago, without support for Microsoft's 2001-vintage operating system.

Martin reckons that the change could affect Windows Vista users: "Use legacy HWiNFO32 on these systems. We don't anticipate that these systems will benefit from 64-bit applications, nor require support of latest HWiNFO64 versions. So the impact of this (sad) limitation should be minimal. In case there will be a reasonable demand for new versions of HWiNFO64 on XP64 it's still possible to build such versions (without OSD support), but currently we don't expect to make such extra effort." Additionally, the popular monitoring application's latest upgrade brings enhanced sensor monitoring for ASUS NUC systems, improved health monitoring on a selection of NVMe drives (connected via Intel RST), and enhanced sensor monitoring on the ASUS TUF GAMING Z790-PRO WIFI motherboard model.

Nuvoton Unveils New Production-Ready Endpoint AI Platform for Machine Learning

Nuvoton is pleased to announce its new Endpoint AI Platform to accelerate the development of fully-featured microcontroller (MCU) AI products. These solutions are enabled by Nuvoton's powerful new MCU and MPU silicon, including the NuMicro M55M1 equipped with Ethos U55 NPU, NuMicro MA35D1, and NuMicro M467 series. These MCUs are a valuable addition to the modern AI-centric computing toolkit and demonstrate how Nuvoton continues to work closely with Arm and other companies to develop a user-friendly and complete Endpoint AI Ecosystem.

Development on these platforms is made easy by Nuvoton's NuEdgeWise: a well-rounded, simple-to-adopt tool for machine learning (ML) development, which is nonetheless suitable for cutting-edge tasks. Together, this powerful core hardware, combined with unique rich development tools, cements Nuvoton's reputation as a leading microcontroller platform provider. These new single-chip-based platforms are ideal for applications including smart home appliances and security, smart city services, industry, agriculture, entertainment, environmental protection, education, highly accurate voice-control tasks, and sports, health, and fitness.

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a pair of custom-designed chips made to accelerate AI and excel in cloud workloads. The first of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that features a 64-bit Armv9 instruction set, implemented in a cloud-native design that is set to become a part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims a performance target of up to 40% higher than the current generation of Arm servers running on Azure cloud. The SoC uses Arm's Neoverse CSS platform customized for Microsoft, presumably with Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's Project Athena is now named the Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8-bit, for maximum performance. The chip is currently being tested on GPT 3.5 Turbo, and we have yet to see performance figures and comparisons with competing hardware such as NVIDIA's H100/H200 and AMD's MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator and uses a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.

Synopsys Expands Its ARC Processor IP Portfolio with New RISC-V Family

Synopsys, Inc. (Nasdaq: SNPS) today announced it has extended its ARC Processor IP portfolio to include new RISC-V ARC-V Processor IP, enabling customers to choose from a broad range of flexible, extensible processor options that deliver optimal power-performance efficiency for their target applications. Synopsys leveraged decades of processor IP and software development toolkit experience to develop the new ARC-V Processor IP that is built on the proven microarchitecture of Synopsys' existing ARC Processors, with the added benefit of the expanding RISC-V software ecosystem.

Synopsys ARC-V Processor IP includes high-performance, mid-range, and ultra-low power options, as well as functional safety versions, to address a broad range of application workloads. To accelerate software development, the Synopsys ARC-V Processor IP is supported by the robust and proven Synopsys MetaWare Development Toolkit that generates highly efficient code. In addition, the Synopsys.ai full-stack AI-driven EDA suite is co-optimized with ARC-V Processor IP to provide an out-of-the-box development and verification environment that helps boost productivity and quality-of-results for ARC-V-based SoCs.

Intel Itanium Reaches End of the Road with Linux Kernel Stopping Updates

Today marks the end of support for Itanium's IA-64 architecture in the Linux kernel's 6.7 update, a significant milestone in the winding-down saga of Intel Itanium. Itanium, initially Intel's ambitious venture into 64-bit computing, faced challenges and struggled throughout its existence. It was jointly developed by Intel and HP but encountered delays and lacked compatibility with x86 software, a significant obstacle to its adoption. When AMD introduced x86-64 (AMD64) for its Opteron CPUs, which could run x86 software natively, Intel was compelled to update Xeon with x86-64 technology, leaving Itanium to fade into the background.

Despite ongoing efforts to sustain Itanium, it no longer received annual CPU product updates, and the last update came in 2017. The removal of IA-64 support in the Linux kernel will have a substantial impact since Linux is an essential operating system for Itanium CPUs. Without ongoing updates, the usability of Itanium servers will inevitably decline, pushing the (few) remaining Itanium users, who are most likely already looking to modernize their product stack, to migrate to alternative solutions.

Raspberry Pi Foundation Launches Raspberry Pi 5

It has been over four years since the release of the Raspberry Pi 4, and in that time a lot has changed in the maker board and single-board computer landscape. For the Raspberry Pi Foundation there were struggles with worldwide demand and production capacity brought on by the global pandemic starting in 2020, and plenty of new competitors came onto the scene to offer ready-to-order alternatives to the venerable RPi 4. Today, however, the production woes have been assuaged and a new generation of Raspberry Pi is here: the Raspberry Pi 5.

Raspberry Pi 5 is being announced in advance of availability, unlike every prior RPi device launch. Pre-orders are open with many of the listed Approved Resellers on RPi's website starting today, but unit shipments aren't expected until near the end of October 2023. As part of this pre-order scheme, RPi Foundation is withholding pre-orders from bulk customers and will be dealing in single-unit sales for individuals until at least the end of the year, as well as running some promotions with The MagPi and HackSpace magazines to give priority access to their subscribers. Genuinely nice to see, considering how hard it was for the average Joe to obtain a Pi 4 over the last couple of years. The two announced prices for the RPi 5 are $60 USD for the 4 GB variant and $80 USD for the 8 GB variant, or about $5 USD more than current reseller pricing on comparable configurations of the Raspberry Pi 4.

Andes Announces General Availability of the New AndesCore RISC-V Multicore Vector Processor AX45MPV

Andes Technology, a leading supplier of high efficiency, low-power 32/64-bit RISC-V processor cores and Founding Premier member of RISC-V International, today proudly announces general availability of the high-performance AndesCore AX45MPV multicore vector processor IP. The AX45MPV is the third generation of the award winning AndesCore vector processor series. Equipped with powerful RISC-V vector processing and parallel execution capability, it targets the applications with large volumes of data such as ADAS, AI inference and training, AR/VR, multimedia, robotics, and signal processing.

Andes and Meta began collaborating on datacenter AI with RISC-V vector cores in early 2019. At the end of 2019, Andes unveiled the AndesCore NX27V, marking a significant milestone as the industry's first commercial RISC-V vector processor core, capable of generating up to four 512-bit vector (VLEN) results per cycle. It immediately attracted the attention of worldwide SoC design teams working on AI accelerators, and has landed over a dozen datacenter AI projects. Since then, RISC-V vector processor cores have become the choice for ML and AI chip vendors.

Ampere Computing Creates Gaming on Linux Guide, Runs Steam Proton on Server-class Arm CPUs

Ampere Computing, known for its Altra (Max) and upcoming AmpereOne families of AArch64 server processors tailored for data centers, has released a guide for enthusiasts on running Steam for Linux on these ARM64 processors. This includes using Steam Play (Proton) to play Windows games on these Linux-powered servers. Over the summer, Ampere Computing introduced a GitHub repository detailing the process. While the guide is primarily designed for Ampere Altra/Altra Max and AmpereOne hardware, it can be adapted for other 64-bit Arm platforms. However, a powerful processor is essential to truly appreciate the gaming experience. Additionally, for the 3D OpenGL/Vulkan graphics to function optimally, an Ampere workstation system is more suitable than a headless server.

The guide recommends the Ampere Altra Developer platform paired with an NVIDIA RTX A6000 series graphics card, which supports AArch64 proprietary drivers. The guide uses Box86 and Box64 emulation to run the x86 Steam binaries and other x86/x86-64 games. While there are other options like FEX-Emu and Hangover to enhance the Linux binary experience on AArch64, Box86/Box64 is the preferred choice for gaming on Ampere workstations, as indicated by its mention in Ampere Computing's guide. Once the accelerated AArch64 Linux graphics drivers are working and Box86/Box64 emulation is set up, users can install Steam for Linux. By activating Proton within Steam, it becomes feasible to play Windows-exclusive x86/x86-64 games on Ampere AArch64 workstations or server processors. However, the guide doesn't provide insights into the performance of such a configuration.

Chinese Exascale Sunway Supercomputer has Over 40 Million Cores, 5 ExaFLOPS Mixed-Precision Performance

The Exascale supercomputer arms race is making everyone invest their resources into trying to achieve the number one spot. Some countries, like China, actively participate in the race with little proof of their work, leaving the high-performance computing (HPC) community wondering about Chinese efforts on exascale systems. Today, we have some information regarding the next-generation Sunway system, which is supposed to be China's first exascale supercomputer. Replacing the Sunway TaihuLight, the next-generation Sunway will reportedly boast over 40 million cores in its system. The information comes from an upcoming presentation for the Supercomputing 2023 show in Denver, happening from November 12 to November 17.

The presentation talks about 5 ExaFLOPS in the HPL-MxP benchmark with linear scalability on the 40-million-core Sunway supercomputer. The HPL-MxP benchmark is a mixed-precision HPC benchmark made to test the system's capability in regular HPC workloads that require 64-bit precision and AI workloads that require 32-bit precision. What are those cores? We are not sure. The last-generation Sunway TaihuLight used SW26010 manycore 64-bit RISC processors based on the Sunway architecture, each with 260 cores. There were 40,960 SW26010 CPUs in the system for a total of 10,649,600 cores, which means that the next-generation Sunway system is nearly four times larger from a core-count perspective. We expect some uArch and semiconductor node improvements as well.
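The core-count comparison can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in this post; treating "over 40 million" as exactly 40,000,000 is an assumption made here to obtain a lower bound, and the variable names are illustrative.

```python
# Figures quoted in the post for the last-generation Sunway TaihuLight.
CPUS_TAIHULIGHT = 40_960      # SW26010 processors in the system
CORES_PER_CPU = 260           # cores per SW26010

# Assumption: "over 40 million cores" is treated as a lower bound of exactly 40 million.
NEXT_GEN_CORES_MIN = 40_000_000

taihulight_cores = CPUS_TAIHULIGHT * CORES_PER_CPU
min_ratio = NEXT_GEN_CORES_MIN / taihulight_cores

print(f"TaihuLight cores: {taihulight_cores:,}")
print(f"Next-gen vs. TaihuLight core count: at least {min_ratio:.2f}x")
```

With exactly 40 million cores the ratio comes out a little under four, so the final multiple depends on how far above 40 million the real core count is.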

TerraMaster Launches F2-212, F4-212 and U4-212 Private Cloud NAS

TerraMaster, a professional brand that focuses on providing innovative storage products for homes and businesses, recently released the newly designed F2-212, F4-212 and U4-212 NAS series. The new models support TRAID, the BTRFS file system, Snapshot and TFSS, providing stronger data backup and a better home multimedia experience to meet the requirements of personal and home users. Of the three models, review samples of the F2-212 will be available first, in August.

More Modern Exterior Design
The 212 NAS series adopts brand-new design elements and color matching, giving it a more fashionable and modern appearance, better heat dissipation, lower noise, and more convenient installation and use.

BioWare Confirms "Star Wars: The Old Republic" Under New Management, Announces Layoffs

Gary McKay, General Manager at BioWare, states: "Hello, it's been a little while since I've checked in, and as you might have heard, there's a lot happening here at BioWare. Almost 12 years after launch, Star Wars: The Old Republic (SWTOR) remains a fantastic success, continuing to welcome new players to its vast galaxy and entertaining veteran players with its evolving content. It's the longest-running live service Star Wars game ever and we're enormously proud of the work the team has done in creating, expanding, and maintaining this incredible game. We're delighted to have grown such a dedicated and passionate community through all these years. The future of the game and the community continues to be very bright.

I've been working closely with Keith Kanneg, who leads the SWTOR team, to give the game and the team the best opportunity to grow and evolve. And so, while EA will remain SWTOR's publisher, development of the game will move to our partner and friends at Broadsword, a boutique studio with expertise in managing online games. Both the Broadsword studio and SWTOR team members will be joining forces and working tirelessly to support "every player, every day," ensuring that these worlds and these communities continue to thrive and grow. Their Founder and President, Rob Denton, even has direct experience with SWTOR, having helped lead the team during the development and launch of the game during his time at EA.

Tachyum Readying First Tape-out of its Prodigy SoCs

Tachyum announced today it will cease taking orders for its Prodigy Universal Processor Field Programmable Gate Array (FPGA) emulation system boards effective immediately. The company releases the final Prodigy build for tape-out. New partners and customers who wish to work with Prodigy FPGAs for product evaluation, performance measurements, software development, debugging and compatibility testing can arrange for private testing in Tachyum's facility. As these are shared systems, they can't be used for classified or proprietary data or data subject to regulatory governance.

The Prodigy hardware emulator consists of multiple FPGA and IO boards connected by cables in a rack. A single board with four FPGAs emulates eight Prodigy processor cores (a small fraction of the final Prodigy product design, which consists of 128 cores) including vector and matrix fixed and floating-point processing units. Deploying more FPGAs will improve test cycles by orders of magnitude to achieve target quality, a risk reduction mechanism for early adopters.

Debian 12 Bookworm Released

After 1 year, 9 months, and 28 days of development, the Debian project is proud to present its new stable version 12 (code name bookworm). bookworm will be supported for the next 5 years thanks to the combined work of the Debian Security team and the Debian Long Term Support team.

Following the 2022 General Resolution about non-free firmware, we have introduced a new archive area making it possible to separate non-free firmware from the other non-free packages:
  • non-free-firmware
Most non-free firmware packages have been moved from non-free to non-free-firmware. This separation makes it possible to build a variety of official installation images.
Debian 12 bookworm ships with several desktop environments, such as:
  • Gnome 43,
  • KDE Plasma 5.27,
  • LXDE 11,
  • LXQt 1.2.0,
  • MATE 1.26,
  • Xfce 4.18

Milk-V Pioneer Developer Board Combines 64-Core RISC-V SoC with mATX Modularity

Chinese RISC-V developers Milk-V Technology and SOPHGO recently announced their collaborative open source Milk-V Pioneer developer motherboard and workstation based on the SOPHON SG2042 RISC-V server SoC. The SOPHON SG2042 is a 64-core, 2 GHz SoC based on T-Head Semiconductor's XuanTie C920 64-bit processor design which features clusters of one to four cores, each a 12-stage out-of-order multiple issue superscalar pipeline, and a 128-bit vector engine based on the preliminary RISC-V V Extension version 0.7.1. The SG2042 packs in 64+64 KB (I+D) L1 cache per core, 1 MB of L2 cache per core cluster, 64 MB of L3 system cache, a quad-channel DDR4 controller, and 32 lanes of PCI-E Gen 4. The SG2042 contains no integrated graphics solution.

The Milk-V Pioneer incorporates this highly threaded RISC-V SoC with a modular and expandable standard mATX motherboard featuring four DIMM slots with support for up to 128 GB of DDR4, three full-length PCI-E slots wired for Gen 4 x8, two M.2 M-Key PCI-E Gen 3 x4, one M.2 E-Key for PCI-E 3.0 x1 and USB 2.0, eight USB 3.2 10 Gbps ports, five SATA 6 Gbps ports, and a pair of 2.5G Ethernet ports. The bulk of this I/O runs off an ASMedia ASM 2824 PCI-E switch; however, the PCI-E Gen 4 ports run directly off the SG2042 SoC. Milk-V Pioneer is also being offered as a prebuilt small form factor workstation which puts the board into a small portable chassis called the Pioneer Box. The Pioneer Box includes 64 GB of DDR4-3200, 1 TB M.2 SSD, an Intel X520-T2 10G network card, an AMD Radeon R5 230 graphics card for display, and a 350 W power supply.

Intel Exploring x86S Architecture, Envisions an Unadulterated 64-bit Future

Intel has published a highly involved and extensive whitepaper on the topic of streamlining its CPU architectures, most notably by focusing on a purely 64-bit specification, and consequently dropping legacy 32-bit operating modes (as well as 16-bit!). Team Blue's key proposal states: "This whitepaper details the architectural enhancements and modifications that Intel is currently investigating for a 64-bit mode-only architecture referred to as x86S (for simplification). Intel is publishing this paper to solicit feedback from the ecosystem while exploring the benefits of extending the ISA transition to a 64-bit mode-only solution."

The paper provides a bit of background context: "Since its introduction over 20 years ago, the Intel 64 architecture became the dominant operating mode. As an example of this evolution, Microsoft stopped shipping the 32-bit version of their Windows 11 operating system. Intel firmware no longer supports non UEFI64 operating systems natively. 64-bit operating systems are the de facto standard today. They retain the ability to run 32-bit applications but have stopped supporting 16-bit applications natively. With this evolution, Intel believes there are opportunities for simplification in our hardware and software ecosystem."

First Test Build of Windows 2000 64-bit Rediscovered

A 64-bit DEC Alpha C compiler was found by Virtually Fun's neozeed earlier this year - the software archeologist has been searching for various test builds of Microsoft Windows NT, including an "AXP64/ALPHA64 port," deemed extra special due to it being the first 64-bit version of Windows 2000 Professional. The small discovery of this obscure compiler was celebrated, but its functionality is ultimately not all that useful - neozeed notes that the items have been sitting within 1999 vintage Windows Platform SDKs: "It turns out that the AXP64 compiler set has been hiding in plain sight for DECADES. I know that it's so unlikely that we'd ever see any public release of a 64-bit version of Windows for the Alpha, but oddly enough the compiler, headers and libraries are all there. YES. You can make full executes for AXP64/Alpha64. Of course with no OS, so it's not like you can run them."

He continues: "Sadly as of today, there is no way to test. There is one surviving machine with Windows 2003 AXP64, outlined in an article by Raymond Chen. It's a great read about how Alpha64 NT port came to be. The machine is still sitting in Microsoft Archives. Hopefully one day someone can dig it out." The story could have ended there, but a follow up post appeared on Virtually Fun earlier this week - courtesy of guest contributor Antoni Sawicki (aka tenox) who has also experimented with the cross-compiler. He provided a little bit more historical context before making an interesting announcement: "The Win64 project for AXP64 and IA64 was code named "Sundown." Sadly, 64-bit Alpha AXP Windows was never released outside of Redmond."

Chinese Loongson 3D5000 Features 32 Cores and is 4x Faster Than the Average Arm Chip

Amid the push for technology independence, Chinese companies are pushing out more products to satisfy the rapidly soaring demand for domestic data processing silicon. Today, we have information that China's Loongson has launched the 3D5000 CPU with as many as 32 cores. Utilizing chiplet technology, the 3D5000 combines two 16-core 3C5000 processors built on LA464 cores, which implement the LoongArch ISA, a design that follows a combination of RISC and MIPS ISA design principles. The new chip features 64 MB of L3 cache, supports eight-channel DDR4-3200 ECC memory achieving 50 GB/s, and has five HyperTransport (HT) 3.0 interfaces. The TDP configuration of the chip is officially 300 Watts; however, normal operation is usually at around 150 Watts, with LA464 cores running at 2 GHz.

Scaling of the new chip goes beyond the chiplet and spills over into the system level, as the 3D5000 supports 2P and 4P configurations, where a single motherboard can become a system of up to 128 cores. To connect them, Loongson uses a 7A2000 bridge chip that is reportedly 400% faster than the previous solution, although we have no information about the previous bridge chip. Based on the LGA-4129 package, the chip size is 75.4 × 58.5 × 6.5 mm. Regarding performance, Loongson compares it to the average Arm chip that goes into smartphones and claims that its designs are up to four times faster. In SPEC2006, performance reaches 425 points, while the chip delivers a single TeraFLOP in the double-precision 64-bit format. The processor was also built for security: the chip has custom hardware-baked protection against Spectre and Meltdown, an on-package Trusted Platform Module (TPM), and a secret China-made security algorithm with an embedded custom security module that does encryption and decryption at 5 Gbps.

Intel Publishes Sorting Library Powered by AVX-512, Offers 10-17x Speed Up

Intel has recently updated its open-source C++ header file library for high-performance SIMD-based sorting to support the AVX-512 SIMD instruction set. Extending the capability of regular AVX2 support, the sorting functions now implement 512-bit extensions to offer greater performance. According to Phoronix, the NumPy Python library for mathematics that underpins a lot of software has updated its software base to use the AVX-512 boosted sorting functionality that yields a fantastic uplift in performance. The library uses AVX-512 to vectorize the quicksort for 16-bit, 32-bit, and 64-bit data types using the extended instruction set. Benchmarked on an Intel Tiger Lake system, the NumPy sorting saw a 10-17x increase in performance.

Intel's engineer Raghuveer Devulapalli changed the NumPy code, which was merged into the NumPy codebase on Wednesday. Regarding individual data types, the new implementation increases 16-bit int sorting by 17x and 32-bit data type sorting by 12-13x, while float 64-bit sorting for random arrays has experienced a 10x speed up. Using the x86-simd-sort code, this speed-up shows the power of AVX-512 and its capability to enhance the performance of various libraries. We hope to see more implementations of AVX-512, as AMD has joined the party by placing AVX-512 processing elements on Zen 4.
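Readers who want to see how sorting behaves on their own machine can time `np.sort` for the data types mentioned above. The following is a minimal sketch, not Intel's benchmark: the array size, seed, and helper name `time_sort` are arbitrary choices made here, and whether the AVX-512 code path is actually taken depends on the installed NumPy build and on the CPU.

```python
import time

import numpy as np


def time_sort(dtype, n=1_000_000, seed=0):
    """Return (elapsed_seconds, sorted_array) for np.sort on n random values."""
    rng = np.random.default_rng(seed)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        data = rng.integers(info.min, info.max, size=n, dtype=dtype)
    else:
        data = rng.random(n).astype(dtype)
    start = time.perf_counter()
    result = np.sort(data, kind="quicksort")
    return time.perf_counter() - start, result


if __name__ == "__main__":
    # The three types called out in the post: 16-bit int, 32-bit int, 64-bit float.
    for dt in (np.int16, np.int32, np.float64):
        seconds, _ = time_sort(dt)
        print(f"{np.dtype(dt).name:>8}: {seconds * 1e3:.2f} ms")
```

Absolute timings from a single run are noisy, so comparing two NumPy versions on the same machine, with several repetitions each, gives a fairer picture than one measurement.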

Andes Technology Unveils The AndesCore AX60 Series, An Out-Of-Order Superscalar Multicore RISC-V Processor Family

Today, at Linley Fall Processor Conference 2022, Andes Technology, a leading provider of high efficiency, low power 32/64-bit RISC-V processor cores and founding premier member of RISC-V International, reveals its top-of-the-line AndesCore AX60 series of power and area efficient out-of-order 64-bit processors. The family of processors is intended to run heavy-duty OS and applications with compute intensive requirements such as advanced driver-assistance systems (ADAS), artificial intelligence (AI), augmented/virtual reality (AR/VR), datacenter accelerators, 5G infrastructure, high-speed networking, and enterprise storage.

The first member of the AX60 series, the AX65, supports the latest RISC-V architecture extensions such as the scalar cryptography extension and bit manipulation extension. It is a 4-way superscalar design with Out-of-Order (OoO) execution in a 13-stage pipeline. It fetches 4 to 8 instructions per cycle, guided by a highly accurate TAGE branch predictor with loop prediction to ensure fetch efficiency. It then decodes, renames and dispatches up to 4 instructions into 8 execution units, including 4 integer units, 2 full load/store units, and 2 floating-point units. Besides the load/store units, the AX65's aggressive memory subsystem also includes split 2-level TLBs with multiple concurrent table walkers and up to 64 outstanding load/store instructions.

Tachyum Submits Bid for 20-Exaflop Supercomputer to U.S. Department of Energy Advanced Computing Ecosystems

Tachyum today announced that it has responded to a U.S. Department of Energy Request for Information soliciting Advanced Computing Ecosystems for DOE national laboratories engaged in scientific and national security research. Tachyum has submitted a proposal to create a 20-exaflop supercomputer based on Tachyum's Prodigy, the world's first universal processor.

The DOE's request calls for computing systems that are five to 10 times faster than those currently available and/or that can perform more complex applications in "data science, artificial intelligence, edge deployments at facilities, and science ecosystem problems, in addition to the traditional modeling and simulation applications."

NVIDIA PrefixRL Model Designs 25% Smaller Circuits, Making GPUs More Efficient

When designing integrated circuits, engineers aim to produce an efficient design that is easier to manufacture. If they manage to keep the circuit size down, the cost of manufacturing that circuit also goes down. NVIDIA has posted on its technical blog a technique where the company uses an artificial intelligence model called PrefixRL. Using deep reinforcement learning, NVIDIA uses the PrefixRL model to outperform traditional EDA (Electronic Design Automation) tools from major vendors such as Cadence, Synopsys, or Siemens/Mentor. EDA vendors usually implement their own in-house AI solutions for silicon placement and routing (PnR); however, NVIDIA's PrefixRL solution seems to be doing wonders in the company's workflow.

The goal of PrefixRL is a deep reinforcement learning model that matches the latency of the EDA PnR result while achieving a smaller die area. According to the technical blog, the latest Hopper H100 GPU architecture uses 13,000 instances of arithmetic circuits that the PrefixRL AI model designed. NVIDIA produced a model that outputs a 25% smaller circuit than comparable EDA output, all while achieving similar or better latency. Below, you can compare a 64-bit adder design made by PrefixRL and the same design made by an industry-leading EDA tool.
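For readers unfamiliar with the kind of circuit involved, a 64-bit adder can be modeled in software as a parallel-prefix carry network. The sketch below is a textbook Kogge-Stone style structure written for illustration only; it is not a PrefixRL-generated design, and it assumes, as the model's name suggests, that the circuits in question are parallel-prefix circuits. The function name `prefix_add` is hypothetical.

```python
def prefix_add(a: int, b: int, width: int = 64) -> int:
    """Add two unsigned integers modulo 2**width using a parallel-prefix carry network."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    generate = a & b      # bit positions that create a carry on their own
    propagate = a ^ b     # bit positions that pass an incoming carry along
    half_sum = propagate  # per-bit sum before carries are applied

    # Each level merges a position with the group `shift` places below it,
    # so the carry information spans 2, 4, 8, ... bits after each level.
    shift = 1
    while shift < width:
        generate = (generate | (propagate & (generate << shift))) & mask
        propagate = (propagate & (propagate << shift)) & mask
        shift *= 2

    # The carry into bit i is the group generate of bits 0..i-1.
    carries = (generate << 1) & mask
    return half_sum ^ carries


if __name__ == "__main__":
    x, y = 0x0123_4567_89AB_CDEF, 0x0FED_CBA9_8765_4321
    print(hex(prefix_add(x, y)), hex((x + y) & (2**64 - 1)))
```

For a 64-bit width the loop runs six times, one per prefix level. Different prefix networks trade the number of levels against the number of merge operations and the wiring between them, which is the kind of area-versus-latency trade-off the post describes.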

Researchers Use SiFive's RISC-V SoC to Build a Supercomputer

Researchers from Università di Bologna and CINECA, the largest supercomputing center in Italy, have been playing with the concept of developing a RISC-V supercomputer. The team has laid the grounds for the first-ever implementation that demonstrates the capability of the relatively novel ISA to run high-performance computing. To create a supercomputer, you need pieces of hardware that work like Lego building blocks. Those building blocks, each made from a motherboard, processor, memory, and storage, are combined into clusters. The Italian researchers decided to try something other than an Intel or AMD solution and use a processor based on the RISC-V ISA. Using SiFive's Freedom U740 SoC as the base, researchers named their RISC-V cluster "Monte Cimone."

Monte Cimone features four dual-board servers, each in a 1U form factor. Each board has a SiFive Freedom U740 SoC with four U74 cores running at up to 1.4 GHz and one S7 management core. In total, eight nodes combine for a total of 32 RISC-V cores. The SoC is paired with 16 GB of 64-bit DDR4 memory operating at 1866 MT/s, a PCIe Gen 3 x8 bus running at 7.8 GB/s, one gigabit Ethernet port, and USB 3.2 Gen 1 interfaces. The system is powered by two 250 Watt PSUs to support future expansion and the addition of accelerator cards.
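The node, core, and bandwidth figures above follow from simple arithmetic, sketched below. The helper names are illustrative, the 32-core total counts only the U74 application cores (not the S7 management cores), and the bandwidth values are theoretical peaks computed from the quoted interface parameters rather than measurements.

```python
SERVERS = 4
BOARDS_PER_SERVER = 2
U74_CORES_PER_SOC = 4   # application cores; the S7 management core is not counted


def ddr_peak_gbs(mega_transfers_per_s: float, bus_bits: int) -> float:
    """Theoretical peak DDR bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * 1e6 * (bus_bits / 8) / 1e9


def pcie3_peak_gbs(lanes: int) -> float:
    """Theoretical PCIe Gen 3 bandwidth in GB/s: 8 GT/s per lane with 128b/130b encoding."""
    return 8e9 * (128 / 130) * lanes / 8 / 1e9


nodes = SERVERS * BOARDS_PER_SERVER
app_cores = nodes * U74_CORES_PER_SOC

print(f"Nodes: {nodes}, U74 cores: {app_cores}")
print(f"DDR4-1866, 64-bit bus: {ddr_peak_gbs(1866, 64):.1f} GB/s peak")
print(f"PCIe Gen 3 x8: {pcie3_peak_gbs(8):.2f} GB/s peak")
```

The PCIe result lands just under 7.9 GB/s, in line with the 7.8 GB/s figure quoted for the x8 link.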

AMD Claims Radeon RX 6500M is Faster Than Intel Arc A370M Graphics

A few days ago, Intel announced its first official discrete graphics card efforts, designed for laptops. Called the Arc Alchemist lineup, Intel has designed these SKUs to provide entry-level to high-end options covering a wide range of use cases. Today, AMD has responded with a rather exciting Tweet made by the company's @Radeon Twitter account. The company compared Intel's Arc Alchemist A370M GPU with AMD's Radeon RX 6500M mobile SKUs in the post. These GPUs are made on TSMC's N6 node, feature 4 GB GDDR6 64-bit memory, 1024 FP32 cores, and have the same configurable TDP range of 35-50 Watts.

Below, you can see AMD's benchmarks of the following select games: Hitman 3, Total War Saga: Troy, F1 2021, Strange Brigade (High), and Final Fantasy XIV. The Radeon RX 6500M GPU manages to win in all of these games, thus explaining AMD's "FTW" hashtag on Twitter. Remember that these are vendor-supplied benchmark runs, so we have to wait for some media results to surface.