News Posts matching #AVX-512

New Intel oneAPI 2023 Tools Maximize Value of Upcoming Intel Hardware

Today, Intel announced the 2023 release of the Intel oneAPI tools - available in the Intel Developer Cloud and rolling out through regular distribution channels. The new oneAPI 2023 tools support the upcoming 4th Gen Intel Xeon Scalable processors, Intel Xeon CPU Max Series and Intel Data Center GPUs, including Flex Series and the new Max Series. The tools deliver performance and productivity enhancements, and also add support for new Codeplay plug-ins that make it easier than ever for developers to write SYCL code for non-Intel GPU architectures. These standards-based tools deliver choice in hardware and ease in developing high-performance applications that run on multiarchitecture systems.

"We're seeing encouraging early application performance results on our development systems using Intel Max Series GPU accelerators - applications built with Intel's oneAPI compilers and libraries. For leadership-class computational science, we value the benefits of code portability from multivendor, multiarchitecture programming standards such as SYCL and Python AI frameworks such as PyTorch, accelerated by Intel libraries. We look forward to the first exascale scientific discoveries from these technologies on the Aurora system next year."
-Timothy Williams, deputy director, Argonne Computational Science Division

FinalWire AIDA64 v6.85 Released with NVIDIA Ada and AMD RDNA3 Support

FinalWire Ltd. today announced the immediate availability of AIDA64 Extreme 6.85 software, a streamlined diagnostic and benchmarking tool for home users; the immediate availability of AIDA64 Engineer 6.85 software, a professional diagnostic and benchmarking solution for corporate IT technicians and engineers; the immediate availability of AIDA64 Business 6.85 software, an essential network management solution for small and medium scale enterprises; and the immediate availability of AIDA64 Network Audit 6.85 software, a dedicated network audit toolset to collect and manage corporate network inventories.

The new AIDA64 update introduces AVX-512 optimized stress testing for AMD Ryzen 7000 Series processors, and supports the latest AMD and Intel CPU platforms as well as the new graphics and GPGPU computing technologies by both AMD and NVIDIA.

DOWNLOAD: FinalWire AIDA64 Extreme v6.85

GIGABYTE Delivers a Comprehensive Portfolio of Enterprise Solutions with AMD EPYC 9004 Series Processors

GIGABYTE Technology, an industry leader in high-performance servers and workstations, today announced its portfolio of products ready to support the new AMD EPYC 9004 Series Processors in the first wave of GIGABYTE solutions that will target a wide range of demanding workloads that include GPU-centric, high-density, edge, and general computing. A new x86 platform, a new socket, and a wealth of highly performant technologies provided new opportunities for GIGABYTE to tailor products for leading data centers. So far, GIGABYTE has released twenty-two new servers and motherboards to support the new AMD "Zen 4" architecture. Both single-socket and dual-socket options are available to handle big data and digital transformation. The ongoing collaboration between GIGABYTE and AMD has allowed for a comprehensive portfolio of computing solutions that are ready for the market.

The new 4th Gen AMD EPYC processors feature substantial compute performance and scalability by combining high core counts with impressive PCIe and memory throughput. In terms of out-of-the-box performance, AMD estimates found that 4th Gen AMD EPYC CPUs are the highest performing server processors in the world. With the advancement to 5 nm technology and other performant innovations, the new AMD EPYC 9004 series processors move to a new SP5 socket. The new architecture leads the way to faster data insights with high performance and built-in security features, and this platform targets HPC, AI, cloud, big data, and general enterprise IT.

Intel Introduces the Max Series Product Family: Ponte Vecchio and Sapphire Rapids

In advance of Supercomputing '22 in Dallas, Intel Corporation has introduced the Intel Max Series product family with two leading-edge products for high performance computing (HPC) and artificial intelligence (AI): Intel Xeon CPU Max Series (code-named Sapphire Rapids HBM) and Intel Data Center GPU Max Series (code-named Ponte Vecchio). The new products will power the upcoming Aurora supercomputer at Argonne National Laboratory, with updates on its deployment shared today.

The Xeon Max CPU is the first and only x86-based processor with high bandwidth memory, accelerating many HPC workloads without the need for code changes. The Max Series GPU is Intel's highest density processor, packing over 100 billion transistors into a 47-tile package with up to 128 gigabytes (GB) of high bandwidth memory. The oneAPI open software ecosystem provides a single programming environment for both new processors. Intel's 2023 oneAPI and AI tools will deliver capabilities to enable the Intel Max Series products' advanced features.

AMD Rolls Out GCC Enablement for "Zen 4" Processors with Zenver4 Target, Enables AVX-512 Instructions

AMD earlier this week released basic enablement patches for the GNU Compiler Collection (GCC), which extend it with "Zen 4" microarchitecture awareness. The "basic enablement patch" for the new Zenver4 target is essentially similar to Zenver3, but with added support for the new AVX-512 instructions, namely AVX512F, AVX512DQ, AVX512IFMA, AVX512CD, AVX512BW, AVX512VL, AVX512BF16, AVX512VBMI, AVX512VBMI2, GFNI, AVX512VNNI, AVX512BITALG, and AVX512VPOPCNTDQ. Besides AVX-512, "Zen 4" is largely identical to its predecessor, architecturally, and so the enablement is rather basic. This should come just in time for software vendors to prepare for next-generation EPYC "Genoa" server processors, or even small/medium businesses building servers with Ryzen 7000-series processors.
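As a rough illustration of what the enablement means for a build, the sketch below assembles two GCC command lines: one using the target switch and one spelling out the per-feature options for the instruction sets listed above. GCC spells the target `znver4`; the file names `kernel.c` and `kernel` are hypothetical, and the commands are only constructed here, not executed. It assumes a GCC release that already includes the Zen 4 patch.

```python
# Feature names exactly as listed in the paragraph above.
ZEN4_AVX512_FEATURES = [
    "AVX512F", "AVX512DQ", "AVX512IFMA", "AVX512CD", "AVX512BW",
    "AVX512VL", "AVX512BF16", "AVX512VBMI", "AVX512VBMI2", "GFNI",
    "AVX512VNNI", "AVX512BITALG", "AVX512VPOPCNTDQ",
]


def gcc_feature_flags(features):
    """Map ISA feature names to GCC's per-feature -m<name> options."""
    return ["-m" + name.lower() for name in features]


def gcc_command(source, output, use_march=True):
    """Build a GCC argument list (hypothetical file names, nothing is run)."""
    if use_march:
        # One switch that selects the whole Zen 4 target.
        isa = ["-march=znver4"]
    else:
        # Enable each listed instruction set individually instead.
        isa = gcc_feature_flags(ZEN4_AVX512_FEATURES)
    return ["gcc", "-O2", *isa, source, "-o", output]


if __name__ == "__main__":
    print(" ".join(gcc_command("kernel.c", "kernel")))
    print(" ".join(gcc_command("kernel.c", "kernel", use_march=False)))
```

The target switch is the shorter route; the per-feature form is mainly useful when only some of the extensions should be enabled.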

Intel Outs First Xeon Scalable "Sapphire Rapids" Benchmarks, On-package Accelerators Help Catch Up with AMD EPYC

On the second day of its InnovatiON event, Intel turned its attention to its next-generation Xeon Scalable "Sapphire Rapids" server processors and demonstrated their on-package accelerators. These are fixed-function hardware components that accelerate specific kinds of popular server workloads (i.e. run them faster than a CPU core can). With these, Intel hopes to close the CPU core-count gap it has with AMD EPYC, with the upcoming "Zen 4" EPYC chips expected to launch with up to 96 cores per socket in their conventional variant, and up to 128 cores per socket in their cloud-optimized variant.

Intel's on-package accelerators include AMX (advanced matrix extensions), which accelerate recommendation-engines, natural language processing (NLP), image-recognition, etc.; DLB (dynamic load-balancing), which accelerates security-gateway and load-balancing; DSA (data-streaming accelerator), which speeds up the network stack, guest OS, and migration; IAA (in-memory analysis accelerator), which speeds up big-data (Apache Hadoop), IMDB, and warehousing applications; a feature-rich implementation of the AVX-512 instruction-set for a plethora of content-creation and scientific applications; and lastly, the QAT (QuickAssist Technology), with speed-ups for data compression, OpenSSL, nginx, IPsec, etc. Unlike on "Ice Lake-SP," QAT is now implemented on the processor package instead of the PCH.

RPCS3 PlayStation 3 Emulator Updated with AVX-512 Support for AMD Zen 4

The popular PlayStation 3 emulator for PCs, RPCS3, just received a major update that lets it take advantage of the AVX-512 instruction-set on processors based on the AMD Zen 4 microarchitecture (the recently launched Ryzen 7000 series). RPCS3 emulates the PS3's CELL Broadband Engine SoC entirely on CPU, and does not use your GPU to draw any raster graphics. To emulate both a CPU and GPU of that time entirely on a multi-threaded CPU of today is no easy task, but is helped greatly by leveraging the latest instruction-sets. RPCS3 already supports an AVX-512 code-path on Intel processors such as the Core i9-11900K "Rocket Lake," but Intel has been wavering on AVX-512 support for its client processors since 12th Gen "Alder Lake." The developer of RPCS3 confirmed in a tweet that they have enabled AVX-512 support for AMD Zen 4 with the latest build.
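The idea of a separate AVX-512 code-path can be sketched as a runtime check that picks the widest instruction set the CPU reports. The example below is a minimal illustration for Linux, where `/proc/cpuinfo` exposes a `flags` line; it is not RPCS3's actual detection logic, and the function names and the three path labels are hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pick_code_path(flags):
    """Choose the widest supported path (labels are illustrative only)."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "baseline"


def detect_code_path(path="/proc/cpuinfo"):
    """Read the flags from the running system; fall back if unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return pick_code_path(parse_cpu_flags(handle.read()))
    except OSError:
        # Not Linux, or the file cannot be read: use the safe default.
        return "baseline"


if __name__ == "__main__":
    print("selected code path:", detect_code_path())
```

Checking once at startup and dispatching from there keeps a single binary usable on CPUs with and without AVX-512.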

AMD "Zen 4" Dies, Transistor-Counts, Cache Sizes and Latencies Detailed

As we await technical documents from AMD detailing its new "Zen 4" microarchitecture, particularly the all-important CPU core Front-End and Branch Prediction units that have contributed two-thirds of the 13% IPC gain over the previous-generation "Zen 3" core, the tech enthusiast community is already decoding images from the Ryzen 7000 series launch presentation. "Skyjuice" presented the first annotation of the "Zen 4" core, revealing its large branch-prediction unit, enlarged micro-op cache, TLB, load/store unit, and dual-pumped 256-bit FPU that enables AVX-512 support. A quarter of the core's die-area is also taken up by the 1 MB dedicated L2 cache.

Chiakokhua (aka Retired Engineer) posted a table detailing the various caches and their latencies, comparing them with those of the "Zen 3" core. As AMD's Mark Papermaster revealed in the Ryzen 7000 launch event, the company has enlarged the micro-op cache of the core from 4 K entries to 6.75 K entries. The L1I and L1D caches remain 32 KB in size, each; while the L2 cache has doubled in size. The enlargement of the L2 cache has slightly increased latency, from 12 cycles to 14. Latency of the shared L3 cache is also up, from 46 cycles to 50 cycles. The reorder buffer (ROB) in the dispatch stage has been enlarged from 256 entries to 320 entries. The L1 branch target buffer (BTB) has increased in size from 1 KB to 1.5 KB.
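To put those generational changes on a common scale, the short sketch below computes the relative change for each figure quoted in the paragraph above. The numbers are taken directly from the text; the dictionary layout and the helper name are illustrative only.

```python
# (Zen 3 value, Zen 4 value) pairs as quoted in the paragraph above.
ZEN3_TO_ZEN4 = {
    "micro-op cache (K entries)": (4.0, 6.75),
    "L2 latency (cycles)": (12, 14),
    "L3 latency (cycles)": (46, 50),
    "reorder buffer (entries)": (256, 320),
    "L1 BTB (KB)": (1.0, 1.5),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for name, (old, new) in ZEN3_TO_ZEN4.items():
        print(f"{name}: {old} -> {new} ({percent_change(old, new):+.1f}%)")
```

On these figures, the structures grow by 25% to roughly 69%, while the added L2 and L3 latency stays below 17%.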

Latest Y-Cruncher Version Comes with "Zen 4" and AVX512 Optimization

Y-Cruncher is a multi-threaded Pi calculation benchmark. Its author, Alexander Yee, has access to an AMD Ryzen 9 7950X 16-core/32-thread sample, and has developed the latest version 0.7.10 of the Y-Cruncher binary with optimization for the "Zen 4" microarchitecture, and to take advantage of the AVX-512 instruction-set on these chips. Without disclosing the juicy performance numbers obtained in his testing, Yee posted a screenshot of Y-cruncher with the 7950X, on a machine with Windows 11 22Hx, and 64 GB of memory. You know it's optimized, since the multi-core efficiency is as high as 98% (all threads are being saturated with the Pi calculation workload).

Intel Teams Up with Aible to Fast-Track Enterprise Analytics and AI

Intel's collaboration with Aible enables teams across key industries to leverage artificial intelligence and deliver rapid and measurable business impact. This deep collaboration, which includes engineering optimizations and an innovative benchmarking program, enhances Aible's ability to deliver rapid results to its enterprise customers. When paired with Intel processors, Aible's technology provides a serverless-first approach, allowing developers to build and run applications without having to manage servers, and build modern applications with increased agility and lower total cost of ownership (TCO).

"Today's enterprise IT infrastructure leaders face significant challenges building a foundation that is designed to help business teams drive value from AI initiatives in the data center. We've moved past talking about the potential of AI, as business teams across key industries are experiencing measurable business impact within days, using Intel Xeon Scalable processors with built-in Intel software optimizations with Aible," said Kavitha Prasad, Intel vice president and general manager of Datacenter, AI and Cloud Execution and Strategy.

Intel Xeon W9-3495 Sapphire Rapids HEDT CPU with 56 Cores and 112 Threads Appears

Intel's upcoming Sapphire Rapids processors will not only be present in the server sector but will also span the high-end desktop (HEDT) platform. Today, according to the findings of a Twitter user, @InstLatX64, we have an appearance of Intel's upcoming Sapphire Rapids HEDT SKU in Kernel.org boot logs. Named Intel Xeon W9-3495, this model features 56 cores and 112 threads. While there is no specific information about base and boost frequencies, we know that the SKU supports AVX-512 and AMX instructions. This is a welcome addition, as we have seen Intel disable AVX-512 on consumer chips altogether.

With a high core count and additional instructions for deep learning, this CPU will power workstations sometime in the future. With the late arrival of Sapphire Rapids for servers, a HEDT variant should follow.

Intel Core i9-13900 "Raptor Lake" Processor Gets a Preview

Intel is preparing to launch its 13th generation of desktop processors codenamed Raptor Lake. Succeeding Alder Lake, the 13th gen design will implement up to eight P-cores with 16 E-cores manufactured on Intel's improved 7+ technology node. Today, we got a performance preview from SiSoftware, which has collected SiSoftware Sandra database scores of the Intel Core i9-13900 Raptor Lake-S processor and presents an overview of a few benchmarks. Firstly, the SoC features 36 MB of unified L3 cache versus 30 MB in Alder Lake. With DDR5 memory running up to 5600 MT/s and PCIe 5.0, the SoC features the latest IO and memory standards. The big P-cores now lack AVX-512 and feature 2 MB of L2 cache per core. We see 4 MB of L2 cache for a cluster of small E-cores. An exciting addition to E-cores is the AVX/AVX2 support, which is a first for Atom cores.

Regarding testing, the author has collected a few tests that seemed appropriate to compare to the equivalent Alder Lake model. Starting with ALU/FPU tests that benchmark basic arithmetic tasks, Raptor Lake delivered 33% to 50% improvement over Alder Lake. The Raptor Lake design achieved this with 3.7 GHz P-Core and 2.76 GHz E-Core frequency. In vectorized and SIMD tests, the 13th gen design showed only 5% to 8% improvement over the previous generation. For more benchmarks and accurate results, we have to wait for TechPowerUp's test, which will be coming on the release day.

Intel Sapphire Rapids 56-Core ES Processor Boosts to 3.3 GHz at 420 Watts

Intel is slowly transitioning its data center customers to a new processor generation called Sapphire Rapids. Today, thanks to the hardware leaker Yuuki_ans, we have deeper insight into the top-end 56-core Sapphire Rapids processor and its power settings. According to the leak, we have information on either the Xeon Platinum 8476 or Platinum 8480 design, equipped with 56 cores and 112 threads. This model was running at a base frequency of 1.9 GHz and a boost frequency of 3.3 GHz. A single core can boost to 3.7 GHz if the report is giving a correct reading. Remember that this is only an engineering sample, so the final target speeds could differ. It carries 112 MB of L2 and 105 MB of L3 cache, and this sample was running with 1 TB of DDR5 memory with CL40-39-38-76 timings.
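The cache totals are easier to compare with other designs once divided by the core count. The sketch below does that arithmetic using only the figures quoted above; the helper name is illustrative, and it assumes the cache is split evenly across all 56 cores.

```python
# Figures quoted in the paragraph above for the leaked engineering sample.
CORES = 56
THREADS = 112
L2_TOTAL_MB = 112
L3_TOTAL_MB = 105


def per_core(total, cores):
    """Even share of a total across cores (assumes a uniform split)."""
    return total / cores


if __name__ == "__main__":
    print("threads per core:", per_core(THREADS, CORES))
    print("L2 per core (MB):", per_core(L2_TOTAL_MB, CORES))
    print("L3 per core (MB):", per_core(L3_TOTAL_MB, CORES))
```

That works out to 2 MB of L2 and 1.875 MB of L3 per core, with two threads per core.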

Perhaps the most exciting finding is the power configuration of this SKU. Intel has enabled this CPU to consume 350 Watts in PL1 rating, with up to 420 Watts in PL2 performance mode. The enforced BIOS power limit rating is set at an astonishing 764 Watts, which could happen with AVX-512 enabled. Final TDP ratings are yet to be disclosed; however, these Sapphire Rapids processors are shaping up to be relatively power-hungry chips.

Intel is Now Fusing Off AVX-512 support in Alder Lake CPUs

If you have already bought a 12th gen Intel Alder Lake CPU, you could be sitting on a collector's item, as according to Tom's Hardware, Intel is now fusing off AVX-512 support in production. It's possible this could be in preparation for the arrival of the Core "W" series of CPUs that might be replacing the Xeon-W series of processors for Intel. It should be noted that this isn't a rumour, as Tom's Hardware has had an official statement on the matter from Intel.

The statement reads, "Although AVX-512 was not fuse-disabled on certain early Alder Lake desktop products, Intel plans to fuse off AVX-512 on Alder Lake products going forward." Exactly when this will go into full effect isn't clear, but according to Tom's Hardware, they've already had reports of batches of non-K Alder Lake CPUs that are lacking AVX-512 support. In all fairness to Intel, the company never claimed that its Alder Lake CPUs would support AVX-512, and the support has never been guaranteed to be flawless on the chips that have shipped with it enabled. Intel has also disabled AVX-512 via a microcode update that shipped to motherboard makers in January, but at least some motherboard makers have added a toggle to allow people to re-enable AVX-512 support. It's unlikely that this will affect many potential customers, since AVX-512 instructions aren't widely used in consumer-facing software.

Intel Launches Xeon D Processor Built for the Network and Edge

Today, ahead of MWC Barcelona 2022, Intel launched new Intel Xeon D processors: the D-2700 and the D-1700. They are Intel's newest system-on-chip (SoC) built for the software-defined network and edge, with integrated AI and crypto acceleration, built-in Ethernet, support for Intel Time Coordinated Computing (Intel TCC) and Time Sensitive Networking (TSN), and industrial-class reliability. New Intel Xeon D processors extend compute with acceleration beyond the core data center, generating a better overall experience for key network and edge usages and workloads.

"As the industry enters a world of software-defined everything, Intel is delivering programmable platforms for networking and the edge to enable one of the most significant transformations our industry has ever seen. The new Intel Xeon D processor is built for this. Based on the proven and trusted Intel architecture, this processor is designed for a range of use cases to unleash innovation across the network and edge," said Dan Rodriguez, Intel corporate vice president, Network & Edge Group, general manager of the Network Platforms Group.

MSI Partially Reenables AVX-512 Support for Alder Lake-S Processors

Intel's Alder Lake processors have two types of cores, each with a distinct set of features and capabilities. For example, the smaller E-cores cannot execute AVX-512 instructions, while the bigger P-cores can. To avoid software errors when AVX-512 code lands on a core that cannot run it, Intel decided to remove support for the instruction set altogether. This happened just months before the launch of Alder Lake, which is why some initial motherboard BIOSes shipped with AVX-512 enabled out of the box. Later on, all motherboard makers pulled the plug on it, and it is now rare to see support for it.

However, it seems MSI is unhappy with the lack of AVX-512, and the company is reenabling partial support for it. According to Xaver Amberger, editor at Igor's Lab, MSI is reintroducing microcode version selection on its MEG Z690 Unify-X motherboard. There is an option for AVX-512 enablement in the menu, and it is indeed a functional one. With BIOS A22, MSI enabled AVX-512 instruction execution, and there are benchmarks to prove it works. They show the advantage of AVX-512's 512-bit wide execution units over something like AVX2, which offers only 256-bit wide execution units. In applications such as Y-Cruncher, AVX-512 enabled the CPU to reach higher performance targets while consuming less power.
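The width difference mentioned above translates directly into how many data elements one vector instruction can handle. The sketch below is simple arithmetic on the 256-bit and 512-bit widths from the text; the function names and the array size of 1,000 elements are illustrative assumptions, and real-world speedups depend on far more than lane count.

```python
def lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def vector_ops_needed(element_count, vector_bits, element_bits):
    """Vector operations needed to touch every element once (rounded up)."""
    per_op = lanes(vector_bits, element_bits)
    return -(-element_count // per_op)  # ceiling division


if __name__ == "__main__":
    for name, bits in (("AVX2", 256), ("AVX-512", 512)):
        print(
            f"{name}: {lanes(bits, 32)} x 32-bit lanes, "
            f"{lanes(bits, 64)} x 64-bit lanes, "
            f"{vector_ops_needed(1000, bits, 64)} ops for 1000 doubles"
        )
```

Halving the operation count for the same data is one reason a wide-vector workload can finish sooner, although it does not by itself explain the power result reported above.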