News Posts matching #AVX2


Zhaoxin Launches KX-7000 Desktop 8-Core x86 Processor to Power China's Ambitions

After years of delays, Chinese chipmaker Zhaoxin has finally launched its long-awaited KX-7000 series of consumer CPUs, the only ones of their kind in China based on the licensed x86-64 ISA. Zhaoxin claims the new 8-core processors, built on the "Century Avenue" microarchitecture, deliver double the performance of the previous generation. Leveraging architectural improvements and four times as much cache, the KX-7000 represents essential progress for China's domestic semiconductor industry. While still likely lagging behind rival AMD and Intel chips in raw speed, the KX-7000 matches competitive specs in areas like DDR5 memory, PCIe 4.0, and USB4 support. For Chinese efforts to attain technological independence, closing feature gaps with foreign processors is just as crucial as boosting performance. Manufactured on a 16 nm process, the KX-7000 does not use the best silicon node available.

Other chip details include out-of-order execution (OoOE), 24 PCIe 4.0 lanes, a 32 MB pool of L3 cache and 4 MB of L2 cache, a base frequency of 3.2 GHz, and a boost clock of 3.7 GHz. Interestingly, the CPU also has VT-x, VT-d 2.5, and SSE4.2/AVX/AVX2 support, most likely also licensed from the x86 makers Intel and/or AMD. Ultimately, surpassing Western processors is secondary for China next to attaining self-reliance. Instructions for SM encryption, which cater to domestic data protection priorities, underscore how the KX-7000 advances strategic autonomy goals. With its x86 architecture license providing software compatibility and now a vastly upgraded platform, the KX-7000 will raise China's chip capabilities even if it still trails rivals in speed. Ongoing progress in closing that performance gap could position Zhaoxin as a mainstream alternative for local PC builders and buyers.

"Downfall" Intel CPU Vulnerability Can Impact Performance By 50%

Intel has recently revealed a security vulnerability named Downfall (CVE-2022-40982) that impacts multiple generations of Intel processors. The vulnerability is linked to Intel's memory optimization features and exploits the Gather instruction, which accelerates fetching data from scattered memory locations. The instruction inadvertently exposes the contents of internal hardware registers, allowing malicious software to access data held by other programs. The flaw affects Intel mainstream and server processors ranging from the Skylake to the Rocket Lake microarchitecture, and Intel has published the full list of affected CPUs. Intel has responded by releasing updated microcode to fix the flaw. However, there is concern over the performance impact of the fix, which could slow AVX2 and AVX-512 workloads involving the Gather instruction by up to 50%.
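For readers unfamiliar with gather, its semantics can be sketched in plain Python. This is a scalar emulation for illustration only: it is neither Intel's implementation nor an exploit, and the function name `gather` and the sample data are hypothetical.

```python
from typing import List, Sequence


def gather(base: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Emulate a SIMD gather: load base[i] for every index i."""
    # Hardware issues these loads from a single vector instruction;
    # this loop yields the same values with none of the speed.
    return [base[i] for i in indices]


# Hypothetical "memory" and eight scattered indices, one per 32-bit
# lane of a 256-bit AVX2 register.
memory = [10, 20, 30, 40, 50, 60, 70, 80]
lanes = gather(memory, [7, 0, 3, 3, 5, 1, 6, 2])
print(lanes)  # [80, 10, 40, 40, 60, 20, 70, 30]
```

Because a single instruction touches many unrelated addresses, code that leans on gather (sparse lookups, table-driven kernels) is the kind most exposed to the mitigation's overhead.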

Phoronix tested the Downfall mitigations and reported varying performance decreases on different processors. For instance, two Xeon Platinum 8380 processors were around 6% slower in certain tests, while the Core i7-1165G7 faced performance degradation ranging from 11% to 39% in specific benchmarks. While these reductions were less than Intel's forecasted 50% overhead, they remain significant, especially in High-Performance Computing (HPC) workloads. The ramifications of Downfall are not restricted to specialized tasks like AI or HPC but may extend to more common applications such as video encoding. Though the microcode update is not mandatory and Intel provides an opt-out mechanism, users are left with a challenging decision between security and performance. Executing a Downfall attack might seem complex, but the final choice between implementing the mitigation or retaining performance will likely vary depending on individual needs and risk assessments.
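Before weighing that trade-off, it helps to know what a given machine currently reports. The following is a minimal sketch, assuming a Linux system whose kernel exposes Downfall (tracked there as Gather Data Sampling) through the sysfs file used below; the helper name `downfall_status` is hypothetical, and on other systems the function simply returns `None`.

```python
from pathlib import Path
from typing import Optional

# Assumption: a Linux kernel recent enough to report Gather Data Sampling
# (Downfall) status through this sysfs entry.
GDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def downfall_status(path: Path = GDS_STATUS) -> Optional[str]:
    """Return the kernel's one-line status string, or None if it is unavailable."""
    try:
        return path.read_text().strip()
    except OSError:
        # File missing (older kernel, non-Linux OS) or unreadable.
        return None


status = downfall_status()
print(status if status is not None else "status not reported on this system")
```

The string is whatever the kernel writes, so it is best treated as opaque text to display or log rather than something to parse strictly.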

AAEON UP 7000 Brings Intel Processor N-series Onboard the World's Smallest Platform

AAEON's UP brand, renowned for producing sophisticated developer boards with industrial-grade specifications, has announced the release of the UP 7000, the third generation of boards built on the 85 mm x 56 mm form factor of the original UP Board. The UP 7000 boasts numerous upgrades compared to its predecessors, including 8 GB of LPDDR5 system memory, onboard TPM 2.0, and support for both Windows and Linux OS. Most notably, it stands out as the world's smallest board featuring onboard CPUs from the Intel Processor N-series platform.

The board is available in various SKUs, hosting Intel Processor N97, Intel Processor N100, or Intel Processor N50 CPUs. Notably, these processors offer clock speeds up to 50% faster than those of previous boards from the same product line, and also support Intel AVX2 for energy-efficient AI acceleration. Equipped with three USB 3.2 Gen 2 Type-A ports, one GbE LAN port driven by a Realtek RTL8111H-CG controller, and a Raspberry Pi-compatible 40-pin HAT for expansion, the UP 7000 provides a dense port configuration. This makes it ideal for applications requiring both low latency and versatile connectivity, such as AMR and multifunction printing device solutions.

Intel Publishes Sorting Library Powered by AVX-512, Offers 10-17x Speed Up

Intel has recently updated its open-source C++ header file library for high-performance SIMD-based sorting to support the AVX-512 SIMD instruction set. Extending the existing AVX2 support, the sorting functions now use the 512-bit extensions to offer greater performance. According to Phoronix, NumPy, the Python mathematics library that underpins a lot of software, has updated its codebase to use the AVX-512-boosted sorting functionality, which yields a fantastic uplift in performance. The library uses AVX-512 to vectorize quicksort for 16-bit and 64-bit data types. Benchmarked on an Intel Tiger Lake system, NumPy sorting saw a 10-17x increase in performance.

Intel engineer Raghuveer Devulapalli authored the NumPy change, which was merged into the NumPy codebase on Wednesday. Regarding individual data types, the new implementation speeds up 16-bit int sorting by 17x and 32-bit data type sorting by 12-13x, while 64-bit float sorting for random arrays has seen a 10x speed-up. Built on the x86-simd-sort code, this speed-up shows the power of AVX-512 and its capability to enhance the performance of various libraries. We hope to see more implementations of AVX-512, as AMD has joined the party by placing AVX-512 processing elements on Zen 4.

Intel Core i9-13900 "Raptor Lake" Processor Gets a Preview

Intel is preparing to launch its 13th generation of desktop processors, codenamed Raptor Lake. Succeeding Alder Lake, the 13th gen design will implement up to eight P-cores and 16 E-cores manufactured on Intel's improved 7+ technology node. Today, we got a performance preview from SiSoftware, which has collected SiSoftware Sandra database scores for the Intel Core i9-13900 Raptor Lake-S processor and presents an overview of a few benchmarks. Firstly, the SoC features 36 MB of unified L3 cache versus 30 MB in Alder Lake. With DDR5 memory running at up to 5600 MT/s and PCIe 5.0, the SoC features the latest IO and memory standards. The big P-cores now lack AVX-512 and feature 2 MB of L2 cache per core. We see 4 MB of L2 cache for each cluster of small E-cores. An exciting addition to E-cores is the AVX/AVX2 support, which is a first for Atom cores.

Regarding testing, the author has collected a few tests that seemed appropriate to compare against the equivalent Alder Lake model. Starting with ALU/FPU tests that benchmark basic arithmetic tasks, Raptor Lake delivered a 33% to 50% improvement over Alder Lake. The Raptor Lake design achieved this with a 3.7 GHz P-core and a 2.76 GHz E-core frequency. In vectorized and SIMD tests, the 13th gen design showed only a 5% to 8% improvement over the previous generation. For more benchmarks and accurate results, we have to wait for TechPowerUp's test, which will be coming on the release day.

MSI Partially Reenables AVX-512 Support for Alder Lake-S Processors

Intel's Alder Lake processors contain two types of cores, each with a distinct set of features and capabilities. For example, the smaller E-cores cannot execute AVX-512 instructions, while the bigger P-cores can. To avoid software errors when AVX-512 code ends up on a core that cannot run it, Intel decided to remove support altogether. This happened just months before the launch of Alder Lake, so some initial motherboard BIOSes shipped with AVX-512 enabled out of the box. Later on, all motherboard makers pulled the plug on it, and support is now a rare sight.
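Whether a given system actually exposes AVX2 or AVX-512 can be checked from software. Below is a minimal sketch, assuming Linux-style `/proc/cpuinfo` text in which features appear on a `flags` line (with names such as `avx2` and `avx512f`); the helper `parse_cpu_flags` and the sample text are hypothetical.

```python
from typing import Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Collect the feature flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (for example, on a non-x86 system).
    return set()


# Hypothetical excerpt; on a Linux x86 machine the text could instead
# come from open("/proc/cpuinfo").read().
sample = "processor\t: 0\nflags\t\t: fpu sse4_2 avx avx2\n"
flags = parse_cpu_flags(sample)
print("avx2" in flags, "avx512f" in flags)  # True False
```

The sketch reads only the first `flags` line, which describes the first logical CPU listed.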

However, it seems like MSI is unhappy with the lack of AVX-512, and the company is reenabling partial support for it. According to Xaver Amberger, editor at Igor's Lab, MSI has reintroduced microcode version selection on its MEG Z690 Unify-X motherboard. There is an option for AVX-512 enablement in the menu, and it is indeed a functional one. With BIOS A22, MSI enabled AVX-512 instruction execution, and there are benchmarks to prove it works. They show the advantage of AVX-512's 512-bit wide vectors over AVX2, which offers only 256-bit wide vectors. In applications such as Y-Cruncher, AVX-512 enabled the CPU to reach higher performance targets while consuming less power.
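The width difference translates directly into how many elements one instruction can process. A small sketch of that arithmetic follows; the helper name `lanes` is hypothetical, and real-world throughput also depends on clock speeds and execution resources, which this ignores.

```python
def lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements of a given size that fit in one vector register."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the vector width")
    return vector_bits // element_bits


for name, width in (("AVX2", 256), ("AVX-512", 512)):
    # Map element size in bits to the number of lanes per register.
    print(name, {bits: lanes(width, bits) for bits in (16, 32, 64)})
```

For 64-bit values, this works out to four lanes with AVX2 and eight with AVX-512, so each vector instruction covers twice as much data.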