With its "Zen" CPU microarchitecture, AMD removed support for the FMA4 instruction set on paper, while retaining FMA3. Level1Techs discovered that "Zen" CPUs do execute FMA4 instructions, even though the instruction set is not exposed to the operating system. FMA, or fused multiply-add, computes a × b + c in a single instruction with a single rounding step, which makes it an efficient building block for linear algebra. FMA3 and FMA4 are not generations of the instruction set (unlike SSE3 and SSE4); rather, the digit denotes the number of operands per instruction. AMD introduced FMA4 in 2011 with its first FX-series processors and added FMA3 in 2012 with the second FX generation, while Intel added FMA3 support in 2013 with "Haswell."
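To make the operand-count distinction concrete, here is a minimal Python sketch. It is a toy model, not real instruction encoding: the register file is a plain dict, the helper names `fused_multiply_add`, `fma3`, and `fma4` are hypothetical, and the three-operand form is modeled on the "213" operand ordering (destination times second source, plus third source). The fused result is emulated with exact rational arithmetic and one final rounding, which only works for finite inputs.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate FMA: compute a*b + c exactly, then round once to a double.

    Only valid for finite inputs (Fraction cannot represent inf or NaN).
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fma3(regs: dict, dst: str, src2: str, src3: str) -> None:
    """Three operands: dst is also a source, so its old value is overwritten."""
    regs[dst] = fused_multiply_add(regs[dst], regs[src2], regs[src3])


def fma4(regs: dict, dst: str, src1: str, src2: str, src3: str) -> None:
    """Four operands: a separate destination, all three sources are preserved."""
    regs[dst] = fused_multiply_add(regs[src1], regs[src2], regs[src3])


# Inputs chosen so that the intermediate product needs more than 53 bits.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = a * b + c                    # product is rounded before the add
fused = fused_multiply_add(a, b, c)    # single rounding at the end

regs4 = {"r0": a, "r1": b, "r2": c, "r3": 0.0}
fma4(regs4, "r3", "r0", "r1", "r2")    # r0 still holds a afterwards

regs3 = {"r0": a, "r1": b, "r2": c}
fma3(regs3, "r0", "r1", "r2")          # r0 now holds the result, a is gone

print(unfused, fused, regs4["r0"] == a, regs3["r0"] == a)
```

Both forms compute the same value. The practical difference is that the three-operand form destroys one input, so a compiler may need an extra register copy when that input is still needed later.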
The exact reasons why AMD deprecated FMA4 with "Zen" are unknown, but some developers speculate that it is because AMD's implementation of FMA4 is buggy, even though it is reportedly more efficient (33% more throughput). Intel's adoption of FMA3 made it the more popular of the two, and hence more stable over the years. Level1Techs used an OpenBLAS FMA4 test program to confirm that feeding "Zen" processors FMA4 instructions does not return an "illegal instruction" error; instead, the processor goes ahead and completes the operation. This is interesting because "Zen" does not advertise FMA4 through its CPUID feature bit, so the operating system and applications have no idea the processor can execute the instructions. For linear algebra, FMA4 has proven more efficient than AVX in both single- and double-precision.
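The CPUID point can be sketched as follows. Software normally decides whether to use FMA3 or FMA4 by testing feature bits: FMA3 is reported in CPUID leaf 0x00000001, ECX bit 12, and FMA4 in extended leaf 0x80000001, ECX bit 16. The Python helper below (the name `advertised_fma` is hypothetical) only decodes register values handed to it; it does not execute CPUID itself, and the sample ECX values are made up for illustration rather than dumped from real chips.

```python
FMA3_BIT = 12  # CPUID leaf 0x00000001, ECX bit 12
FMA4_BIT = 16  # CPUID leaf 0x80000001, ECX bit 16


def advertised_fma(ecx_leaf1: int, ecx_ext_leaf1: int) -> dict:
    """Report which FMA flavors the given CPUID ECX values advertise."""
    return {
        "fma3": bool((ecx_leaf1 >> FMA3_BIT) & 1),
        "fma4": bool((ecx_ext_leaf1 >> FMA4_BIT) & 1),
    }


# Illustrative values only: a part that advertises both flavors, and a
# "Zen"-like part that advertises FMA3 but leaves the FMA4 bit clear.
advertises_both = advertised_fma(1 << FMA3_BIT, 1 << FMA4_BIT)
zen_like = advertised_fma(1 << FMA3_BIT, 0)

print(advertises_both)
print(zen_like)
```

A dispatcher that trusts these bits would never select an FMA4 code path on the second part, which is why the behavior only shows up when a program issues FMA4 instructions without checking first.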
View at TechPowerUp Main Site