My point is against your implication that the CUDA FP32 cores lack double-rate FP16, since you argued for Tensor FP16. "Rapid Packed Math" is just another AMD marketing name for their FP16 capabilities.
The point stands that both current AMD and nVidia GPUs can run FP16 at double rate compared to FP32; there is no trickery in the benchmark.
The fact is that Turing's CUDA FP32 cores have the double-rate FP16 feature regardless of the Tensor cores. There are more FP16 TFLOPS with Turing RTX GPUs.
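To illustrate what "double rate" means here, a rough Python sketch: two FP16 values fit in one 32-bit register slot, so a packed instruction can process two of them per lane per clock. The helper names (`pack_half2`, `unpack_half2`, `peak_tflops`) and the lane and clock figures are made up for illustration and are not specs of any real GPU.

```python
import struct

def pack_half2(lo, hi):
    """Pack two FP16 values into one 32-bit word (lo in the low 16 bits)."""
    lo_bits, = struct.unpack("<H", struct.pack("<e", lo))
    hi_bits, = struct.unpack("<H", struct.pack("<e", hi))
    return (hi_bits << 16) | lo_bits

def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 values."""
    lo, = struct.unpack("<e", struct.pack("<H", word & 0xFFFF))
    hi, = struct.unpack("<e", struct.pack("<H", (word >> 16) & 0xFFFF))
    return lo, hi

def peak_tflops(lanes, clock_ghz, values_per_lane=1):
    """Theoretical peak: 2 FLOPs per fused multiply-add, per value, per lane, per clock."""
    return 2 * lanes * clock_ghz * values_per_lane / 1000.0

# Hypothetical GPU: 4096 FP32 lanes at 1.5 GHz (illustrative numbers only).
fp32 = peak_tflops(4096, 1.5)                      # one FP32 value per lane
fp16 = peak_tflops(4096, 1.5, values_per_lane=2)   # two packed FP16 values per lane
print(hex(pack_half2(1.0, 2.0)), fp32, fp16)
```

The packed figure comes out at exactly twice the FP32 figure, which is all "double-rate FP16" claims on paper; real throughput depends on the workload actually using packed operations.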