My point is against your implied claim that the CUDA FP32 cores lack double-rate FP16, since you argued it comes from the Tensor cores. "Rapid Packed Math" is just another AMD marketing name for the same FP16 capability.
The point stands that both current AMD and NVIDIA GPUs can run FP16 at double rate compared to FP32; there is no trickery in the benchmark.
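For anyone wondering where the 2x actually comes from, here is a minimal CUDA sketch (kernel names, sizes and the build line are my own, for illustration only): the packed __half2 path retires two FP16 FMAs per instruction where the FP32 path retires one, which is exactly the double rate both vendors advertise.

```cpp
// Build (assumption): nvcc -arch=sm_70 packed_fp16.cu -o packed_fp16
#include <cuda_fp16.h>
#include <cstdio>

// FP32 reference: one fused multiply-add per thread per instruction.
__global__ void fma_fp32(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = fmaf(a[i], b[i], c[i]);
}

// Packed FP16: each __hfma2 retires two half-precision FMAs in one
// instruction, which is where "FP16 at double the FP32 rate" comes from
// (NVIDIA calls it FP16x2, AMD calls it Rapid Packed Math).
__global__ void fma_fp16x2(const __half2* a, const __half2* b, __half2* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = __hfma2(a[i], b[i], c[i]);
}

int main()
{
    const int n = 1 << 20;                  // arbitrary size, smoke test only
    __half2 *a, *b, *c;
    cudaMalloc(&a, n * sizeof(__half2));
    cudaMalloc(&b, n * sizeof(__half2));
    cudaMalloc(&c, n * sizeof(__half2));
    cudaMemset(a, 0, n * sizeof(__half2));  // zeroed inputs are enough to
    cudaMemset(b, 0, n * sizeof(__half2));  // confirm the kernel launches
    cudaMemset(c, 0, n * sizeof(__half2));

    fma_fp16x2<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();
    printf("fp16x2 launch: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```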
The fact is that Turing's CUDA FP32 cores have the double-rate FP16 feature regardless of the Tensor cores, which is why Turing RTX GPUs list more FP16 TFLOPS than FP32 TFLOPS.
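To put rough numbers on it (using the RTX 2080 Ti at its reference boost clock as an example): 4352 shaders x 2 ops per FMA x ~1.545 GHz is roughly 13.4 TFLOPS FP32, and with the 2:1 FP16 rate that is roughly 26.9 TFLOPS FP16 before the Tensor cores are even considered.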