• Welcome to TechPowerUp Forums, Guest! Please check out our forum guidelines for info related to our community.

NVIDIA A800 China-Tailored GPU Performance within 70% of A100

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,651 (0.99/day)
The recent growth in demand for training Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) has sparked the interest of many companies to invest in GPU solutions that are used to train these models. However, countries like China have struggled with US sanctions, and NVIDIA has to create custom models that meet US export regulations. Carrying two GPUs, H800 and A800, they represent cut-down versions of the original H100 and A100, respectively. We reported about H800; however, it remained as mysterious as A800 that we are talking about today. Thanks to MyDrivers, we have information that the A800 GPU performance is within 70% of the regular A100.

The regular A100 GPU manages 9.7 TeraFLOPs of FP64, 19.5 TeraFLOPS of FP64 Tensor, and up to 624 BF16/FP16 TeraFLOPS with sparsity. A rough napkin math would suggest that 70% performance of the original (a 30% cut) would equal 6.8 TeraFLOPs of FP64 precision, 13.7 TeraFLOPs of FP64 Tensor, and 437 BF16/FP16 TeraFLOPs with sparsity. MyDrivers notes that A800 can be had for 100,000 Yuan, translating to about 14,462 USD at the time of writing. This is not the most capable GPU that Chinese companies can acquire, as H800 exists. However, we don't have any information about its performance for now.



View at TechPowerUp Main Site | Source
 
Joined
May 7, 2020
Messages
264 (0.16/day)
Honestly, those better zh model like GLM are very efficient and will run on potato, A800 is probably more than enough for years to come.
 
Joined
Aug 30, 2006
Messages
7,223 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
I dont see how this sanction is effective. The cut down models will be cheaper. Buy two instead of one. Get 70+70=140% performance unless the drivers force single implementations only, in which case, virtualize.
 
Joined
Sep 10, 2015
Messages
530 (0.16/day)
System Name My Addiction
Processor AMD Ryzen 7950X3D
Motherboard ASRock B650E PG-ITX WiFi
Cooling Alphacool Core Ocean T38 AIO 240mm
Memory G.Skill 32GB 6000MHz
Video Card(s) Sapphire Pulse 7900XTX
Storage Some SSDs
Display(s) 42" Samsung TV + 22" Dell monitor vertically
Case Lian Li A4-H2O
Audio Device(s) Denon + Bose
Power Supply Corsair SF750
Mouse Logitech
Keyboard Glorious
VR HMD None
Software Win 10
Benchmark Scores None taken
I dont see how this sanction is effective. The cut down models will be cheaper. Buy two instead of one. Get 70+70=140% performance unless the drivers force single implementations only, in which case, virtualize.
Standalone unit price is secondary. Power consumption and compute density is much more importand since you plan for years if not decades with servers built around these.

Having 2 of these instad of 1 in a blade that can fit like 4 means your whole array of these units will be ~1,5 times bigger. That also means more boards, more cooling, more power supplies too.
 
Joined
Aug 30, 2006
Messages
7,223 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
I still dont see it as an effective sanction. A sanction needs to prohibit or scale back by an order of magnitude. BTW i dont agree with the sanction BUT if you are going to do it, do it properly.
 
Joined
Dec 22, 2011
Messages
3,890 (0.82/day)
Processor AMD Ryzen 7 3700X
Motherboard MSI MAG B550 TOMAHAWK
Cooling AMD Wraith Prism
Memory Team Group Dark Pro 8Pack Edition 3600Mhz CL16
Video Card(s) NVIDIA GeForce RTX 3080 FE
Storage Kingston A2000 1TB + Seagate HDD workhorse
Display(s) Samsung 50" QN94A Neo QLED
Case Antec 1200
Power Supply Seasonic Focus GX-850
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 11
Yeah sadly these are more than capable of computing various ways of flattening Taiwan.
 
Joined
Jan 11, 2022
Messages
913 (0.85/day)
I still dont see it as an effective sanction. A sanction needs to prohibit or scale back by an order of magnitude. BTW i dont agree with the sanction BUT if you are going to do it, do it properly.
You can do that with a party that can’t bite back.
in this case there are a lot of things china could do to make things really hard for the Biden administration

this boils down to nothing but a bit of an export tax.

that causes a whole lot of extra pollution, which is kinda funny for this admin to do.
 
Joined
Sep 17, 2014
Messages
22,666 (6.05/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
I dont see how this sanction is effective. The cut down models will be cheaper. Buy two instead of one. Get 70+70=140% performance unless the drivers force single implementations only, in which case, virtualize.
Power is cost and if you have large scale datacenters you will want highly efficient stuff.
 
Joined
Aug 30, 2006
Messages
7,223 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
Power is cost and if you have large scale datacenters you will want highly efficient stuff.
Not sure where it says the H/A800 has poorer performance/watt than the H/A100 series.
 
Top