
Meta Announces New MTIA AI Accelerator with Improved Performance to Ease NVIDIA's Grip

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,644 (0.99/day)
Meta has announced the next generation of its Meta Training and Inference Accelerator (MTIA) chip, which is designed for AI training and inference at scale. The newest MTIA chip is the second generation of Meta's custom AI silicon, and it is built on TSMC's 5 nm technology. Running at 1.35 GHz, the new chip gets a boost to 90 Watts of TDP per package, compared to just 25 Watts for the first-generation design. Basic Linear Algebra Subprograms (BLAS) processing is where the chip shines, which includes matrix multiplication and vector/SIMD processing. In GEMM matrix processing, each chip delivers 708 TeraFLOPS at INT8 (the spec presumably means FP8) with sparsity and 354 TeraFLOPS without, and 354 TeraFLOPS at FP16/BF16 with sparsity and 177 TeraFLOPS without.

Classical vector processing is a bit slower at 11.06 TeraFLOPS at INT8 (FP8), 5.53 TeraFLOPS at FP16/BF16, and 2.76 TeraFLOPS at single-precision FP32. The MTIA chip is specifically designed to run AI training and inference on Meta's PyTorch AI framework, with an open-source Triton backend that produces compiler code for optimal performance. Meta uses this for all its Llama models, and with Llama3 just around the corner, it could be trained on these chips. To package it into a system, Meta puts two of these chips onto a board and pairs them with 128 GB of LPDDR5 memory. The board is connected via PCIe Gen 5 to a system where 12 boards are stacked densely. This arrangement is repeated six times in a single rack, for 72 boards and 144 chips per rack and a total of 101.95 PetaFLOPS, assuming linear scaling at INT8 (FP8) precision. Of course, perfectly linear scaling is not achievable in scale-out systems, which could bring the figure to under 100 PetaFLOPS per rack.
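The rack-level figure can be reproduced from the per-chip numbers quoted above. The following is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, the 708 TeraFLOPS value is the with-sparsity INT8 (FP8) GEMM peak, and the scaling-efficiency parameter is an assumption, not something Meta has published.

```python
# Figures quoted in the article (peak per chip, INT8/FP8 GEMM with sparsity).
TFLOPS_PER_CHIP_SPARSE = 708.0
CHIPS_PER_BOARD = 2
BOARDS_PER_SYSTEM = 12
SYSTEMS_PER_RACK = 6


def rack_totals(scaling_efficiency: float = 1.0):
    """Return (boards, chips, petaflops) for one rack.

    scaling_efficiency is a hypothetical knob: 1.0 means perfectly
    linear scaling, anything lower models scale-out losses.
    """
    boards = BOARDS_PER_SYSTEM * SYSTEMS_PER_RACK
    chips = boards * CHIPS_PER_BOARD
    petaflops = chips * TFLOPS_PER_CHIP_SPARSE * scaling_efficiency / 1000.0
    return boards, chips, petaflops


boards, chips, peak_pflops = rack_totals()
# Efficiency below which the rack drops under 100 PetaFLOPS.
break_even = 100.0 / peak_pflops

print(f"{boards} boards, {chips} chips, {peak_pflops:.2f} PFLOPS peak")
print(f"under 100 PFLOPS once scaling efficiency falls below {break_even:.1%}")
```

Under these assumptions, a loss of only about 2% from ideal scaling is enough to push the rack below the 100 PetaFLOPS mark.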



Below, you can see images of the chip floorplan, specifications compared to the prior version, as well as the system.




 
Joined
Jan 3, 2021
Messages
3,589 (2.48/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Thanks for correcting Meta's "TFLOPS/s" from their AI-generated specifications list to TFLOPS.
 
Joined
Jul 31, 2020
Messages
17 (0.01/day)
System Name work PC
Processor AMD Ryzen 5 1600
Motherboard ASRock AB350 Pro4
Cooling DeepCool GAMMAXX C40
Memory 32GB (HyperX Fury 16GB DDR4 PC4-21300 x2)
Video Card(s) Palit GeForce RTX 2070 Dual 8GB
Storage Samsung SSD 970 EVO Plus 250GB + 4TB WD R HDD
Display(s) LG (27") + ViewSonic VP2030b (20")
Case Zalman S1
Audio Device(s) Superlux HD 681, etc.
Mouse Redragon M650
Keyboard logitech K280e
Yeah, I noticed that as well. The S literally stands for "second", so not sure why they'd do it again. ¯\_(ツ)_/¯
AI acceleration, you know...
 
Joined
Feb 23, 2016
Messages
135 (0.04/day)
System Name Computer!
Processor i7-6700K
Motherboard AsRock Z170 Extreme 7+
Cooling EKWB on CPU & GPU, 240 slim and 360 Monsta, Aquacomputer Aquabus D5, Aquaaero 6 Pro.
Memory 32Gb Kingston Hyper-X 3Ghz
Video Card(s) Asus 980 Ti Strix
Storage 2 x 950 Pro
Display(s) Old Acer thing
Case NZXT 440 Modded
Audio Device(s) onboard
Power Supply Seasonic PII 600W Platinum
Mouse Razer Deathadder Chroma
Keyboard Logitech G15
Software Win 10 Pro
Anyone notice from spec sheet: nearly 4x power for 'only' 2x performance of 1st Gen?

Edit: Was looking at the Instances flops, maybe that is a bad comparison ¯\_(ツ)_/¯
 
Joined
Jan 3, 2021
Messages
3,589 (2.48/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Lol, I get it, nice joke. Meters per second per second and all that.
But if they are actually right... then we better run... and run fast!
 
Joined
Dec 30, 2021
Messages
394 (0.36/day)
Anyone notice from spec sheet: nearly 4x power for 'only' 2x performance of 1st Gen?

Edit: Was looking at the Instances flops, maybe that is a bad comparison ¯\_(ツ)_/¯
I don't know much about AI compute specs, but it's definitely over 3x faster in several metrics in their spec sheet. This is a card designed primarily for their own use, so it only really needs to be faster in the ways that matter to them, anyway.
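For a rough check of the power-versus-performance point, here is a small sketch using only the two TDP figures from the article (25 W and 90 W). The speedup values in the loop are hypothetical placeholders, not measured numbers, and TDP is only a proxy for real power draw.

```python
# TDP figures quoted in the article.
TDP_GEN1_W = 25.0
TDP_GEN2_W = 90.0


def perf_per_watt_ratio(speedup: float) -> float:
    """Gen 2 performance-per-TDP-watt relative to gen 1.

    speedup is gen-2 throughput divided by gen-1 throughput on some
    metric; a result above 1.0 means efficiency improved.
    """
    power_ratio = TDP_GEN2_W / TDP_GEN1_W  # 3.6x, i.e. "nearly 4x"
    return speedup / power_ratio


# Hypothetical speedups, for illustration only.
for speedup in (2.0, 3.0, 3.6, 7.0):
    ratio = perf_per_watt_ratio(speedup)
    print(f"{speedup:.1f}x faster -> {ratio:.2f}x perf per TDP watt")
```

On this simple measure, a metric has to improve by more than 3.6x before performance per TDP watt is better than on the first generation.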
 
Joined
Nov 4, 2005
Messages
12,007 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
I too RPMs at that.
 
Joined
Feb 20, 2020
Messages
9,340 (5.29/day)
Location
Louisiana
System Name Ghetto Rigs z490|x99|Acer 17 Nitro 7840hs/ 5600c40-2x16/ 4060/ 1tb acer stock m.2/ 4tb sn850x
Processor 10900k w/Optimus Foundation | 5930k w/Black Noctua D15
Motherboard z490 Maximus XII Apex | x99 Sabertooth
Cooling oCool D5 res-combo/280 GTX/ Optimus Foundation/ gpu water block | Blk D15
Memory Trident-Z Royal 4000c16 2x16gb | Trident-Z 3200c14 4x8gb
Video Card(s) Titan Xp-water | evga 980ti gaming-w/ air
Storage 970evo+500gb & sn850x 4tb | 860 pro 256gb | Acer m.2 1tb/ sn850x 4tb| Many2.5" sata's ssd 3.5hdd's
Display(s) 1-AOC G2460PG 24"G-Sync 144Hz/ 2nd 1-ASUS VG248QE 24"/ 3rd LG 43" series
Case D450 | Cherry Entertainment center on Test bench
Audio Device(s) Built in Realtek x2 with 2-Insignia 2.0 sound bars & 1-LG sound bar
Power Supply EVGA 1000P2 with APC AX1500 | 850P2 with CyberPower-GX1325U
Mouse Redragon 901 Perdition x3
Keyboard G710+x3
Software Win-7 pro x3 and win-10 & 11pro x3
Benchmark Scores Are in the benchmark section
Hi,
More worrying is Meta's grip, a known bad actor hehe
 

the54thvoid

Super Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
13,106 (2.39/day)
Location
Glasgow - home of formal profanity
Processor Ryzen 7800X3D
Motherboard MSI MAG Mortar B650 (wifi)
Cooling be quiet! Dark Rock Pro 4
Memory 32GB Kingston Fury
Video Card(s) Gainward RTX4070ti
Storage Seagate FireCuda 530 M.2 1TB / Samsumg 960 Pro M.2 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Asus Prime AP201
Audio Device(s) On Board
Power Supply be quiet! Pure POwer M12 850w Gold (ATX3.0)
Software W10
Hi,
More worrying is Meta's grip, a known bad actor hehe

They're a bad actor for all sides - that makes them chaotic evil, I believe.

And with a bit of PR, they think they can swipe away Nvidia's marketshare? Don't think it works that way.
 
Joined
Jan 3, 2021
Messages
3,589 (2.48/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
This is a card designed primarily for their own use, so it only really needs to be faster in the ways that matter to them, anyway.
They have published detailed specs, so it looks like they are going to sell the card to others too.
 
Joined
Nov 27, 2023
Messages
2,496 (6.43/day)
System Name The Workhorse
Processor AMD Ryzen R9 5900X
Motherboard Gigabyte Aorus B550 Pro
Cooling CPU - Noctua NH-D15S Case - 3 Noctua NF-A14 PWM at the bottom, 2 Fractal Design 180mm at the front
Memory GSkill Trident Z 3200CL14
Video Card(s) NVidia GTX 1070 MSI QuickSilver
Storage Adata SX8200Pro
Display(s) LG 32GK850G
Case Fractal Design Torrent (Solid)
Audio Device(s) FiiO E-10K DAC/Amp, Samson Meteorite USB Microphone
Power Supply Corsair RMx850 (2018)
Mouse Razer Viper (Original) on a X-Raypad Equate Plus V2
Keyboard Cooler Master QuickFire Rapid TKL keyboard (Cherry MX Black)
Software Windows 11 Pro (24H2)
And with a bit of PR, they think they can swipe away Nvidia's marketshare? Don't think it works that way.
It’s Meta, the company who apparently thought that saying “metaverse” enough times and even rebranding themselves as it would inevitably lead to said stupid idea becoming a reality and making them bank. They are Chaotic Evil alright, also in the sense that any rational thought had left the building a while ago.
 
Joined
Jan 3, 2021
Messages
3,589 (2.48/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
Hi,
More worrying is Meta's grip, a known bad actor hehe
At least they've released some interesting stuff as open source - maybe they'll release the software ecosystem for this chip.
 
Joined
Jun 1, 2021
Messages
310 (0.24/day)
Thanks for correcting Meta's "TFLOPS/s" from their AI-generated specifications list to TFLOPS.
It's even funnier because they put it as

Classical vector processing is a bit slower at 11.06 TeraFLOPS at INT8 (...)
INT8 isn't 'FLOPS', as FLOPS are 'FLoating point OPerations per Second'
There is no floating point in int8...
 

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,644 (0.99/day)
It's even funnier because they put it as


INT8 isn't 'FLOPS', as FLOPS are 'FLoating point OPerations per Second'
There is no floating point in int8...
Which is true. I assume they meant FP8, which is the hot new low-precision format everyone is pushing.
 
Joined
Sep 8, 2009
Messages
1,077 (0.19/day)
Location
Porto
Processor Ryzen 9 5900X
Motherboard Gigabyte X570 Aorus Pro
Cooling AiO 240mm
Memory 2x 32GB Kingston Fury Beast 3600MHz CL18
Video Card(s) Radeon RX 6900XT Reference (amd.com)
Storage O.S.: 256GB SATA | 2x 1TB SanDisk SSD SATA Data | Games: 1TB Samsung 970 Evo
Display(s) LG 34" UWQHD
Audio Device(s) X-Fi XtremeMusic + Gigaworks SB750 7.1 THX
Power Supply XFX 850W
Mouse Logitech G502 Wireless
VR HMD Lenovo Explorer
Software Windows 10 64bit
Unless Meta is going to be selling these as AIB products, it's not really going to ease Nvidia's grip.

Nvidia has plenty of competition in cloud services regardless of the hardware. What the market is lacking is competition in hardware that can be bought to run in the clients' installations.
 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
27,066 (3.83/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) 49" Philips Evnia OLED (49M2C8900)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Gunnr
Power Supply Seasonic Prime TX-1600
Mouse Razer Viper mini signature edition (mercury white)
Keyboard Monsgeek M3 Lavender, Moondrop Luna lights
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I dont have time for that.
Unless Meta is going to be selling these as AIB products, it's not really going to ease Nvidia's grip.
It eases the grip Nvidia has on them.
 
Joined
May 3, 2018
Messages
2,881 (1.19/day)
Honestly, if it came to a choice, I would choose Nvidia over Meta any day of the week. Fcukerberg is one of the three biggest scum on the planet. Huang is an amateur compared to this clown. I put Meta and Google in the same category. Nvidia is the next tier down.
 