
AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,626 (0.98/day)
A new startup emerged from stealth mode today to power the next generation of generative AI. Etched makes an application-specific integrated circuit (ASIC) for processing transformers. The transformer is a deep learning architecture developed by Google, and it now underpins models like OpenAI's GPT-4o in ChatGPT, Anthropic's Claude, Google's Gemini, and Meta's Llama family. Etched set out to build an ASIC that runs only transformer models, and the result is a chip called Sohu. The company claims Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude. Where a server with eight NVIDIA H100 GPUs pushes Llama-3 70B at 25,000 tokens per second, and a server with eight of the latest B200 "Blackwell" GPUs pushes 43,000 tokens per second, an eight-chip Sohu server is claimed to output 500,000 tokens per second.

Why is this important? Not only is the ASIC claimed to outperform Hopper by 20x and Blackwell by more than 10x, but it also serves so many tokens per second that it could enable an entirely new class of AI applications requiring real-time output. Etched says the Sohu architecture is efficient enough that 90% of its FLOPS can be used, while traditional GPUs reach a 30-40% FLOP utilization rate. That gap means wasted silicon and power on GPUs, which Etched hopes to solve by building an accelerator dedicated to running transformers (the "T" in GPT) at massive scale. Given that frontier model development costs more than one billion US dollars, and hardware costs are measured in tens of billions of US dollars, an accelerator dedicated to a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (echoing the legendary "Attention Is All You Need" paper), and Etched wants to build on that.
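As a quick sanity check on the ratios above, the quoted throughput figures can simply be divided out. This is only a sketch: the token rates and utilization percentages are the vendor-supplied numbers from the article, not independent measurements, and the function names are made up for illustration.

```python
# Throughput figures as quoted in the article (vendor claims, Llama-3 70B).
TOKENS_PER_SECOND = {
    "8x H100": 25_000,
    "8x B200": 43_000,
    "8x Sohu": 500_000,
}


def speedup(baseline: str, candidate: str = "8x Sohu") -> float:
    """Ratio of claimed candidate throughput to claimed baseline throughput."""
    return TOKENS_PER_SECOND[candidate] / TOKENS_PER_SECOND[baseline]


def utilization_gain(asic_util: float, gpu_util: float) -> float:
    """How much more useful work per peak FLOP the ASIC would deliver.

    Assumes both chips have the same peak FLOPS, which the article
    does not state; this isolates the utilization effect only.
    """
    return asic_util / gpu_util


if __name__ == "__main__":
    for name in ("8x H100", "8x B200"):
        print(f"Sohu vs {name}: {speedup(name):.1f}x")
    # Claimed 90% utilization against the 30-40% range quoted for GPUs.
    print(f"Utilization gain vs 40% GPU: {utilization_gain(0.9, 0.4):.2f}x")
    print(f"Utilization gain vs 30% GPU: {utilization_gain(0.9, 0.3):.2f}x")
```

On these figures the Blackwell ratio works out closer to 11.6x than 10x, and utilization alone would explain only about 2.25x to 3x at equal peak FLOPS, so the rest of the claimed gap would have to come from higher peak throughput.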




However, there are some doubts going forward. While it is generally believed that transformers are the "future" of AI development, an ASIC only solves the problem until the underlying operations change. This is reminiscent of the crypto mining craze, which brought a few cycles of ASIC miners that are now worthless pieces of sand. Ethereum ASIC miners, for example, were built to mine ETH under proof of work, and since ETH transitioned to proof of stake, they are worthless.

Nonetheless, Etched wants the success formula to be simple: run transformer-based models on the Sohu ASIC with an open-source software ecosystem and scale it to massive sizes. While details are scarce, we know that the ASIC runs on 144 GB of HBM3E memory and that the chip is manufactured on TSMC's 4 nm process. Etched says this enables AI models with 100 trillion parameters, more than 55x bigger than GPT-4's reported 1.8 trillion parameter design.
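To put the parameter claim in perspective, here is a rough back-of-the-envelope sketch. It assumes the 144 GB figure is per chip and in decimal gigabytes, and it picks 1 or 2 bytes per parameter purely for illustration; none of that is confirmed by Etched, and the calculation ignores KV cache and activations entirely.

```python
PARAMS_CLAIMED = 100 * 10**12     # 100 trillion parameters (Etched's claim)
PARAMS_GPT4 = 18 * 10**11         # 1.8 trillion (figure quoted in the article)
HBM_BYTES_PER_CHIP = 144 * 10**9  # 144 GB, ASSUMED per chip, decimal GB


def size_ratio() -> float:
    """How many times larger the claimed model is than the quoted GPT-4 size."""
    return PARAMS_CLAIMED / PARAMS_GPT4


def chips_for_weights(params: int, bytes_per_param: int) -> int:
    """Minimum chips needed just to hold the weights (ceiling division)."""
    total_bytes = params * bytes_per_param
    return -(-total_bytes // HBM_BYTES_PER_CHIP)


if __name__ == "__main__":
    print(f"Size ratio: {size_ratio():.1f}x")
    # bytes_per_param values are illustrative assumptions (8-bit and 16-bit weights).
    for bytes_per_param in (1, 2):
        chips = chips_for_weights(PARAMS_CLAIMED, bytes_per_param)
        print(f"{bytes_per_param} byte(s)/param: at least {chips} chips for weights")
```

Under those assumptions, a 100 trillion parameter model would need on the order of 700 to 1,400 chips for the weights alone, so the claim is about cluster scale rather than a single eight-chip server.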

 

Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,387 (4.69/day)
Location
Kepler-186f
Processor 7800X3D -25 all core
Motherboard B650 Steel Legend
Cooling Frost Commander 140
Video Card(s) Merc 310 7900 XT @3100 core -.75v
Display(s) Agon 27" QD-OLED Glossy 240hz 1440p
Case NZXT H710 (Red/Black)
Audio Device(s) Asgard 2, Modi 3, HD58X
Power Supply Corsair RM850x Gold
RIP Nvidia?
 
Joined
Jan 5, 2006
Messages
18,584 (2.69/day)
System Name AlderLake
Processor Intel i7 12700K P-Cores @ 5Ghz
Motherboard Gigabyte Z690 Aorus Master
Cooling Noctua NH-U12A 2 fans + Thermal Grizzly Kryonaut Extreme + 5 case fans
Memory 32GB DDR5 Corsair Dominator Platinum RGB 6000MT/s CL36
Video Card(s) MSI RTX 2070 Super Gaming X Trio
Storage Samsung 980 Pro 1TB + 970 Evo 500GB + 850 Pro 512GB + 860 Evo 1TB x2
Display(s) 23.8" Dell S2417DG 165Hz G-Sync 1440p
Case Be quiet! Silent Base 600 - Window
Audio Device(s) Panasonic SA-PMX94 / Realtek onboard + B&O speaker system / Harman Kardon Go + Play / Logitech G533
Power Supply Seasonic Focus Plus Gold 750W
Mouse Logitech MX Anywhere 2 Laser wireless
Keyboard RAPOO E9270P Black 5GHz wireless
Software Windows 11
Benchmark Scores Cinebench R23 (Single Core) 1936 @ stock Cinebench R23 (Multi Core) 23006 @ stock
RIP Nvidia?


[GIF: "Working Stock Market" by Adult Swim]




That would be something...
 

dgianstefani

TPU Proofreader
Staff member
Joined
Dec 29, 2017
Messages
5,070 (2.00/day)
Location
Swansea, Wales
System Name Silent
Processor Ryzen 7800X3D @ 5.15ghz BCLK OC, TG AM5 High Performance Heatspreader
Motherboard ASUS ROG Strix X670E-I, chipset fans replaced with Noctua A14x25 G2
Cooling Optimus Block, HWLabs Copper 240/40 + 240/30, D5/Res, 4x Noctua A12x25, 1x A14G2, Mayhems Ultra Pure
Memory 32 GB Dominator Platinum 6150 MT 26-36-36-48, 56.6ns AIDA, 2050 FCLK, 160 ns tRFC, active cooled
Video Card(s) RTX 3080 Ti Founders Edition, Conductonaut Extreme, 18 W/mK MinusPad Extreme, Corsair XG7 Waterblock
Storage Intel Optane DC P1600X 118 GB, Samsung 990 Pro 2 TB
Display(s) 32" 240 Hz 1440p Samsung G7, 31.5" 165 Hz 1440p LG NanoIPS Ultragear, MX900 dual gas VESA mount
Case Sliger SM570 CNC Aluminium 13-Litre, 3D printed feet, custom front, LINKUP Ultra PCIe 4.0 x16 white
Audio Device(s) Audeze Maxwell Ultraviolet w/upgrade pads & LCD headband, Galaxy Buds 3 Pro, Razer Nommo Pro
Power Supply SF750 Plat, full transparent custom cables, Sentinel Pro 1500 Online Double Conversion UPS w/Noctua
Mouse Razer Viper V3 Pro 8 KHz Mercury White w/Tiger Ice Skates & Pulsar Supergrip tape, Razer Atlas
Keyboard Wooting 60HE+ module, TOFU-R CNC Alu/Brass, SS Prismcaps W+Jellykey, LekkerV2 mod, TLabs Leath/Suede
Software Windows 11 IoT Enterprise LTSC 24H2
Benchmark Scores Legendary
Joined
Oct 6, 2021
Messages
1,605 (1.38/day)
"The Sohu architecture is so efficient that 90% of the FLOPS can be used, while traditional GPUs boast a 30-40% FLOP utilization rate."

In some cases even less than that. Eventually, the limitations of silicon might compel companies to rethink their GPUs completely, but it's hard to say for sure.
 
Joined
Sep 6, 2013
Messages
3,371 (0.82/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 5500 / Ryzen 5 4600G / FX 6300 (12 years latter got to see how bad Bulldozer is)
Motherboard MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2) / Gigabyte GA-990XA-UD3
Cooling Νoctua U12S / Segotep T4 / Snowman M-T6
Memory 32GB - 16GB G.Skill RIPJAWS 3600+16GB G.Skill Aegis 3200 / 16GB JUHOR / 16GB Kingston 2400MHz (DDR3)
Video Card(s) ASRock RX 6600 + GT 710 (PhysX)/ Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes/ NVMes, SATA Storage / NVMe boot(Clover), SATA storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) ---- 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
GPUs can be used for many things, no matter how inefficient they might look. I think this is the reason why Intel's Gaudi isn't having as much success. While in the short term buying specialized hardware looks like the smart move from any perspective, that hardware could end up as a huge and highly expensive pile of garbage if things change somewhat, as mentioned in the article. With GPUs you adapt them or just throw them at other computational tasks.
 

64K

Joined
Mar 13, 2014
Messages
6,773 (1.72/day)
Processor i7 7700k
Motherboard MSI Z270 SLI Plus
Cooling CM Hyper 212 EVO
Memory 2 x 8 GB Corsair Vengeance
Video Card(s) Temporary MSI RTX 4070 Super
Storage Samsung 850 EVO 250 GB and WD Black 4TB
Display(s) Temporary Viewsonic 4K 60 Hz
Case Corsair Obsidian 750D Airflow Edition
Audio Device(s) Onboard
Power Supply EVGA SuperNova 850 W Gold
Mouse Logitech G502
Keyboard Logitech G105
Software Windows 10
It comes down to $$$ at the end of the day. Can this do the same thing as Nvidia GPUs for the same money, or preferably less? They are the relative unknown, and businesses are more comfortable sticking with the known, which gives Nvidia the edge.
 
Joined
Feb 18, 2005
Messages
5,847 (0.81/day)
Location
Ikenai borderline!
System Name Firelance.
Processor Threadripper 3960X
Motherboard ROG Strix TRX40-E Gaming
Cooling IceGem 360 + 6x Arctic Cooling P12
Memory 8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s) MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage 2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s) 3x AOC Q32E2N (32" 2560x1440 75Hz)
Case Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply Fractal Design Ion+ 2 Platinum 760W
Mouse Logitech G602
Keyboard Razer Pro Type Ultra
Software Windows 10 Professional x64
I wonder how long before somebody creates an "AI" company called "Grift".
 
Joined
Nov 4, 2005
Messages
11,994 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
"The Sohu architecture is so efficient that 90% of the FLOPS can be used, while traditional GPUs boast a 30-40% FLOP utilization rate."

In some cases even less than that. Eventually, the limitations of silicon might compel companies to rethink their GPUs completely, but it's hard to say for sure.


Just look at how AMD screwed the 7900XTX in benchmarking: dual issue doesn't work unless it's explicitly in the code, meaning that while it performs great with game-ready drivers, generic benchmarks or unaware software suffer at half the performance in many situations. GPU hardware is slowly converging on one standard, like x86-64 or ARM. Pretty soon it's going to be like ARM hardware: you check the boxes for your application, and silicon (or whatever substrate) is shipped to you.


I wonder how long before somebody creates an "AI" company called "Grift".

The milk maids are a milking.......
 
Joined
Jan 8, 2017
Messages
9,479 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Just look at how AMD screwed the 7900XTX in benchmarking: dual issue doesn't work unless it's explicitly in the code
Nvidia architectures struggle with utilization just as much; AD102 has 30% more FP32 units than Navi31 but is only 20% faster in raster. In fact, every architecture does, CPUs included.
 
Joined
Aug 26, 2021
Messages
382 (0.32/day)
I wonder if ASICs coming out, or soon to be coming out, is the reason for the Nvidia stock drop. Say what you want about tech investors, but they are a very tech-savvy, clued-in bunch, and maybe more ASIC products are on the way. Or maybe my tinfoil hat needs to be pressed, folded and recycled.
 
Joined
Jan 14, 2023
Messages
836 (1.20/day)
System Name Asus G16
Processor i9 13980HX
Motherboard Asus motherboard
Cooling 2 fans
Memory 32gb 4800mhz
Video Card(s) 4080 laptop
Storage 16tb, x2 8tb SSD
Display(s) QHD+ 16in 16:10 (2560x1600, WQXGA) 240hz
Power Supply 330w psu
I wonder if ASICs coming out, or soon to be coming out, is the reason for the Nvidia stock drop. Say what you want about tech investors, but they are a very tech-savvy, clued-in bunch, and maybe more ASIC products are on the way. Or maybe my tinfoil hat needs to be pressed, folded and recycled.
I sold all my NVDA stock once it hit 1100 (or 110 post-split). It was a nice run. I still own AMD stock; I bought that right after Zen 1 was released, but I didn't buy very much of it.
 
Joined
Apr 24, 2020
Messages
2,721 (1.61/day)
There's a ton of architectures that are better than NVidia or GPUs in general for AI.

The fundamental fact is that NVidia GPUs are doing FP16 4x4 matrix multiplications as their basis. You can gain significantly more efficiency by going to an 8x8 or 16x16 matrix (or go TPU and use a full 256x256 matrix). The matrices in these "Deep Learning AI" models are all huge, so processing bigger and bigger matrices at a time leads to more efficiency in power, area, etc.

The issue is that the 4x4 matrix multiplication was chosen because it fits in GPU register space. It's basically the best a general-purpose GPU can do on NVidia's architecture for various reasons. I'd expect that if a few more registers (or 64-way CDNA cores from AMD) were used, then maybe 8x4 or 8x8 sizes could be possible, but even AMD is doing 4x4 matrix sizes on their GPUs. So 4x4 is it.

Anyone can just take a bigger fundamental matrix, write software that efficiently splits up the work as 8x8 or 16x16 (or Google TPU it to 256x256 splits) and get far better efficiency. It's not a secret, and such "systolic arrays" are cake to do from an FPGA perspective. The issue is that these bigger architectures are "not GPUs" anymore, and will be useless outside of AI. And furthermore, you have even more competitors (Google TPU in particular) who you actually should be gunning for.

No one is buying NVidia GPUs to lead in AI. They're buying GPUs so that they have something else to do if the AI bubble pops. It's a hedged bet. If you go 100% AI with your ASIC chip (like Google or this "Etched" company), you're absolutely going to get eff'd when the AI bubble pops, as all those chips suddenly become worthless. The NVidia GPUs will lose value, but there are still other compute projects you can do with them afterwards.
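The tile-size argument in this post can be illustrated with a toy model. The sketch below multiplies two matrices tile by tile and counts how many operand elements have to be fetched per multiply-accumulate. It is a simplified assumption-laden model (one fetch of an A tile and a B tile per tile operation, no caches, square matrices), not a description of how any real GPU or TPU is built, and `blocked_matmul` is a hypothetical helper.

```python
def blocked_matmul(a, b, tile):
    """Multiply square matrices a and b in tile x tile blocks.

    Returns (product, operand_loads, macs), where operand_loads counts
    the A and B elements fetched under the toy model described above.
    """
    n = len(a)
    assert n % tile == 0, "matrix size must be a multiple of the tile size"
    c = [[0] * n for _ in range(n)]
    loads = 0
    macs = 0
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # One tile operation: fetch one tile of A and one tile of B.
                loads += 2 * tile * tile
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        acc = c[i][j]
                        for k in range(k0, k0 + tile):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
                # Each tile operation performs tile^3 multiply-accumulates.
                macs += tile ** 3
    return c, loads, macs


if __name__ == "__main__":
    n = 16
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i - j for j in range(n)] for i in range(n)]
    for tile in (4, 8, 16):
        _, loads, macs = blocked_matmul(a, b, tile)
        print(f"tile {tile:2d}x{tile:<2d}: {loads:5d} loads, "
              f"{macs} MACs, {macs / loads:.0f} MACs per load")
```

In this model the work done per operand fetched grows linearly with the tile edge (2, 4 and 8 MACs per load for 4x4, 8x8 and 16x16), which is the efficiency gain the post is describing.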
 
Joined
May 7, 2023
Messages
670 (1.15/day)
Processor Ryzen 5700x
Motherboard Gigabyte Auros Elite AX V2
Cooling Thermalright Peerless Assassin SE White
Memory TeamGroup T-Force Delta RGB 32GB 3600Mhz
Video Card(s) PowerColor Red Dragon Rx 6800
Storage Fanxiang S660 1TB, Fanxiang S500 Pro 1TB, BraveEagle 240GB SSD, 2TB Seagate HDD
Case Corsair 4000D White
Power Supply Corsair RM750x SHIFT
GPUs can be used for many things, no matter how inefficient they might look. I think this is the reason why Intel's Gaudi isn't having as much success. While in the short term buying specialized hardware looks like the smart move from any perspective, that hardware could end up as a huge and highly expensive pile of garbage if things change somewhat, as mentioned in the article. With GPUs you adapt them or just throw them at other computational tasks.
When it gets to that point with an ASIC, you are probably many GPU generations ahead anyway. So do you keep buying GPUs at extortionate prices, or buy an ASIC and replace it when it becomes obsolete? People talk like GPUs don't become obsolete and turn into e-waste... they indeed do, once performance, efficiency, instruction sets, APIs, etc. fall behind the latest generation.
 
Joined
May 26, 2023
Messages
43 (0.08/day)
There's a ton of architectures that are better than NVidia or GPUs in general for AI.

The fundamental fact is that NVidia GPUs are doing FP16 4x4 matrix multiplications as their basis. You can gain significantly more efficiency by going to an 8x8 or 16x16 matrix (or go TPU and use a full 256x256 matrix). The matrices in these "Deep Learning AI" models are all huge, so processing bigger and bigger matrices at a time leads to more efficiency in power, area, etc.

The issue is that the 4x4 matrix multiplication was chosen because it fits in GPU register space. It's basically the best a general-purpose GPU can do on NVidia's architecture for various reasons. I'd expect that if a few more registers (or 64-way CDNA cores from AMD) were used, then maybe 8x4 or 8x8 sizes could be possible, but even AMD is doing 4x4 matrix sizes on their GPUs. So 4x4 is it.

Anyone can just take a bigger fundamental matrix, write software that efficiently splits up the work as 8x8 or 16x16 (or Google TPU it to 256x256 splits) and get far better efficiency. It's not a secret, and such "systolic arrays" are cake to do from an FPGA perspective. The issue is that these bigger architectures are "not GPUs" anymore, and will be useless outside of AI. And furthermore, you have even more competitors (Google TPU in particular) who you actually should be gunning for.

No one is buying NVidia GPUs to lead in AI. They're buying GPUs so that they have something else to do if the AI bubble pops. It's a hedged bet. If you go 100% AI with your ASIC chip (like Google or this "Etched" company), you're absolutely going to get eff'd when the AI bubble pops, as all those chips suddenly become worthless. The NVidia GPUs will lose value, but there are still other compute projects you can do with them afterwards.

Nobody is hedging with ASICs that cost more than $30000 which are only worth that price at AI. Which is why AMD is #1 on Top 500 and Intel is #2.

You are not doing really much else with A100s and later, because they are not V100s. They are very, very much specialized for AI calculations and economically worthless for anything else.
 
Joined
Apr 24, 2020
Messages
2,721 (1.61/day)
Nobody is hedging with ASICs that cost more than $30000 which are only worth that price at AI. Which is why AMD is #1 on Top 500 and Intel is #2.

You are not doing really much else with A100s and later, because they are not V100s. They are very, very much specialized for AI calculations and economically worthless for anything else.

A100 and H100 are still better at FP64 and FP32 than their predecessors. They're outrageously expensive because of the AI chips, but the overall GPU-performance (aka: traditional FP64 physics modeling performance) is still outstanding.

As such, the H100 is still a hedge. If AI collapses tomorrow, I'd rather have an H100 than an "Etched" AI ASIC. Despite being a hedge, the H100 is still the market leader in practical performance, thanks to all the software optimizations in CUDA (even if its low-level 4x4 matrix multiplication routines are much smaller and less efficient than large 8x8 or 16x16 competitors).
 
Joined
Jan 8, 2017
Messages
9,479 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
As such, the H100 is still a hedge.
I really doubt it. Things like MI300 look to be much faster in general-purpose compute and likely a lot cheaper. If demand for ML drops off a cliff, you don't want these on your hands; it will take ages until you break even.

Nvidia really doesn't treat these as anything more than ML accelerators despite them still being "GPUs" technically, they have far inferior FP64/FP16 performance compared to MI300 for example.
 
Joined
Jun 18, 2021
Messages
2,554 (2.01/day)
GPUs can be used for many things, no matter how inefficient they might look. I think this is the reason why Intel's Gaudi isn't having as much success. While in the short term buying specialized hardware looks like the smart move from any perspective, that hardware could end up as a huge and highly expensive pile of garbage if things change somewhat, as mentioned in the article. With GPUs you adapt them or just throw them at other computational tasks.

Let's put things a different way: everything is an ASIC, and the A (application) can just be more or less generic. A GPU is an ASIC designed for a wide range of applications, while this startup thingy is designed for a very specific set of instructions. A TPU or NPU is not as generic as a GPU, but also not as constrained as what's typically referred to as an ASIC, like this thingy. Right now everyone is using Nvidia GPUs because the software stack is very robust and things are still developing too quickly to become tied to a specific instruction set.

That will eventually change.

I wonder if ASICs coming out, or soon to be coming out, is the reason for the Nvidia stock drop. Say what you want about tech investors, but they are a very tech-savvy, clued-in bunch, and maybe more ASIC products are on the way. Or maybe my tinfoil hat needs to be pressed, folded and recycled.

Nah, they're just as dumb as anyone else, otherwise nvidia wouldn't be the most valuable company in the world right now.

Anyone can just take a bigger fundamental matrix, write software that efficiently splits up the work as 8x8 or 16x16 (or Google TPU it to 256x256 splits) and get far better efficiency. It's not a secret, and such "systolic arrays" are cake to do from an FPGA perspective. The issue is that these bigger architectures are "not GPUs" anymore, and will be useless outside of AI. And furthermore, you have even more competitors (Google TPU in particular) who you actually should be gunning for.

How does intel XMX architecture fare with that?

No one is buying NVidia GPUs to lead in AI. They're buying GPUs so that they have something else to do if the AI bubble pops. It's a hedged bet. If you go 100% AI with your ASIC chip (like Google or this "Etched" company), you're absolutely going to get eff'd when the AI bubble pops, as all those chips suddenly become worthless. The NVidia GPUs will lose value, but there are still other compute projects you can do with them afterwards.

I don't think it's about hedging their bets. I think it's just a case of what's available and easy to start with, because of all the work Nvidia has already put into a robust software stack.
 
Joined
Sep 15, 2011
Messages
6,748 (1.40/day)
Processor Intel® Core™ i7-13700K
Motherboard Gigabyte Z790 Aorus Elite AX
Cooling Noctua NH-D15
Memory 32GB(2x16) DDR5@6600MHz G-Skill Trident Z5
Video Card(s) ZOTAC GAMING GeForce RTX 3080 AMP Holo
Storage 2TB SK Platinum P41 SSD + 4TB SanDisk Ultra SSD + 500GB Samsung 840 EVO SSD
Display(s) Acer Predator X34 3440x1440@100Hz G-Sync
Case NZXT PHANTOM410-BK
Audio Device(s) Creative X-Fi Titanium PCIe
Power Supply Corsair 850W
Mouse Logitech Hero G502 SE
Software Windows 11 Pro - 64bit
Benchmark Scores 30FPS in NFS:Rivals
Are any of those claimed results verified by somebody??
I can claim the sea and the sun, but without actual proof and 3rd-party confirmation, I'm just dust in the wind...
 
Joined
Apr 24, 2020
Messages
2,721 (1.61/day)
Are any of those claimed results verified by somebody??
I can claim the sea and the sun, but without actual proof and 3rd-party confirmation, I'm just dust in the wind...

The benefits, and downsides, of a textbook systolic array architecture are well known and well studied.

If you know how memory is going to move, then you can hardwire the data movements to occur. A hardwired data movement is just that: a wire. It's not even a transistor... a dumb wire is the cheapest thing in cost, power and has instantaneous performance.

The problem with hardwired data movements is that they're hardwired. They literally cannot do anything else. If it's add then multiply, the hardwired path will only ever do add then multiply (unlike a CPU or GPU, which can change the order or the data and do other things).

I can certainly believe that a large systolic array is far faster at this job. But the downside is that it's... hardwired. Unchanging. Inflexible.
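For anyone who hasn't seen one, the fixed data movement described above can be sketched as a tiny software simulation of a textbook output-stationary systolic array. This is a generic illustration, not Etched's or Google's design: each cell only ever takes a value from its left and top neighbours, multiplies them, and adds the product to its own accumulator. The function name and layout are assumptions for the example.

```python
def systolic_matmul(a, b):
    """Simulate an n x n output-stationary systolic array computing a @ b.

    Rows of `a` enter from the left edge and columns of `b` from the top
    edge, each skewed by one cycle per row/column. Every cycle, each cell
    passes its A value right and its B value down (the "wires"), then
    multiply-accumulates whatever it currently holds.
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]    # one stationary accumulator per cell
    a_reg = [[0] * n for _ in range(n)]  # A values travelling left to right
    b_reg = [[0] * n for _ in range(n)]  # B values travelling top to bottom
    cycles = 3 * n - 2                   # time for the last operand to reach the far corner
    for t in range(cycles):
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    k = t - i  # skewed feed: row i starts i cycles late
                    new_a[i][j] = a[i][k] if 0 <= k < n else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:
                    k = t - j  # skewed feed: column j starts j cycles late
                    new_b[i][j] = b[k][j] if 0 <= k < n else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b
        for i in range(n):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


if __name__ == "__main__":
    product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(product, "in", cycles, "cycles")  # [[19, 22], [43, 50]] in 4 cycles
```

Note that there is no instruction stream anywhere in the loop: the only thing the array can do is this one pattern, which is exactly the trade-off being described.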

--------

Systolic arrays were deployed for error correction back in the CD-ROM days, since the error correction always had the same order of math in a regular matrix multiplication pattern. Same with hardware RAID or other ASICs. They've been superior in performance for decades.

The question of ASIC AI accelerators is not about the performance benefits. The pure question is if it is a worthy $Billion-ish investment and business plan. It's only a good idea if it makes all the money back.
 
Joined
Feb 12, 2021
Messages
220 (0.16/day)
Let's put things a different way, everything is an ASIC, the A (application) can just be more or less generic.
The A ("Application") is nothing more than a generic word that has no meaning in this context until it is placed next to the S ("Specific"). "Application-Specific" is very, VERY different from "Application" in this context, and I would personally take ASIC and all four of its letters/words together as one, because that is how it is meant to be understood and used. I have not read anything else you wrote beyond this point, because all of your arguments hinge on this point about "Application" vs "Application-Specific" in discussing this proposed new ASIC product.
 
Joined
Aug 17, 2023
Messages
77 (0.16/day)
And what if someone comes up with a better model? The transformer is probably just the beginning.
 
Joined
Jan 11, 2022
Messages
898 (0.84/day)
And what if someone comes up with a better model? The transformer is probably just the beginning.
Like mining, you need new machines.

Unlike mining, though, you get to keep running that old workload, maybe for a cheaper subscription fee, as it's still worth something.

Before the as-a-service model, buying something and having it do that specific thing until you bought something new was normal.
 