
AMD Files Patent for Chiplet Machine Learning Accelerator to be Paired With GPU, Cache Chiplets

Raevenlord

News Editor
Joined
Aug 12, 2016
Messages
3,755 (1.24/day)
Location
Portugal
System Name The Ryzening
Processor AMD Ryzen 9 5900X
Motherboard MSI X570 MAG TOMAHAWK
Cooling Lian Li Galahad 360mm AIO
Memory 32 GB G.Skill Trident Z F4-3733 (4x 8 GB)
Video Card(s) Gigabyte RTX 3070 Ti
Storage Boot: Transcend MTE220S 2TB, Kingston A2000 1TB, Seagate Firewolf Pro 14 TB
Display(s) Acer Nitro VG270UP (1440p 144 Hz IPS)
Case Lian Li O11DX Dynamic White
Audio Device(s) iFi Audio Zen DAC
Power Supply Seasonic Focus+ 750 W
Mouse Cooler Master Masterkeys Lite L
Keyboard Cooler Master Masterkeys Lite L
Software Windows 10 x64
AMD has filed a patent describing an MLA (Machine Learning Accelerator) chiplet design that can be paired with a GPU unit (such as RDNA 3) and a cache unit (likely a GPU-excised version of AMD's Infinity Cache design, which debuted with RDNA 2) to create what AMD is calling an "APD" (Accelerated Processing Device). The design would thus enable AMD to create a chiplet-based machine learning accelerator whose sole function would be to accelerate machine learning - specifically, matrix multiplication. This would enable capabilities not unlike those available through NVIDIA's Tensor cores.

This could give AMD a modular way to add machine-learning capabilities to several of its designs through the inclusion of such a chiplet, and might be AMD's way of achieving hardware acceleration of a DLSS-like feature. This would avoid the shortcomings of implementing it in the GPU die itself - an increase in overall die area, and thus increased cost and reduced yields - while at the same time enabling AMD to deploy it in products other than GPU packages. The patent describes the possibility of different manufacturing technologies being employed in the chiplet-based design - harkening back to the I/O modules in Ryzen CPUs, manufactured on a 12 nm process rather than the 7 nm one used for the core chiplets. The patent also describes acceleration of cache requests from the GPU die to the cache chiplet, and on-the-fly usage of it either as actual cache or as directly-addressable memory.
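
To make the die-area argument concrete, here is a rough sketch using the simple Poisson yield model (yield = exp(-area x defect density)). The defect density, the die areas and the flat cost per square centimetre are all made-up illustrative numbers, not figures from the patent or from AMD, and packaging and interconnect costs are ignored entirely.

```python
import math


def poisson_yield(area_cm2: float, defect_density: float) -> float:
    """Fraction of defect-free dies under the simple Poisson yield model."""
    return math.exp(-area_cm2 * defect_density)


def silicon_cost_per_good_die(area_cm2: float, defect_density: float,
                              cost_per_cm2: float = 1.0) -> float:
    """Silicon cost of one working die: raw area cost divided by yield."""
    return area_cm2 * cost_per_cm2 / poisson_yield(area_cm2, defect_density)


# Hypothetical numbers: 0.1 defects/cm^2, a 5 cm^2 GPU and 1 cm^2 of ML logic.
D0 = 0.1

# Option A: ML logic added to the GPU die (one 6 cm^2 monolithic die).
monolithic = silicon_cost_per_good_die(6.0, D0)

# Option B: GPU die plus a separate 1 cm^2 ML accelerator chiplet.
chiplets = silicon_cost_per_good_die(5.0, D0) + silicon_cost_per_good_die(1.0, D0)

print(f"monolithic: {monolithic:.2f}  chiplets: {chiplets:.2f} (relative units)")
```

Under these assumptions the split design needs less silicon per working product, because a defect only scraps the small die it lands on. Whether that survives real packaging costs is a separate question this sketch does not answer.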



 

FreedomEclipse

~Technological Technocrat~
Joined
Apr 20, 2007
Messages
24,060 (3.74/day)
Location
London,UK
System Name DarnGosh Edition
Processor AMD 7800X3D
Motherboard MSI X670E GAMING PLUS
Cooling Thermalright AM5 Contact Frame + Phantom Spirit 120SE
Memory 2x32GB G.Skill Trident Z5 NEO DDR5 6000 CL32-38-38-96
Video Card(s) Asus Dual Radeon™ RX 6700 XT OC Edition
Storage WD SN770 1TB (Boot)| 2x 2TB WD SN770 (Gaming)| 2x 2TB Crucial BX500| 2x 3TB Toshiba DT01ACA300
Display(s) LG GP850-B
Case Corsair 760T (White) {1xCorsair ML120 Pro|5xML140 Pro}
Audio Device(s) Yamaha RX-V573|Speakers: JBL Control One|Auna 300-CN|Wharfedale Diamond SW150
Power Supply Seasonic Focus GX-850 80+ GOLD
Mouse Logitech G502 X
Keyboard Duckyshine Dead LED(s) III
Software Windows 11 Home
Benchmark Scores ლ(ಠ益ಠ)ლ
Joined
Sep 1, 2020
Messages
2,353 (1.52/day)
Location
Bulgaria
LoL. Artificial intellect in a GPU? Will it be possible to talk in street language with next-gen graphics cards?
 
Joined
Jan 8, 2017
Messages
9,436 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Interesting, clearly GPUs are being left in the dust by dedicated accelerators for certain computations. But dedicated accelerators are inflexible, while GPUs are fully programmable and can run just about anything, and both have a major problem: memory bandwidth. This is a nice way of solving all the problems.
 
Joined
Aug 12, 2018
Messages
30 (0.01/day)
With the purchase of Xilinx I suspect AMD will start reducing the number of hardware accelerated functions like encode and decode, machine learning, and will replace them with an FPGA section. Perfect for chiplet implementation.
 
Joined
Feb 8, 2020
Messages
19 (0.01/day)
Location
Ottawa, Canada
System Name La Machina
Processor AMD Ryzen 2700
Motherboard ASUS B450 TUF mATX
Cooling EVO 212
Memory Corsair 3200MHz CL16
Video Card(s) RX 560
Storage Some SSD here, some old spinning stuff there
Display(s) 4k Samsung TV and an Asus Pro Art 231
Case Some microatx Antec
Audio Device(s) ASUS Essence STX
Power Supply Seasonic 600W maybe?
I'm loving this idea. I've been thinking about this for a while now. Add a little machine learning hardware to an APU SoC or a GPU, and things could get really interesting for the general use, gaming and GPU-computing markets. Like OGoc said, match this with FPGA tech, and the potential becomes pretty obvious.
 
Joined
Feb 11, 2009
Messages
5,555 (0.96/day)
System Name Cyberline
Processor Intel Core i7 2600k -> 12600k
Motherboard Asus P8P67 LE Rev 3.0 -> Gigabyte Z690 Auros Elite DDR4
Cooling Tuniq Tower 120 -> Custom Watercoolingloop
Memory Corsair (4x2) 8gb 1600mhz -> Crucial (8x2) 16gb 3600mhz
Video Card(s) AMD RX480 -> RX7800XT
Storage Samsung 750 Evo 250gb SSD + WD 1tb x 2 + WD 2tb -> 2tb NVMe SSD
Display(s) Philips 32inch LPF5605H (television) -> Dell S3220DGF
Case antec 600 -> Thermaltake Tenor HTCP case
Audio Device(s) Focusrite 2i4 (USB)
Power Supply Seasonic 620watt 80+ Platinum
Mouse Elecom EX-G
Keyboard Rapoo V700
Software Windows 10 Pro 64bit
Joined
Jul 16, 2016
Messages
300 (0.10/day)
Location
Binghamton, NY
System Name The Final Straw
Processor Intel i7-7700
Motherboard Asus Prime H270M Plus
Cooling Arctic Liquid Freezer II 120
Memory G.Skill 32GB DDR4 2400 - F4-2400C15D
Video Card(s) EVGA GTX 1660 Super SC Ultra 6GB GDDR6
Storage WD Blue SN550 512GB and 1TB M.2 + Seagate 2TB 7200 SATA
Display(s) Acer VG270U P 2k
Case Thermaltake Versa H17
Audio Device(s) HDMI
Power Supply EVGA 750 white
Mouse Logitech
Keyboard Logitech
VR HMD Why?
Software Windows 10
Benchmark Scores 3DMark06 = 33,624 / Fire Strike = 12,690 / Time Spy = 5,465 as of 7/16/2024

Chicklets Machine...
 
Joined
Apr 24, 2020
Messages
2,710 (1.61/day)
With the purchase of Xilinx I suspect AMD will start reducing the number of hardware accelerated functions like encode and decode, machine learning, and will replace them with an FPGA section. Perfect for chiplet implementation.

Those Xilinx FPGAs are VLIW SIMD cores, probably more similarities to a GPU than you might think.

Yeah, there are some LUTs on those FPGAs, but the actual computational girth comes from these babies: https://www.xilinx.com/support/documentation/white_papers/wp506-ai-engine.pdf
 
Joined
Nov 19, 2018
Messages
59 (0.03/day)
I'm sorry, I just don't want them to add yet another reason to raise the prices of these already far too expensive graphics cards. I have 0 excitement for this.
 
Joined
Mar 10, 2010
Messages
11,878 (2.21/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
I would love to see AMD's and Intel's lists of secret-sauce chiplets of the future. This isn't unexpected: what Arm, Apple and many more do with specific hardware, x86 will do with more heavy-hitting but adaptable circuitry.
It would be nice if AMD got a oneAPI-ish thing too.
 
Joined
Jan 3, 2021
Messages
3,501 (2.46/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
You get a chiplet, and you get a chiplet, and you get a chiplet! Chiplets for everyone! WOO!

In all seriousness, this just sounds like AMD doing more of the same thing they've been working towards for years. You have an I/O chiplet, and a CPU chiplet, and soon we'll have GPU chiplets and AI accelerator chiplets. We've already seen that this can scale well, so this should be an exciting prospect for future products. An APU with one of all of the above would be one hell of a chip.
 
Joined
Sep 28, 2012
Messages
980 (0.22/day)
System Name Poor Man's PC
Processor waiting for 9800X3D...
Motherboard MSI B650M Mortar WiFi
Cooling Thermalright Phantom Spirit 120 with Arctic P12 Max fan
Memory 32GB GSkill Flare X5 DDR5 6000Mhz
Video Card(s) XFX Merc 310 Radeon RX 7900 XT
Storage XPG Gammix S70 Blade 2TB + 8 TB WD Ultrastar DC HC320
Display(s) Xiaomi G Pro 27i MiniLED + AOC 22BH2M2
Case Asus A21 Case
Audio Device(s) MPow Air Wireless + Mi Soundbar
Power Supply Enermax Revolution DF 650W Gold
Mouse Logitech MX Anywhere 3
Keyboard Logitech Pro X + Kailh box heavy pale blue switch + Durock stabilizers
VR HMD Meta Quest 2
Benchmark Scores Who need bench when everything already fast?
Apparently the first attempt at an RT implementation didn't go well and AMD is trying to solve it with another "glue". With another bump in cache size and a lean towards agnostic function, I can see wider adoption, not just RT in gaming.
 
Joined
Apr 24, 2020
Messages
2,710 (1.61/day)
Apparently the first attempt at an RT implementation didn't go well and AMD is trying to solve it with another "glue". With another bump in cache size and a lean towards agnostic function, I can see wider adoption, not just RT in gaming.

AMD has had so many patents over the years that I've basically stopped paying attention to patents in general.

Remember "Super ALUs" ?? Yeah, they're not around. AMD decided against them for whatever reason. Maybe it wasn't as good as other techniques they got, or maybe they ran some simulations and it could have made things worse. Just wait for the whitepapers to come out.
 
Joined
Sep 17, 2014
Messages
22,452 (6.03/day)
Location
The Washing Machine
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling Thermalright Peerless Assassin
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
The next AMD meme

MOAR CHIPLUTZ
 
Joined
Jan 8, 2017
Messages
9,436 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Apparently the first attempt at an RT implementation didn't go well and AMD is trying to solve it with another "glue".
This has zero to do with RT.
 

Nkd

Joined
Sep 15, 2007
Messages
364 (0.06/day)
I'm sorry, I just don't want them to add yet another reason to raise the prices of these already far too expensive graphics cards. I have 0 excitement for this.
What are you talking about? This actually makes shit cheaper because you are not making one big fat die like Nvidia, and eventually you are going to need chiplets because you are not going to keep shrinking forever. AMD is just ahead of everyone and has been working towards this for years. You have nothing to worry about lol.

AMD has had so many patents over the years that I've basically stopped paying attention to patents in general.

Remember "Super ALUs" ?? Yeah, they're not around. AMD decided against them for whatever reason. Maybe it wasn't as good as other techniques they got, or maybe they ran some simulations and it could have made things worse. Just wait for the whitepapers to come out.
Clearly this is totally different and fits exactly into their future gameplan. Not all patents are the same, some do have big implications lol.
 
Joined
Mar 10, 2010
Messages
11,878 (2.21/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
What are you talking about? This actually makes shit cheaper because you are not making one big fat die like Nvidia, and eventually you are going to need chiplets because you are not going to keep shrinking forever. AMD is just ahead of everyone and has been working towards this for years. You have nothing to worry about lol.


Clearly this is totally different and fits exactly into their future gameplan. Not all patents are the same, some do have big implications lol.
I agree, but I think the point of chiplets is that they are one of the few ways to make cutting-edge nodes financially viable. As time goes by this is only going to escalate: by 2nm, EUV processing and the increase in mask costs are putting the cost of a complete wafer up considerably, and advanced packaging technology isn't cheaper packaging technology. AMD were ahead of the game, but EMIB does rule some of that gain out. Interesting times, what with others still stuck on monolithic designs, Apple for example.
 
Joined
Sep 13, 2007
Messages
225 (0.04/day)
System Name Suteki Ryzen
Processor AMD Ryzen 3900 @ Stock
Motherboard MSI Tomahawk B450 MAX
Cooling Stock Box Cooler
Memory 32gb ( 2x16gb) 3600mhz DDR4 G.Skill 16-19-19-39-58
Video Card(s) MSI GeForce RTX 2060 6GB Ram
Storage 256gb nvme SSD, 250GB SATA3 SSD, 480GB USB3 SSD, 750GB SSHD, 3TB WD Red HDD
Display(s) Asus VE220 / Epson TW-3000 1080p projector
Case Raijintek Thetis Black
Audio Device(s) Asus Xonar D1 PCI on a PCIe-to-PCI Adapter
Power Supply EVGA 550G2
Mouse Razer DeathAdder 2013
Software Windows 10 Pro
Ohhh... This sounds quite promising! Shame most machine learning frameworks are written for CUDA. I hope that if this comes out, bigger frameworks like TensorFlow or PyTorch make use of it.
 
Joined
Oct 27, 2009
Messages
1,184 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
Ohhh... This sounds quite promising! Shame most machine learning frameworks are written for CUDA. I hope that if this comes out, bigger frameworks like TensorFlow or PyTorch make use of it.
TensorFlow was made by Google for their own TPU hardware. It works on anyone's stuff.
https://github.com/ROCmSoftwarePlatform/tensorflow-upstream Been supported via ROCm for a couple of years now.
PyTorch has been supported since ROCm 3.7, 4.01 is current. https://github.com/aieater/rocm_pytorch_informations

Nvidia's stuff is definitely a bit more plug and play, and AMD's engineering support is just now ramping up; they have a long way to go to catch up.

There are a lot of interesting accelerators on the market now, it's a fun time.
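
For anyone who wants to check which backend their own PyTorch install was built for, something like the sketch below should do. It assumes only that ROCm builds of PyTorch expose the GPU through the usual `torch.cuda` calls and set `torch.version.hip`; the function name `describe_torch_backend` is made up for this example.

```python
def describe_torch_backend() -> str:
    """Report which GPU backend the installed PyTorch build targets, if any."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    # ROCm builds reuse the torch.cuda namespace, so this check covers both.
    if not torch.cuda.is_available():
        return "PyTorch found, but no GPU backend is available"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version:
        return f"ROCm/HIP build {hip_version}"
    return f"CUDA build {torch.version.cuda}"


print(describe_torch_backend())
```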
 
Joined
Sep 28, 2012
Messages
980 (0.22/day)
System Name Poor Man's PC
Processor waiting for 9800X3D...
Motherboard MSI B650M Mortar WiFi
Cooling Thermalright Phantom Spirit 120 with Arctic P12 Max fan
Memory 32GB GSkill Flare X5 DDR5 6000Mhz
Video Card(s) XFX Merc 310 Radeon RX 7900 XT
Storage XPG Gammix S70 Blade 2TB + 8 TB WD Ultrastar DC HC320
Display(s) Xiaomi G Pro 27i MiniLED + AOC 22BH2M2
Case Asus A21 Case
Audio Device(s) MPow Air Wireless + Mi Soundbar
Power Supply Enermax Revolution DF 650W Gold
Mouse Logitech MX Anywhere 3
Keyboard Logitech Pro X + Kailh box heavy pale blue switch + Durock stabilizers
VR HMD Meta Quest 2
Benchmark Scores Who need bench when everything already fast?
This has zero to do with RT.

Bummer, I thought matrix multiplication sounded like a complex version of Fused Multiply Add :D

The design would thus enable AMD to create a chiplet-based machine learning accelerator whose sole function would be to accelerate machine learning - specifically, matrix multiplication
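
The hunch is not far off: a plain matrix multiply is a long chain of multiply-accumulate steps, which is the operation a fused multiply-add performs in one instruction. Here is a minimal pure-Python sketch of that view. The name `matmul_fma` is made up for illustration, and Python rounds the multiply and the add separately, so this mirrors only the structure of the computation, not the single rounding of a hardware FMA.

```python
def matmul_fma(a, b):
    """Multiply matrices a (n x k) and b (k x m) as nested multiply-accumulates."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")

    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                # One multiply-accumulate step: acc = a * b + acc.
                acc = a[i][k] * b[k][j] + acc
            c[i][j] = acc
    return c


print(matmul_fma([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

An n x k by k x m product takes n * m * k of these steps, which is why hardware that performs many of them at once per clock helps so much with machine learning workloads.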
 
Joined
Aug 17, 2017
Messages
274 (0.10/day)
Years ago Intel created the first chiplets, so why didn't they patent the idea then? Maybe they didn't because of that previous do-nothing CEO they had? (I am referring to the CEO who was getting his noodle wet with an employee.)
 