
AMD "Strix Halo" a Large Rectangular BGA Package the Size of an LGA1700 Processor

Joined
May 13, 2008
Messages
762 (0.13/day)
System Name HTPC whhaaaat?
Processor 2600k @ 4500mhz
Motherboard Asus Maximus IV gene-z gen3
Cooling Noctua NH-C14
Memory Gskill Ripjaw 2x4gb
Video Card(s) EVGA 1080 FTW @ 2037/11016
Storage 2x512GB MX100/1x Agility 3 128gb ssds, Seagate 3TB HDD
Display(s) Vizio P 65'' 4k tv
Case Lian Li pc-c50b
Audio Device(s) Denon 3311
Power Supply Corsair 620HX
Moreover, LPDDR5X, along with HBM3, is the most power efficient DRAM type. Opting for GDDR6 would increase system power consumption without a commensurate performance increase.

There we definitely disagree. Samsung GDDR6 can run at 1.1v (as opposed to 1.05v for LPDDR5x; not a huge difference)...and the bandwidth between the two could make a substantial performance difference.

Certainly the difference between playing a game at 1080p60 or not.

Also, if 4 (16Gb/2GB) chips...8GB. That's what, like ~8-12W? I would take that trade-off 100% of the time, personally.

273 GB/s with just LPDDR5X; that's not even enough bandwidth for a desktop 7600 (given the similar cache) alone, not to mention the 7600 was always borderline-acceptable as a contemporary GPU; less so moving forward.
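
Quick napkin math on those numbers (a rough sketch using the commonly quoted 18 Gbps GDDR6 for the desktop 7600; peak figures only):

```python
# Peak bandwidth (GB/s) = data rate (MT/s) x bus width (bits) / 8 bits-per-byte / 1000
def peak_bw_gbs(data_rate_mtps: float, bus_width_bits: int) -> float:
    return data_rate_mtps * bus_width_bits / 8 / 1000

print(peak_bw_gbs(8533, 256))   # ~273 GB/s -- rumored 256-bit LPDDR5X-8533
print(peak_bw_gbs(18000, 128))  # ~288 GB/s -- desktop RX 7600, 128-bit GDDR6 @ 18 Gbps
```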

By all accounts the mobile N33 was a failure (for not meeting even close to that standard); hence why we got the 7600XT (16GB) on desktop; why would they settle for trying to attract the same failed market?

I have to imagine this thing was created to keep pace with (at least) the PS5 (or in laptop GPU terms; at least a 4070 mobile), at least as an *option*.

They could run very low clocks w/o it and that's all fine and good wrt power or competing with Intel...but not when measured against a contemporary discrete GPU; most generally target at least that metric.

Add to this...Navi 32 was never released as a mobile part AFAIK. I wonder why that could be...IMO probably because it would be a more-efficient (if-expensive) option than this.

That's why this part makes very little sense to me without some kind of additional bandwidth option....why create something so large if not to do battle with something like a 4070 mobile (and win)?

Besides the obvious reasons, a 'sideport' (yes, I know it's not exactly the same thing) of GDDR6 would make sense for a host of reasons, some that are in that article from 20 years ago.
 
Joined
Feb 12, 2021
Messages
220 (0.16/day)
And it still has infinity cache. That is neat.

One thing I have been curious about with this path AMD has been going down: why have GPU AI accelerators and a dedicated NPU? Is the space the GPU accelerators take up mostly insignificant? What kind of capability overlap do they have? What makes them unique?

RDNA 3.5 is being used in packages like this and won't be offered on a dedicated discrete card where those AI accelerators make sense, as you'd likely not have an NPU. It seems like if your package has an NPU, you could have designed RDNA 3.5 to not have those AI accelerators at all. But AMD chose to leave them there for a reason. I wonder what that reasoning is.
To my knowledge, the "AI accelerators" in the GPU that you are talking about do not exist as separate hardware; it's simply the GPU processing AI workloads, which has been happening for years.

The NPU (Neural Processing Unit, "AI engine" in Intel speak) is a different thing and a separate unit that may or may not be included in the die design. AMD seems to be making a beeline for all of their "client" (end user) chips to have a separate NPU. They have started with the mobile line because the NPU can save a lot of battery power, being able to do the same job as the GPU or CPU at much lower power consumption. This is not very critical on desktops, and many desktops also have more powerful dedicated GPUs and more powerful CPUs that can handle the AI workload (that MicroShaft etc. envisage), so they can wait a while to get dedicated NPUs in future desktop CPU versions (Zen 6 I expect).
 
Joined
Apr 12, 2013
Messages
7,525 (1.77/day)
273 GB/s with just LPDDR5X; that's not even enough bandwidth for a desktop 7600
But this isn't really meant for desktops, as of now anyway. Think of it as a competitor to the Mx Pro, not the Max or ultra Max uber edition :laugh:
 
Joined
Aug 23, 2013
Messages
581 (0.14/day)
If Valve is working on a Steam Deck Home console variant, I hope this (or something similarly strong) is the chip they use for it.
 
Joined
Jul 29, 2022
Messages
504 (0.59/day)
Why? It'll probably be cheaper and faster to get a 12/16-core Ryzen 9 with a discrete GPU. Especially if you wait for RDNA4.
Well we don't yet know the price so it's too early to tell. However RDNA4 has already been all but confirmed to have RDNA3 perf + improved RT.

As for why I'd rather have this? I have specific needs where a dGPU would be a crutch, and building a new AM5 machine would be significantly more efficient in power draw, thermals, noise, physical space used (I find the current trend for 3 slots + 35cm cards to be absolutely disgusting), peripheral support, and it wouldn't be that much worse in cost either since selling my current AM4 machine would cover a large chunk of the cost.
 
Joined
Jan 3, 2021
Messages
3,484 (2.46/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
What do you mean by split? It has 256-bit LPDDR5X support & "possibly" separate support for GDDR6 ~ you can't split memory interfaces like that IIRC.
Two separate and different memory controllers, LPDDR for the CPU, GDDR for the GPU, each 128 bits wide. I too see it as a possibility, unless we consider all leaks as reliable and accurate (huh). There would be one big downside to that: no unified memory pool.
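
Purely for scale, a sketch of what such a split could mean in peak bandwidth terms (assuming the 8533 MT/s LPDDR5X and 18 Gbps GDDR6 figures floating around this thread; both speculative):

```python
# Hypothetical split: a 128-bit LPDDR5X pool for the CPU, a 128-bit GDDR6 pool for the GPU
def peak_bw_gbs(data_rate_mtps: float, bus_width_bits: int) -> float:
    return data_rate_mtps * bus_width_bits / 8 / 1000

print(peak_bw_gbs(8533, 128))   # ~137 GB/s -- CPU-side LPDDR5X
print(peak_bw_gbs(18000, 128))  # ~288 GB/s -- GPU-side GDDR6 @ 18 Gbps
```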
 
Joined
Nov 26, 2021
Messages
1,645 (1.51/day)
Location
Mississauga, Canada
Processor Ryzen 7 5700X
Motherboard ASUS TUF Gaming X570-PRO (WiFi 6)
Cooling Noctua NH-C14S (two fans)
Memory 2x16GB DDR4 3200
Video Card(s) Reference Vega 64
Storage Intel 665p 1TB, WD Black SN850X 2TB, Crucial MX300 1TB SATA, Samsung 830 256 GB SATA
Display(s) Nixeus NX-EDG27, and Samsung S23A700
Case Fractal Design R5
Power Supply Seasonic PRIME TITANIUM 850W
Mouse Logitech
VR HMD Oculus Rift
Software Windows 11 Pro, and Ubuntu 20.04
There we definitely disagree. Samsung GDDR6 can run at 1.1v (as opposed to 1.05v for LPDDR5x; not a huge difference)...and the bandwidth between the two could make a substantial performance difference.

Certainly the difference between playing a game at 1080p60 or not.

Also, if 4 (16Gb/2GB) chips...8GB. That's what, like ~8-12W? I would take that trade-off 100% of the time, personally.

273 GB/s with just LPDDR5X; that's not even enough bandwidth for a desktop 7600 (given the similar cache) alone, not to mention the 7600 was always borderline-acceptable as a contemporary GPU; less so moving forward.

By all accounts the mobile N33 was a failure (for not meeting even close to that standard); hence why we got the 7600XT (16GB) on desktop; why would they settle for trying to attract the same failed market?

I have to imagine this thing was created to keep pace with (at least) the PS5 (or in laptop GPU terms; at least a 4070 mobile), at least as an *option*.

They could run very low clocks w/o it and that's all fine and good wrt power or competing with Intel...but not when measured against a contemporary discrete GPU; most generally target at least that metric.

Add to this...Navi 32 was never released as a mobile part AFAIK. I wonder why that could be...IMO probably because it would be a more-efficient (if-expensive) option than this.

That's why this part makes very little sense to me without some kind of additional bandwidth option....why create something so large if not to do battle with something like a 4070 mobile (and win)?

Besides the obvious reasons, a 'sideport' (yes, I know it's not exactly the same thing) of GDDR6 would make sense for a host of reasons, some that are in that article from 20 years ago.
DRAM power efficiency is measured in pJ per transferred bit. LPDDR5 and HBM3 are both around 3.5 pJ per bit while GDDR6 is close to 8 pJ per bit. Note that this doesn't account for device power consumption; only transmission power is considered.
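
For a sense of scale at the ~273 GB/s figure being discussed (transmission energy only, as noted; device and background power excluded):

```python
# Link power (W) = bandwidth (GB/s) * 8e9 (bits per GB) * energy (pJ/bit) * 1e-12 (J per pJ)
def link_power_w(bw_gbs: float, pj_per_bit: float) -> float:
    return bw_gbs * 8e9 * pj_per_bit * 1e-12

print(link_power_w(273, 3.5))   # ~7.6 W at an LPDDR5/HBM3-class ~3.5 pJ/bit
print(link_power_w(273, 8.0))   # ~17.5 W at a GDDR6-class ~8 pJ/bit
```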
 
Joined
May 13, 2008
Messages
762 (0.13/day)
System Name HTPC whhaaaat?
Processor 2600k @ 4500mhz
Motherboard Asus Maximus IV gene-z gen3
Cooling Noctua NH-C14
Memory Gskill Ripjaw 2x4gb
Video Card(s) EVGA 1080 FTW @ 2037/11016
Storage 2x512GB MX100/1x Agility 3 128gb ssds, Seagate 3TB HDD
Display(s) Vizio P 65'' 4k tv
Case Lian Li pc-c50b
Audio Device(s) Denon 3311
Power Supply Corsair 620HX
But this isn't really meant for desktops, as of now anyway. Think of it as a competitor to the Mx Pro, not the Max or ultra Max uber edition :laugh:

Right, but that wasn't my point. The point is the laptop version didn't sell because it couldn't reach even that (barely-enough) level of performance of the desktop version.

4070 mobile (really more of a low-clocked 4060 Ti Super) more-or-less can/does, depending on which model you buy and how you use it.

This could (and should) be roughly equal to those without it, and better than all of them with it, as perf is perf.

This could actually be a chip that hits that sweet spot of better than most mobile 4070s and much, much cheaper than a 4080 mobile (which is actually a cut-down 4070 desktop)...literally in the center and good enough for general laptop (or even general [1080p60] PC) gaming...but it needs the BW...which the current LPDDR5X/cache simply cannot provide.

As I say, they can go at it with low clocks and high efficiency, and that's fine (as there is a market for that), but sub-optimal GPU perf is still sub-optimal GPU perf regardless.

I'm saying there is a market for what they COULD do, which IMO is the only reason you specifically make this chip. I'm sure a vanilla option will still look nice wrt power/perf.

The thesis of adding GDDR6 clicked everything into place for me, as I had not even considered that possible.

In reality though, it makes perfect sense (for those willing to use a higher power envelope, just as those who would buy a discrete 80-120 W nVIDIA laptop GPU would do).

I'm simply saying that before, it looked like they were attempting to kill some small birds (other CPUs/SoCs) with a very big stone, because it lacked the bandwidth to push it into competing with a discrete GPU; maybe take some market share from <4070 laptops...but those aren't for (most, non-e-sports) gaming anyway.

Now it would appear they can kill multiple birds with one stone...and some GDDR6. They could/should be able to compete with, if not exceed, the performance of 4070 laptops in a tangible way for less money.

People could actually have a decent 1080p60 SoC laptop.
DRAM power efficiency is measured in pJ per transferred bit. LPDDR5 and HBM3 are both around 3.5 pJ per bit while GDDR6 is close to 8 pJ per bit. Note that this doesn't account for device power consumption; only transmission power is considered.

I assume you are using 1.35v with your metric. Try using 1.1v. I don't know if any current products use Samsung @ 1.1v? I think most use Hynix @ 1.35v (and sometimes substitute in Samsung at 1.35v).
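
Napkin math for what I mean (a very rough first-order estimate; it assumes I/O energy scales with V², which ignores termination and other fixed costs, so it's not a datasheet number):

```python
# Scale the ~8 pJ/bit GDDR6 figure (quoted for ~1.35 V parts) down to a 1.1 V part,
# assuming dynamic (C*V^2) energy dominates -- a simplification, not a spec.
pj_per_bit_1v35 = 8.0
pj_per_bit_1v10 = pj_per_bit_1v35 * (1.10 / 1.35) ** 2
print(round(pj_per_bit_1v10, 1))  # ~5.3 pJ/bit -- better, though still above LPDDR5X's ~3.5
```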

I'm not saying it's MORE efficient, I'm saying it's (potentially) not nearly as bad as you're implying, and the power/performance trade-off would be worth it for someone who was deciding between a productivity machine and a budget gaming laptop in a similar price range; especially versus something like a laptop with a 4070/4080 mobile inside it (which would be much more expensive and/or likely use even more power).

I guess we'll just see how it goes?

It will be interesting to see how such setups without (or conceivably with) GDDR6 match up against competing solutions (both in productivity and gaming; iGPU/7600/4060/4070 laptops). I guess I'm just more optimistic this will be a low-cost, good-enough option for many different kinds of people/markets vs their direct competition regardless of TDP configuration...although it will be interesting to see the power required to achieve parity with a 4060/4070 mobile.

Two separate and different memory controllers, LPDDR for the CPU, GDDR for the GPU, each 128 bits wide. I too see it as a possibility, unless we consider all leaks as reliable and accurate (huh). There would be one big downside to that: no unified memory pool.

That is indeed possible; unified pool or not, there's always conceivably a crossbar/HUMA arrangement.

I messed up though thinking it would be 128+128, not 256+128-bit. I don't know why I subtracted from the 'known' 256-bit LPDDR5x controller to add the possible GDDR6 controller. Whoops. :laugh:

Again, who knows...like you say: leaks and rumors...maybe it could be 128+128 after all. My (perhaps wrong) thinking was that 256-bit was known, but it was unknown that a 128-bit interface could be wired out to GDDR6.

Just making conversation and attempting conceivable projections and their use-cases. Never trying to proclaim infallibility vs what might actually transpire.

The thing I never understood about a 256-bit LPDDR5x controller is...wouldn't that require 4 sticks of ram? That's pretty weird. Not impossible; just unconventional.
 
Joined
Jan 3, 2021
Messages
3,484 (2.46/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
The thing I never understood about a 256-bit LPDDR5x controller is...wouldn't that require 4 sticks of ram? That's pretty weird. Not impossible; just unconventional.
No ... because LPDDR5 doesn't come in sticks but rather in funny car-shaped LPCAMM modules, which almost no one has seen so far anyway. Each of them is 128 bits wide.
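
To illustrate how a 256-bit LPDDR5X bus gets populated (soldered-down packages in practice; the package widths here are just typical examples, not a confirmed layout):

```python
# How many parts it takes to fill a 256-bit LPDDR5X interface (illustrative widths)
bus_width_bits = 256
options = {"LPCAMM module": 128, "x64 LPDDR5X package": 64, "x32 LPDDR5X package": 32}
for part, width in options.items():
    print(f"{bus_width_bits // width} x {part} ({width}-bit each)")
```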
 
Joined
May 3, 2018
Messages
2,881 (1.20/day)
The rumors still say 256-bit 8000 LPDDR5X. Not sure why TPU now says 128-bit because of the socket size.

Edit: Oh, the article is saying 128-bit GDDR6, not LPDDR5X. Now that would be cool.
It's right there in the image: 256-bit 8533 LPDDR5X!

Strix Point (RDNA 3.5) / 890M with 16 CU (in a GPD Duo) has scored in the region of an RTX 3050 in Time Spy.

With more than double the CUs and TDP headroom, it's easily a 4050-4060-5050 competitor.

Strix Point is more for thin laptop/mini PC. Strix Halo would suit desktop a lot more. Within one day you would go shopping for the biggest cooler.
Yes, and the TDP is ~120-130 W, so not too limiting. As long as the chassis isn't stupidly low on volume and can exhaust that heat, it should do really well. It would be a great laptop for photo/video editing and not just gaming, which seems a waste of its potential.

The number of things that use OpenCL to any success these days is dwindling by the day, and ROCm support is a noble attempt but its influence so far on the software market is somewhere between "absolute zero" and "too small to measure". It's why Nvidia is now the most valuable company on earth, bar none. I certainly don't like that fact, but it's the undeniable truth.
That's true, but AMD is working hard on ROCm, and people are saying that in just a year working with LLMs has gone from broken to functional. Hopefully, with AMD becoming more software-focused, they'll throw more resources at ROCm. I still think AMD will get more support for ROCm than Intel will for SYCL.
 
Joined
Sep 8, 2009
Messages
1,077 (0.19/day)
Location
Porto
Processor Ryzen 9 5900X
Motherboard Gigabyte X570 Aorus Pro
Cooling AiO 240mm
Memory 2x 32GB Kingston Fury Beast 3600MHz CL18
Video Card(s) Radeon RX 6900XT Reference (amd.com)
Storage O.S.: 256GB SATA | 2x 1TB SanDisk SSD SATA Data | Games: 1TB Samsung 970 Evo
Display(s) LG 34" UWQHD
Audio Device(s) X-Fi XtremeMusic + Gigaworks SB750 7.1 THX
Power Supply XFX 850W
Mouse Logitech G502 Wireless
VR HMD Lenovo Explorer
Software Windows 10 64bit
As much as I am hyped for Strix Halo, it does make me wonder: why not make an APU that is 60 CU and 12 cores? Or 80 CU and 8 cores? Do we really need 16 cores / 32 threads for gaming? They could easily market a 'gamer 3D APU' that is 8 cores, 60 CU, with 3D cache and Infinity Cache. The mini PC market would go absolutely bonkers.
Well we don't yet know the price so it's too early to tell.



Some seem to be convinced Strix Halo is a gaming-oriented chip going into medium-priced laptops, Mini ATX motherboards and NUCs. It's not.

Strix Halo is a premium chip for premium windows laptops that will compete with the premium MacBook Pro models. It's above all a competitor to the M3 Max and probably M4 Max.

It's a great all-around, no-cut-corners big SoC for laptops: it has a capable GPU for gaming and GPU-accelerated tasks like video/image editors, a whopping 16 Zen 5 cores (no Zen 5c, BTW) for demanding multithreaded tasks like simulation and product development tools, a powerful ~50 TOPS NPU to run generative AI models, and access to a truckload of RAM thanks to its 256-bit width. It's everything at once.

In fact, there are more than 16 Zen 5 cores in the solution. There are an additional 4 Zen 5 LP cores inside the I/O+GPU chip that consume very little power and clock very slowly, but take over OS tasks while the system is idling. It's AMD's answer to Qualcomm's superior power efficiency on low-demand loads, so that these premium Windows laptops get the same 12-16 h battery life on light usage.


So don't count on the full Strix Halo appearing in anything that isn't a premium laptop above $2500. As much as I'd love to see ~$1000 gaming handhelds with a cut-down version of Strix Halo, the chip is going into laptops competing against the $4000 MBP M3 Max.
 
Joined
Jul 7, 2019
Messages
915 (0.47/day)
Only the Threadripper socket could be a candidate for that.
I've been pushing this idea since Ryzen TR first came out; a "Super APU" of some kind built on TR and making use of all that extra space. It would be a great way to add a lot of I/O while also integrating an iGPU capable of some moderate level gaming or rendering/AI. Even if the I/O is truncated down a few lanes and/or downgraded a PCIe generation if slotted with such a theoretical CPU.
 
Joined
Apr 12, 2013
Messages
7,525 (1.77/day)
That's probably coming a couple of gens down the road, when they can afford to sell a massive APU in large numbers. The caveat being whether they can also do unified memory by then; that would probably seal the deal.
 

dafolzey

New Member
Joined
Jul 14, 2024
Messages
1 (0.01/day)
I could see this chip being popular in mid-level gaming laptops like the Legion 5 or maybe the G14. Of course the Nvidia brand still carries a lot of cachet, so high-end products will have GeForce chips - not necessarily just the highest performance, but anything marketed as a premium product like the XPS. Maybe some gaming-oriented mini-PCs. But I doubt there's any reason for there to be a socketed version; it will never be cheaper or higher-performance than traditional build methods.
 
Joined
Feb 12, 2021
Messages
220 (0.16/day)
I've been pushing this idea since Ryzen TR first came out; a "Super APU" of some kind built on TR and making use of all that extra space. It would be a great way to add a lot of I/O while also integrating an iGPU capable of some moderate level gaming or rendering/AI. Even if the I/O is truncated down a few lanes and/or downgraded a PCIe generation if slotted with such a theoretical CPU.
The primary problem there is that EPYC (and thus Threadripper) was designed without any video I/O; the only video output comes from a third-party chip used for secure remote access.

I do not know what difference this would make, whether some of those many traces could be used for video output instead of PCIe, or whether it would require a whole new design. On that note, the current EPYC (and thus Threadripper) socket is physically capable of 12 DDR5 memory channels, restricted on Threadrippers to 8 and 6 channels. FYI, there is also the SP6 socket to look at: it (for now at least) uses the same physical IOD as EPYC and Threadripper chips, but sits on a smaller substrate and socket, with fewer memory channels and PCIe lanes, and is (for now) restricted to 4x 16-core Zen 4c chiplets. This to me is a much closer basis for a new, reduced-cost HEDT platform.

SP6 also shows that AMD is in a phase of serious expansion in all markets, specifically here "lower end servers" (and hopefully low-end Threadrippers) that are currently limited to 64 Zen 4c cores, "only" 6 channels of DDR5, and 96 PCIe 5.0 lanes. Whatever socket Strix Halo uses will be yet another new socket in a short period of time and, IMHO, the start of a whole new family of products all using 256-bit RAM.

As Strix Halo is going to be the first of a new line of products, its success, its pros and cons etc. will all be scrutinised and no doubt tested to the nth degree. People will find some interesting niches for this product, and potential future avenues to aim towards if there's a need to tweak the socket for the 2nd generation (RAM will typically do that), as well as whether Strix Point will be a product that then spawns its own split in product lines: one towards an affordable HEDT/Threadripper, and the other a high-performance SoC that does not require a dedicated GPU. Time will tell, and IMHO Strix Halo is my most anticipated product this year (even if it's delayed yet again and released next year), specifically because it is essentially a whole new class of product. Otherwise, Zen 6 is going to bring a minor revolution at the technical level and highlight technologies to come and the direction of travel at the mass-market desktop level.
 