"Trounce" is a bit extreme, plus, whatever advantages it has they most likely come from the huge caches and not because of the core architecture itself.
That's just for the SoC? Then the memory and package power can easily add a good chunk on top of that 40-60W. Not that it really matters; I just wonder what the point of having an SoC is at this stage.
They didn't say whether it's SoC or package power, but given that the RAM is on the same package I would expect them to be counted together. Either way, how much can it be? If a 256-bit GDDR5 bus is about 20W, I would expect the equivalent in LPDDR5 to be well under 10W, especially mounted on-package like this.
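Napkin math version of that, working backwards from the ~20W GDDR5 figure. The bandwidth and the LPDDR5 efficiency factor below are guesses on my part, not spec numbers:

```cpp
// Back-of-envelope only: derive an implied pJ/bit from the ~20W GDDR5 figure,
// then assume LPDDR5 is roughly 3x more efficient per bit. All inputs are guesses.
#include <cstdio>

int main() {
    const double gddr5_watts    = 20.0;   // the ~20W figure for a 256-bit GDDR5 bus
    const double bandwidth_GBps = 224.0;  // assumed: ~7 Gbps/pin * 256 bits
    const double bits_per_s     = bandwidth_GBps * 1e9 * 8.0;

    const double gddr5_pj_per_bit  = gddr5_watts / bits_per_s * 1e12;  // ~11 pJ/bit
    const double lpddr5_pj_per_bit = gddr5_pj_per_bit / 3.0;           // assumed ~3x better

    printf("Implied GDDR5: ~%.1f pJ/bit\n", gddr5_pj_per_bit);
    printf("LPDDR5 at the same %.0f GB/s: ~%.1f W\n",
           bandwidth_GBps, lpddr5_pj_per_bit * bits_per_s * 1e-12);
    return 0;
}
```

That comes out around 6-7W for the LPDDR5 case, which is why I'd say "way less than 10".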
And how is "trounce" extreme when we're talking about a >50% IPC advantage? And obviously the caches play a huge part in that, especially how they somehow manage much, much lower latencies than everyone else. That doesn't make it any less impressive though.
The 24MB L2 is the P-core LLC. The LLC isn't shared between P and E cores; they each have their own dedicated L2.
[Attachment 221410: die block diagram]
Oh, no, that's not the LLC. The LLC is the SoC-wide L3(ish) that the cores, ML cores, and likely the GPU all have access to, illustrated by that 3*8 block grid to the lower right of the cores in the diagram. If the L2 is 24MB, I would expect the LLC to be far, far larger than that, given its relative size in the diagram. 128MB? 256?
Edit:
the A15 has a 32MB LLC. I'm thinking 256MB now.
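For what that guess is worth, it's just area scaling, something like this. The area ratio and density factor are eyeballed from the diagram, nothing more:

```cpp
// Pure eyeball estimate: scale the known 24MB L2 by the apparent LLC/L2 area ratio.
// Both factors below are guesses from the die diagram, not measurements.
#include <cstdio>

int main() {
    const double l2_capacity_MB = 24.0;  // the stated P-core L2
    const double llc_to_l2_area = 6.0;   // guess: the LLC blocks look ~6x the L2 area
    const double density_factor = 1.5;   // guess: LLC SRAM is usually packed denser

    printf("LLC guess: ~%.0f MB\n", l2_capacity_MB * llc_to_l2_area * density_factor);
    return 0;
}
```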
They also have massive caches, which negates some of the disadvantages of having IF. Besides, they're designed for workstations & servers, so they're made for different kinds of loads.
Maybe, maybe not. We can't do a truly apples-to-apples comparison here without a monolithic AMD APU with a unified memory subsystem ~ that, IMO, is the biggest gamechanger!
AMD & Intel have been talking about unified memory (CPU+GPU) for nearly half a decade now, even longer in AMD's case, & yet Apple is the one that stole the show.
It's a bit strange for you to bring up the Epyc/TR comparison just to then say it's not a valid comparison once people get into why this is likely to be more efficient. LPDDR5 is much lower power than a heap of IF links - but also much lower bandwidth, of course. Apple makes up for this with huge and fast caches, keeping memory accesses to a minimum, while the monolithic architecture and low core counts let them stick to relatively efficient, low-power on-die interconnects.
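To put some (made-up) numbers on the cache point, the effect on DRAM traffic looks roughly like this. The demand bandwidth and hit rates are purely illustrative assumptions:

```cpp
// Toy model: DRAM traffic = demand traffic * LLC miss rate.
// The demand bandwidth and hit rates are illustrative assumptions, not measurements.
#include <cstdio>

int main() {
    const double demand_GBps    = 400.0;  // assumed traffic generated by CPU + GPU
    const double modest_llc_hit = 0.60;   // assumed hit rate with a small LLC
    const double huge_llc_hit   = 0.90;   // assumed hit rate with a huge, fast LLC

    printf("DRAM traffic, modest LLC: ~%.0f GB/s\n", demand_GBps * (1.0 - modest_llc_hit));
    printf("DRAM traffic, huge LLC:   ~%.0f GB/s\n", demand_GBps * (1.0 - huge_llc_hit));
    return 0;
}
```

Cutting the miss traffic like that is what lets a lower-power, lower-bandwidth memory interface keep up.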
What's the difference? Is the memory "truly unified" only if memory access is governed by a single MMU for both CPU and GPU?
Truly unified means everything can access the same data equally, with no copies needed. That is a major performance benefit and a major power saving.
I mean... it's called the PS5 / Xbox Series X.
I'm pretty sure they have unified memory. Hell, CUDA + CPU / OpenCL + CPU have unified memory; it's just emulated over PCIe. The PS5 / Xbox Series X actually have the same literal RAM serving both the iGPU side and the CPU side.
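On the CUDA side, the "unified" programming model looks like this - real API (cudaMallocManaged), but just a toy sketch, and on a discrete GPU it's still page migration over PCIe under the hood, whereas on a shared-memory design there's only ever one copy:

```cpp
// Minimal CUDA managed-memory sketch: one pointer visible to both CPU and GPU.
// On a PCIe dGPU the driver migrates pages behind the scenes ("emulated" unification);
// on a truly unified design the data never moves at all.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void addOne(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1 << 20;

    int *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(int));   // one allocation, no host/device split
    for (int i = 0; i < n; ++i) data[i] = i;     // CPU writes directly

    addOne<<<(n + 255) / 256, 256>>>(data, n);   // GPU works on the same pointer
    cudaDeviceSynchronize();

    printf("data[42] = %d\n", data[42]);         // CPU reads the result, no cudaMemcpy
    cudaFree(data);
    return 0;
}
```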
No, at least MS has explicitly stated how their memory is split between OS/CPU software/GPU.