# Apple M1 Beats Intel "Willow Cove" in Cinebench R23 Single Core Test?



## btarunr (Nov 17, 2020)

Maxon ported its latest Cinebench R23 benchmark to the macOS "Big Sur" Apple M1 platform, and the performance results are groundbreaking. An Apple M1-powered MacBook Pro allegedly scored 1498 points in the single-core Cinebench R23 test, beating the 1382 points of the Core i7-1165G7 reference score as tested by Maxon. The scores were posted to Twitter by an M1 MacBook Pro owner who goes by "@mnloona48_". The M1 chip was clocked at 3.10 GHz for the test, while the i7-1165G7 uses Intel's latest "Willow Cove" CPU cores. The M1 also scores 7508 points in the multi-core test. If these numbers hold up, we can begin to see why Apple chose to dump Intel's x86 machine architecture in favor of its own Arm-powered custom silicon, as the performance on offer holds up against the highest-IPC mobile processors on the market.



 

 





----------



## bonehead123 (Nov 17, 2020)

And so it begins....

waitin to see all the FANBOIS, doomer-gloomers, skeptics, arm-chair analysts etc etc coming out of the woodwork making all sorts of wild, unproven claims, conspiracy theories, etc etc.....


----------



## z1n0x (Nov 17, 2020)

Seems like a great SoC, congratulations to the engineers.
Too bad Apple is such a PoS of a company; I would never spend money on their products.


----------



## ShurikN (Nov 17, 2020)

Was thinking about how that MP ratio of 5x is kinda low considering it's an 8-core CPU. 
But it is a mobile chip, so all-core frequency must be terribly low compared to the short-term single-core burst (like in all of them). Desktop chips usually get near-perfect scaling. 
My 3200G has a 3.94 MP ratio.
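The "MP ratio" being discussed is just the multi-core score divided by the single-core score; a quick sketch using the Cinebench R23 numbers from the article (7508 multi, 1498 single):

```python
# MP ratio = multi-core score / single-core score.
# Scores are the Cinebench R23 numbers quoted in the article for the M1.

def mp_ratio(multi_core: float, single_core: float) -> float:
    """Multi-core scaling factor relative to a single core."""
    return multi_core / single_core

print(round(mp_ratio(7508, 1498), 2))  # ~5.01 for the M1
```

An ideal 8-core chip would approach 8x, which is why a ~5x ratio looks low until the 4+4 big/little split comes up later in the thread.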


----------



## Steevo (Nov 17, 2020)

bonehead123 said:


> And so it begins....
> 
> waitin to see all the FANBOIS, doomer-gloomers, skeptics, arm-chair analysts etc etc coming out of the woodwork making all sorts of wild, unproven claims, conspiracy theories, etc etc.....



There is nothing so tiring as expectation postponed.

Also, good for Apple, they are aiming at content creation and seem to have built a CPU that does it very well. 

I'm interested to see other benchmarks.


----------



## Aquinus (Nov 17, 2020)

ShurikN said:


> Was thinking about how that MP ratio of 5x is kinda low considering it's an 8-core CPU.


Probably the part where 4 of the cores are slower low-power cores that don't clock as high as the other 4. Honestly, I'm not surprised. The numbers are pretty good, but that multithreaded score still isn't quite where my 9880H is at; then again, that's saying something for their low-end chip.


----------



## ShurikN (Nov 17, 2020)

Aquinus said:


> Probably the part where 4 of the cores are slower low-power cores that don't clock as high as the other 4. Honestly, I'm not surprised. The numbers are pretty good, but that multithreaded score still isn't quite where my 9880H is at; then again, that's saying something for their low-end chip.


Ahh yes, it has a big.LITTLE core arrangement, I forgot about that. Makes sense.


----------



## Vya Domus (Nov 17, 2020)

ShurikN said:


> Was thinking about how that MP ratio of 5x is kinda low considering it's an 8-core CPU.
> But it is a mobile chip, so all-core frequency must be terribly low compared to the short-term single-core burst (like in all of them). Desktop chips usually get near-perfect scaling.
> My 3200G has a 3.94 MP ratio.



4 of the cores are much narrower and likely lower clocked as well. Apple's chips have always had terrible MT scaling for that reason; 5x scaling means those cores are really dog slow.

Remember those claims about efficiency and performance per watt? Yeah, it's because of those small cores.

Oh and they're on 5nm.


----------



## ebivan (Nov 17, 2020)

Looks promising. 

But didn't Apple promise *"3.5x faster"*? So I was expecting single-core performance somewhere in the 4000-5000 range and multi-core somewhere above 20,000...?


----------



## birdie (Nov 17, 2020)

It's not even about performance per se. In terms of performance per watt Apple M1 is leaps and bounds better than both TGL and Zen 3.


----------



## Steevo (Nov 17, 2020)

*Intel And AMD x86 Mobility CPUs Destroy Apple's M1 In Cinebench R23 Benchmark Results* (wccftech.com)

> Apple recently made some marketing claims using a 5nm processor against a 4-year-old architecture and we were waiting for benchmarks to appear that we can use to do some solid comparisons. Earlier today, the single-core and multi-core scores in the latest Cinebench R23 have leaked out and boy is...




Then again this site says different. What's the difference?

Also, in the same image, when the Intel chip was clocked at 2.8 GHz vs the Apple chip's 3.1 GHz, the Intel chip won by 34 points despite running 300 MHz slower per core.
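Normalizing those numbers to points per GHz makes the per-clock comparison concrete (a rough sketch: the 1532 Intel score is implied by "won by 34 points" over the M1's 1498, and the clocks are as stated in the post):

```python
# Per-clock throughput: Cinebench points divided by clock speed in GHz.
# 1532 is the Intel score implied by a 34-point win over the M1's 1498.

def points_per_ghz(score: float, ghz: float) -> float:
    return score / ghz

print(round(points_per_ghz(1532, 2.8)))  # ~547 for the i7-1165G7 at 2.8 GHz
print(round(points_per_ghz(1498, 3.1)))  # ~483 for the M1 at 3.1 GHz
```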


----------



## AnarchoPrimitiv (Nov 17, 2020)

birdie said:


> It's not even about performance per se. In terms of performance per watt Apple M1 is leaps and bounds better than both TGL and Zen 3.



Some of that is due to the node advantage, just as AMD's 7nm is more efficient than Intel's 14nm.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Steevo said:


> Intel And AMD x86 Mobility CPUs Destroy Apple's M1 In Cinebench R23 Benchmark Results
> 
> 
> Apple recently made some marketing claims using a 5nm processor against a 4-year old architecture and we were waiting for benchmarks to appear that we can use to do some solid comparisons. Earlier today, the single-core and multi-core scores in the latest Cinebench R23 have leaked out and boy is...
> ...


Single core versus multi core.
And on multiple cores the Intel and AMD mobile chips win out easily, despite a one- or two-node advantage for Apple.

Hype train derailment in action, fastest mobile CPU my ass.


----------



## Rahnak (Nov 17, 2020)

ebivan said:


> Looks promising.
> 
> But didn't Apple promise *"3.5x faster"*? So I was expecting single-core performance somewhere in the 4000-5000 range and multi-core somewhere above 20,000...?



Here's the asterisk on the 3.5x faster performance, straight from Apple:

Testing conducted by Apple in October 2020 using preproduction MacBook Air systems with Apple M1 chip and 8-core GPU, as well as production 1.2GHz quad-core Intel Core i7-based MacBook Air systems, all configured with 16GB RAM and 2TB SSD. Tested with prerelease Final Cut Pro 10.5 using a 55-second clip with 4K Apple ProRes RAW media, at 4096x2160 resolution and 59.94 frames per second, transcoded to Apple ProRes 422. Performance tests are conducted using specific computer systems and reflect the approximate performance of MacBook Air.

A single test.


----------



## Frick (Nov 17, 2020)

theoneandonlymrk said:


> Single core versus multi core.
> And on multiple cores the Intel and AMD mobile chips win out easily, despite a one- or two-node advantage for Apple.
> 
> Hype train derailment in action, fastest mobile CPU my ass.



Perf/watt is where the gold (supposedly) is. Being the fastest isn't interesting, it's how fast you can be within a specific power envelope, and these are supposed to really shine there.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Frick said:


> Perf/watt is where the gold (supposedly) is. Being the fastest isn't interesting, it's how fast you can be within a specific power envelope, and these are supposed to really shine there.


Obviously, but then they're not going to replace systems for those who need more power, like some are implying.
And being the fastest is exactly what Apple claimed just days ago, without proof.


----------



## dyonoctis (Nov 17, 2020)

theoneandonlymrk said:


> Single core versus multi core.
> And on multiple cores the Intel and AMD mobile chips win out easily, despite a one- or two-node advantage for Apple.
> 
> Hype train derailment in action, fastest mobile CPU my ass.


For a 10W SoC with an iGPU that's faster than a 1050 Ti, those are still nice numbers. The Ryzen 7 5800U might be able to match the single-core CPU perf at 15W, but I don't know if RDNA2 is going to be as efficient.

(and Apple did say fastest single core for a low-power CPU*; Tiger Lake can be either a 28W or a 12W part)


----------



## Vya Domus (Nov 17, 2020)

dyonoctis said:


> i don't know if rdna2 is going to be as efficient.



RDNA1 is about 50% more efficient than Vega, and RDNA2 is 50% more efficient than RDNA1.

Yeah, it's going to be much, much more efficient. Regardless, efficiency in GPUs isn't that impressive, because you can always make a really wide, low-clocked GPU, which is what Apple did. Vega iGPUs run at over 2 GHz; that's actually way, way more impressive than anything else.


----------



## Endeavour (Nov 17, 2020)

theoneandonlymrk said:


> Single core versus multi core.
> And on multiple cores the Intel and AMD mobile chips win out easily, despite a one- or two-node advantage for Apple.
> 
> Hype train derailment in action, fastest mobile CPU my ass.


You are forgetting about a couple of things:
- These are low-power CPUs. The new M1 MacBook Air is fanless and runs cooler than the previous Intel chip they used, which btw was way slower.
- The GPU is also very nice. First benchmarks put it above Intel Xe, MX350 and anything that AMD has in that power envelope.

It also has hardware acceleration for many video codecs, apparently the M1 is as fast as a Mac Pro transcoding video.
Clearly a big win for Apple.


----------



## TheoneandonlyMrK (Nov 17, 2020)

dyonoctis said:


> For a 10W SoC with an iGPU that's faster than a 1050 Ti, those are still nice numbers. The Ryzen 7 5800U might be able to match the single-core CPU perf at 15W, but I don't know if RDNA2 is going to be as efficient.
> 
> (and Apple did say fastest single core for a low-power CPU*; Tiger Lake can be either a 28W or a 12W part)


So look at your argument.

"Our high performance core is the world's fastest CPU core"

They have forgotten your caveats.
And I doubt RDNA2 isn't more efficient.

Looking at it your way, the M1 can equal the better chips with ONE core but gets beaten using multiple cores. Hmnn, that power limit that's never going away on the latest node precludes them ever competing on multiple cores, and where is software headed?
The GPU is unproven and 100% unicorn until we see some comparable in-game benches; let's see some AAA CoD or Fortnite running on it, err hmnn.


----------



## techisfun (Nov 17, 2020)

The M1 does appear to have the most efficient cores, as Apple said.

In terms of efficiency, I doubt PC hardware will be able to catch up to Apple Silicon hardware. PC hardware is far less integrated by design, which makes it less efficient.


----------



## Caring1 (Nov 17, 2020)

techisfun said:


> The M1 does appear to have the most efficient cores, as Apple said.
> 
> In terms of efficiency, I doubt PC hardware will be able to catch up to Apple Silicon hardware. PC hardware is far less integrated by design, which makes it less efficient.


Are you saying Apple doesn't make Personal Computers?


----------



## Chrispy_ (Nov 17, 2020)

If the M1 does anything, it might finally get Microsoft to hurry the hell up and ditch all the legacy crap that is bogging down x86 Windows.

With x86 and Windows playing a chicken-and-egg game, it's never going to become a more streamlined architecture until someone takes the first step. AMD and Intel can't afford to cull features in hardware until they are dropped from the OS, and Microsoft is still unwilling to completely let go of 32-bit OS even in this day and age....


----------



## TheoneandonlyMrK (Nov 17, 2020)

Caring1 said:


> Are you saying Apple doesn't make Personal Computers?


"What's a personal computer"

That's a line I remember from an apple advertisement.

They make apple's not PC's.


----------



## dyonoctis (Nov 17, 2020)

theoneandonlymrk said:


> So look at your argument.
> 
> "Our high performance core is the world's fastest CPU core"
> 
> ...


True. I tend to forget that this isn't a classic 8-core, but 4+4. So there's a bit of "cheating" there. It's a design philosophy that works well for those machines, but AMD has already said that they are not interested in a hybrid design.

Well, on paper AMD has the tech to make a great SoC: Zen 3 cores (low-power Zen 2 was already impressive in its own right), RDNA 2 with AV1 acceleration... Machine learning for the consumer is the thing they don't do yet, but it's a whole mess on Windows (Intel with their odd Deep Link, and Nvidia tensor cores are only for "beefy" laptops). But then there is the software side. OpenCL being what it is, CUDA is the only thing close to the Metal API on Windows. Adobe and Nvidia have a love story where AMD is a third wheel at best.


----------



## Valantar (Nov 17, 2020)

Impressive, especially considering R23 is a continuous, throttling-inducing load designed to bypass short-term performance boosts. 3.1GHz isn't particularly impressive, though the IPC on show clearly demonstrates that at least in some workloads you can make up for clock speeds through a wider core. AnandTech's recent A14 article does show how Apple is managing some things that current X86 designs aren't even close to in terms of architectural width and caches, so it'll be really interesting to see how this in turn affects future X86 development.

Also, of course, it kind of demonstrates what can happen if you give a chip development company unlimited resources. No wonder they're doing things nobody else can.


----------



## OGoc (Nov 17, 2020)

The top score in the screenshot is an Intel i7-1165G7 with a single-core score of 1532.


----------



## TheoneandonlyMrK (Nov 17, 2020)

dyonoctis said:


> True. I tend to forget that this isn't a classic 8-core, but 4+4. So there's a bit of "cheating" there. It's a design philosophy that works well for those machines, but AMD has already said that they are not interested in a hybrid design.
> 
> Well, on paper AMD has the tech to make a great SoC: Zen 3 cores (low-power Zen 2 was already impressive in its own right), RDNA 2 with AV1 acceleration... Machine learning for the consumer is the thing they don't do yet, but it's a whole mess on Windows (Intel with their odd Deep Link, and Nvidia tensor cores are only for "beefy" laptops). But then there is the software side. OpenCL being what it is, CUDA is the only thing close to the Metal API on Windows. Adobe and Nvidia have a love story where AMD is a third wheel at best.


Well, progress is finally being made on your latter points, with Windows now supporting OpenCL and OpenGL.
The GPU can be used for ML, though that will remain an advantage for the M1's dedicated units; both Intel with oneAPI and AMD with God knows what (they do have on-chip ML on Ryzen already, for process optimization) could, tbf, incorporate better AI and ML hardware, catching up to the M1's main advantage.


----------



## Vya Domus (Nov 17, 2020)

Valantar said:


> Also, of course, it kind of demonstrates what can happen if you give a chip development company unlimited resources.



Really? Apparently everyone said the same about Intel, that they also have an unlimited budget.



Valantar said:


> No wonder they're doing things nobody else can.



They are doing things nobody else has an interest in doing. The segments in which Apple and Intel/AMD operate actually have little overlap: Apple is basically an exclusively mobile silicon company, and Intel/AMD are ... not. Their designs are first and foremost meant for servers, where the goal is to fit as much compute as possible on a single package; what we get on desktops and even on mobile is a cut-down version of whatever that is, and at their core these are basically the same architectures, just configured differently.


----------



## DeathtoGnomes (Nov 17, 2020)

"Way to go Dall..uhhh Apple"

Not surprised here, Apple seems on course to truly separate itself. Next up, their own discrete card? Maybe?


----------



## Frick (Nov 17, 2020)

Vya Domus said:


> Really? Apparently everyone said the same about Intel, that they also have an unlimited budget.



I would add "and not being beholden to an open ecosystem burdened by legacy stuff" to that.



theoneandonlymrk said:


> Obviously, but then they're not going to replace systems for those who need more power, like some are implying.
> And being the fastest is exactly what Apple claimed just days ago, without proof.



I will never understand why people take PR statements at face value. "Fastest" obviously comes with a fistful of asterisks.


----------



## Ashtr1x (Nov 17, 2020)

Chrispy_ said:


> If the M1 does anything, it might finally get Microsoft to hurry the hell up and ditch all the legacy crap that is bogging down x86 Windows.
> 
> With x86 and Windows playing a chicken-and-egg game, it's never going to become a more streamlined architecture until someone takes the first step. AMD and Intel can't afford to cull features in hardware until they are dropped from the OS, and Microsoft is still unwilling to completely let go of 32-bit OS even in this day and age....



Well, the world is not always a locked-down, gated BS environment, right? So they are taking their time to do it, piece by piece. Making the OS look and feel like a mobile OS, then removing power features like Control Panel, then adding a UWP store with UWP drivers and MSIX to make exes go away; same stuff, different method. Add the WaaS model: feels amazing, the latest and greatest right on your desktop every day, with a big OS release every 6 months. Wonderful, right? Versus the old crappy Win7 RTM release, which doesn't even need any sort of NT kernel updates to run 2020 software.

"Legacy Crap bogging down Windows"
Wonder how the legacy crap is bogging the windows. Windows got it's strength from the software compatibility not gated bs like Apple. At Apple utopia, all is locked down only passed when Apple almighty says so. Do you even realize how big 32bit axe would be for the Applications ? Win10 32bit is already being phased out and so are GPU drivers, the Application support is very important for an OS to be very robust and non user restrictive. Did Linux got bogged down by all Legacy crap ?

M1 won't do anything; Apple users will buy their BGA-riddled walled-garden Macs no matter what, and Windows machines will be sold as they are. People need GPUs and an OS supporting their software requirements; until Apple catches up to AMD or Nvidia, that day is not going to come.


----------



## TheLostSwede (Nov 17, 2020)

Luckily we now have proper tests.
Yes, it's very fast in single-threaded workloads; no, it's not what Apple claims overall.
No, you still don't want to game on Apple hardware.

*The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To The Test* (www.anandtech.com)


----------



## Valantar (Nov 17, 2020)

Vya Domus said:


> Really ? Apparently everyone said the same about Intel, that they also have an unlimited budget.


Apple is AFAIK the highest-valued company on the planet, and the one with the biggest cash hoard too. Intel has never been even close to that. So while Intel's R&D budgets might have been "unlimited" in terms of the tech industry at the time, Apple's R&D budgets are likely only limited by how many people it's possible for them to hire, how many concurrent projects it's possible for them to run, and what they are interested in doing.



Vya Domus said:


> They are doing things nobody else have an interest in doing. The segments in which Apple and Intel/AMD operate have little overlap actually, Apple is basically an exclusively mobile silicon company and Intel/AMD are ... not. Their designs are first and foremost meant for servers where the goal is to fit as much compute as possible on a single package, what we get on desktops and even on mobile is a cut down version of whatever that is and at core are basically the same architectures just configured differently.


Intentionally or not, you are completely misunderstanding what I said. I was referring to the A14/M1 Firestorm "big core" microarchitecture and its features, and not mobile-specific features at that, just ones that massively increase the throughput of the core. An 8-wide decode block compared to 4-wide in x86; a re-order buffer 2-3x the size of Intel and AMD's newest architectures, 4x/2x the FP/clock throughput of current Intel architectures/Zen 3, dramatically deeper load/store queues than any other architecture, L1 caches *6x* the size of current X86 architectures (seemingly without a latency penalty!) ....and the list goes on. These aren't mobile-specific features, these are common features across all modern CPU architectures. And somehow Apple is able to massively beat the competition on several fronts, doing things that seems to be impossible for the others. Some of it can be blamed on the X86 ISA (decode block width, for example), but not everything. And if you're trying to tell me that AMD and Intel could grow their L1 caches 6x without increasing latency by a single cycle, yet are choosing not to, then you need to provide some proof for that, because that's an outlandish thing to even suggest. If AMD and/or Intel could do what Apple is doing here to increase IPC without tanking performance through either killing power efficiency, introducing massive latency, etc., they would very clearly do so.

The thing is, this design is likely to scale _extremely_ well for servers and workstations, as the clock speeds they are reaching are the same as the ones hit by current top-end multi-core server chips, just at much lower power. They'd obviously need to tweak the architecture in various ways and find a way to couple together more than 4+4 cores efficiently without introducing bottlenecks or massive latency, but ... given what they've already done, that should be doable. Whatever silicon Apple makes for the next Mac Pro, it's looking like it'll be extremely impressive.



TheLostSwede said:


> Luckily we now have proper tests.
> Yes, it's very fast in single threaded workloads, no it's not what Apple claims overall.
> No, you still don't want to game on Apple hardware.
> 
> ...


I would really, really like to see what RotTR performance would look like if it was compiled for this architecture rather than run through a translation layer. Even if gaming performance is lacklustre overall, that is _damn_ impressive.


----------



## ZoneDymo (Nov 17, 2020)

Interesting comments to read, so passionate.

Personally I could not care less; by their own design/philosophy I don't have anything to do with Apple and probably never will.
They could make a low-power mobile chip that is as fast as an RTX 3090 and it would not affect me in the slightest; I have nothing to do with Apple.


----------



## Bruno Vieira (Nov 17, 2020)

There is an AnandTech review up, and it's very, very close.

*The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To The Test* (www.anandtech.com)


----------



## Easy Rhino (Nov 17, 2020)

It not only outperforms Intel by a large margin, it does it with massive power efficiency. And just a note to everyone... don't choose sides. If you think Apple is any worse than Google or Microsoft as far as culture goes, then you are sorely mistaken. They are all equally terrible.


----------



## phanbuey (Nov 17, 2020)

Valantar said:


> I would really, really like to see what RotTR performance would look like if it was compiled for this architecture rather than run through a translation layer. Even if gaming performance is lacklustre overall, that is _damn_ impressive.



Agreed ...

This is version 1 of a new/unsupported arch, in some instances BEATING the next-gen best. For a first attempt, it's quite insane how fast this is.


----------



## Vya Domus (Nov 17, 2020)

Valantar said:


> An 8-wide decode block compared to 4-wide in x86; a re-order buffer 2-3x the size of Intel and AMD's newest architectures, 4x/2x the FP/clock throughput of current Intel architectures/Zen 3, dramatically deeper load/store queues than any other architecture, L1 caches *6x* the size of current X86 architectures (seemingly without a latency penalty!) ....and the list goes on.



All of which make it horrendously inefficient in terms of area and transistor budget in order to extract the same performance Intel and AMD do with much smaller cores, and on a bigger node by the way; am I really the only one noticing that? There is a very good reason practically nobody is making cores this wide: it's a scalability dead end, everyone figured this out in the late 90s.



Valantar said:


> And if you're trying to tell me that AMD and Intel could grow their L1 caches 6x without increasing latency by a single cycle, yet are choosing not to, then you need to provide some proof for that, because that's an outlandish thing to even suggest. If AMD and/or Intel could do what Apple is doing here to increase IPC without tanking performance through either killing power efficiency, introducing massive latency, etc., they would very clearly do so.



Of course they can, and they will gradually make wider cores. The reason Apple can use a 128KB cache (which is 4 times larger, not 6, and that is the L1 data cache, not the instruction cache) is because they use a minimum 16KB page and not 4KB, hence a cache that is 4 times larger with 8-way associativity. That's all there is to it, and I don't have to explain why a cache that is 4 times bigger with the same associativity is pretty terrible and inefficient. I have no idea why everyone thinks Apple is using some sort of magic fairy dust to make these things.

Edit: I got confused, that was for a 128KB cache; for a 192KB one that is 6 times larger, it's basically the same explanation: they can do it because of the 16KB page.
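The page-size argument above can be sketched as plain arithmetic: a virtually-indexed, physically-tagged (VIPT) L1 that avoids aliasing can hold at most one page per way, so its maximum size is page size times associativity (this is the standard textbook constraint being described, not anything Apple has documented):

```python
# Maximum aliasing-free VIPT L1 size = page size * associativity,
# because the cache index bits must fit within the page offset.

def max_vipt_l1_bytes(page_size: int, ways: int) -> int:
    return page_size * ways

print(max_vipt_l1_bytes(4 * 1024, 8))   # 32768  -> the classic x86 32 KB L1
print(max_vipt_l1_bytes(16 * 1024, 8))  # 131072 -> 128 KB with 16 KB pages
```

The same 16 KB pages with a 12-way design would allow 192 KB; the associativity figure here is purely illustrative.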


----------



## TheLostSwede (Nov 17, 2020)

phanbuey said:


> Agreed ...
> 
> This is version 1 of a new/unsupported arch, in some instances BEATING the next-gen best. For a first attempt, it's quite insane how fast this is.


It's easy for Apple to squeeze out some extra performance though, as they control the compilers, the OS, the driver layer, the drivers and now the hardware. 
No other company in the world has a complete in-house platform that can be tuned to this degree for optimal performance.
It gives them what one could almost call an unfair advantage. 

However, we still have to see them scale this, as right now we're looking at an iPad with a keyboard, on steroids. 
This is not going to be a hardware solution that will be competitive in all aspects and so far we're barely scratching the surface, as all the benchmarks so far are somewhat limited.
Single core performance is no longer as important as it once was and judging by the benchmarks, it's running into problems keeping up once we go beyond the performance cores.

Not saying Apple did a bad job, I'm just not buying into all the hype, as Apple clearly oversold this when it was announced.
Yes, it's going to be good enough of a work computer for a lot of people, but it's clearly not for everyone.


----------



## Valantar (Nov 17, 2020)

Vya Domus said:


> All of which make it horrendously inefficient in terms of area and transistor budget in order to extract the same performance Intel and AMD do with much smaller cores, and on a bigger node by the way; am I really the only one noticing that? There is a very good reason practically nobody is making cores this wide: it's a scalability dead end, everyone figured this out in the late 90s.


Nope, not the only one noticing that, and there's no doubt that these chips are really expensive compared to the competition. Sizeable silicon, high transistor counts, and a very expensive node should make for quite a high silicon cost. There's another factor to the equation though: Intel (not really, but not _that_ far behind) and AMD are delivering the same performance with much smaller cores and on a bigger node _but with several times the power consumption_. That's a notable difference. Of course comparing a ~25W mobile chip to a 105W desktop chip is an unfair metric of efficiency, but even if Renoir is really efficient for X86, this still beats it.


Vya Domus said:


> Of course they can and they will gradually make wider cores. The reason Apple can use a 128KB cache is because they use a minimum 16KB page and not 4KB, hence a cache that is 6 times larger with 8-way associativity; that's all there is to it, and I don't have to explain why a cache that is 6 times bigger with the same associativity is pretty terrible and inefficient. I have no idea why everyone thinks Apple is using some sort of magic fairy dust to make these things.


Intel grew their L1 cache from 32k to 48k with Ice Lake, which caused its latency to increase from 4 to 5 cycles. Apple manages a cache 3x the size with 3/5 the latency. Regardless of associativity, Apple is managing something that nobody else is. Also, if it's such an inefficient design, how come they're beating every other architecture in efficiency, even when factoring in the node advantage? One would expect a "pretty terrible and inefficient" L1 cache to be rather harmful to overall SoC efficiency, right?

As for making wider cores: yes, they likely will, but this is actually an area where x86 is a real problem. To quote AT:


			
				AnandTech said:
			
		

> Other contemporary designs such as AMD’s Zen(1 through 3) and Intel’s µarch’s, x86 CPUs today still only feature a 4-wide decoder designs (Intel is 1+4) that is seemingly limited from going wider at this point in time due to the ISA’s inherent variable instruction length nature, making designing decoders that are able to deal with aspect of the architecture more difficult compared to the ARM ISA’s fixed-length instructions.


So they can, but they would need to take a significant efficiency penalty, or find some way to mitigate this.


----------



## Punkenjoy (Nov 17, 2020)

The performance of the M1 on 5 nanometer is where I would expect it to be. I don't find it game-breaking, and they have already implemented most of the tricks that x86 uses. The architecture is getting mature, and they aren't really getting way more performance than x86. 

Modern processors are just so much more complex than the instruction set alone that in the end it does not really matter, at least for pure performance. For low power, x86 still seems to have a bit more overhead...

But the thing is, ARM is getting more powerful by using more transistors and more power. It's not the 4-watt CPU in your phone that is doing that...

I am not an Apple fan, but I am glad they made something powerful, because we need competition. AMD is starting to compete again, and there is a lot more performance gain each year than when Intel and Nvidia had zero competition. 

Good stuff indeed, good stuff...


----------



## Chrispy_ (Nov 17, 2020)

TheLostSwede said:


> It's easy for Apple to squeeze out some extra performance though, as they control the compilers, the OS, the driver layer, the drivers and now the hardware.


This is an important point that I think people are overlooking. 

The closest thing to compare that to in the PC space is game consoles vs PC gaming; Last Gen XBox hardware is pitiful by modern standards but if you took the equivalent Radeon R7 260 DDR3 that the XBox One has in it and tried to run the PC version of an Xbox One game on that R7 260, you'd be greeted by a low-res, low-quality slideshow. 

Meanwhile the same hardware in the XBone is getting 1080p30 with improved graphics quality. That's the power of optimising and compiling for a single-purpose, single-spec platform.


----------



## Nordic (Nov 17, 2020)

I really want to see what they can pull off in a 30W, 60W, and 90W package. This is some cool stuff.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Chrispy_ said:


> This is an important point that I think people are overlooking.
> 
> The closest thing to compare that to in the PC space is game consoles vs PC gaming; Last Gen XBox hardware is pitiful by modern standards but if you took the equivalent Radeon R7 260 DDR3 that the XBox One has in it and tried to run the PC version of an Xbox One game on that R7 260, you'd be greeted by a low-res, low-quality slideshow.
> 
> Meanwhile the same hardware in the XBone is getting 1080p30 with improved graphics quality. That's the power of optimising and compiling for a single-purpose, single-spec platform.


Exactly, and the 10 watts the M1 pulls is also shared across the entire SoC, so CPU and GPU performance will be heavily compromised during actual gaming, for example, or any heavy use of both at the same time. Where's the performance then?


----------



## Vya Domus (Nov 17, 2020)

Valantar said:


> Also, if it's such an inefficient design, how come they're beating every other architecture in efficiency, even when factoring in the node advantage? One would expect a "pretty terrible and inefficient" L1 cache to be rather harmful to overall SoC efficiency, right?



Clock speed and voltage, it's that simple. L1 caches basically have to run close to the clock speed of the core, so their power scales right along with it (badly). That said, at 3 GHz it's kept in check, for now. It's still inefficient, though, considering the performance gains from having up to 6 times more memory; you can bet the hit rate didn't go up by 600%. That's why L3 caches have grown so much larger over the years: they don't have to scale alongside the cores themselves. It's also why large L1 caches are avoided like the plague, even outside x86.
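The diminishing-returns argument can be sketched with the classic power-law rule of thumb for caches (miss rate roughly proportional to 1/√capacity); the baseline numbers below are invented for illustration, not measurements of any real core.

```python
# Illustrative only: the widely cited "square-root" rule of thumb for caches,
# miss_rate ~ capacity^(-0.5). Baseline (32 KB at a 5% miss rate) is made up.

def miss_rate(size_kb, base_miss=0.05, base_kb=32):
    return base_miss * (base_kb / size_kb) ** 0.5

for kb in (32, 64, 128, 192):
    print(f"{kb:3d} KB: ~{miss_rate(kb):.2%} misses")

# Growing L1 6x (32 KB -> 192 KB) cuts the miss rate only ~2.4x under this
# model -- nowhere near a 600% improvement, matching the point above.
```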



Valantar said:


> There's another factor to the equation though: Intel (not really, but not _that_ far behind) and AMD are delivering the same performance with much smaller cores and on a bigger node _but with several times the power consumption_.



Of course because again, Intel and AMD design their cores with the goal of being able to fit as many as possible on a single package, hence smaller but higher clocked cores which are also less power efficient.

But there is another company that proves my point more than anyone else: ARM themselves. They're also close to extracting similar performance out of cores that are even smaller and consume far less power and area than Apple's. I'll maintain my opinion that Apple's approach is the wrong one long term.


----------



## TheLostSwede (Nov 17, 2020)

theoneandonlymrk said:


> Exactly, and the 10 watts the M1 pulls is also shared across the entire SoC, so CPU and GPU performance will be heavily compromised during actual gaming, for example, or any heavy use of both at the same time. Where's the performance then?


Let's also not forget that the memory is shared between all the parts inside the SoC, which might affect performance negatively in some scenarios as well.



Nordic said:


> I really want to see what they can pull off in a 30 W, 60 W, and 90 W package. This is some cool stuff.


Why would Apple ever go as high as 90W? We might see some 15-25W parts next, but I doubt we'll ever see anything in the 90W range from Apple.


----------



## hurakura (Nov 17, 2020)

The marketing is strong in this one


----------



## Sandbo (Nov 17, 2020)

Ashtr1x said:


> *M1 won't do anything*, Apple users will buy their BGA riddled walled garden Macs no matter what, and Windows machines will be sold as they are, people need GPUs and OS supporting their Software requirements, until Apple catches up to AMD or Nvidia that day is not going to come.



Just to share another side of the story: I have ordered an M1 MacBook Pro as my first Mac, having otherwise used Windows/Ubuntu at home and at work for almost three decades.
What interested me is its promising performance (if software transitions well), coupled with impressive-looking battery life (which, granted, still needs to be tested).
While I need my GPU to play ray-traced games, that's on my desktop; my laptop can be something else, and now I am motivated to give this a shot.



TheLostSwede said:


> Let's also not forget that the memory is shared between all the parts inside the SoC, which might affect performance negatively in some scenarios as well.
> 
> 
> Why would Apple ever go as high as 90W? We might see some 15-25W parts next, but I doubt we'll ever see anything in the 90W range from Apple.


Maybe one day they scale up M1 to something that can compete with a 5950X?
Not sure if it is technically possible with their design, but ARM definitely is picking up some momentum lately.


----------



## TheLostSwede (Nov 17, 2020)

Sandbo said:


> Maybe one day they scale up M1 to something that can compete with a 5950X?
> Not sure if it is technically possible with their design, but ARM definitely is picking up some momentum lately.


Oh, I have no doubt they'll improve the performance, but wattage doesn't equal performance.
Depending on what Apple's plan is, they're obviously going to have to scale the performance upwards.
How they'll do this, I guess we're going to have to wait and see.


----------



## Aquinus (Nov 17, 2020)

I think people are forgetting that this 10w chip is competing with chips that have TDPs as high as 35-45 watts. Come on, let's gain a little bit of perspective here. It's not the best, but it's pretty damn good for what it is. If this is what Apple can do with a 10w power budget, imagine what they can do with 45 watts.


----------



## Smartcom5 (Nov 17, 2020)

btarunr said:


> Maxon ported *the* */* *its* latest Cinebench R23 benchmark to the macOS "Big Sur" Apple M1 platform, and the performance results are groundbreaking.


You couldn't decide right away, could you?! I'd say another classic case of: “How am I supposed to know what I think before I hear _read_ what I said _wrote_?”

Smartcom


----------



## Vya Domus (Nov 17, 2020)

Aquinus said:


> I think people are forgetting that this 10w chip



That's not true, Anandtech estimates it's really more like 20-24W.

Let's get back down to earth, people.



Aquinus said:


> imagine what they can do with 45 watts.



Burn a hole through your laptop.  

There is another reason, besides battery life and efficiency, that they chose not to give these more power: those dense, oversized cores with huge caches probably become impossible to cool realistically and keep in check under a high power budget.


----------



## Selaya (Nov 17, 2020)

Nordic said:


> I really want to see what they can pull off in a 30 W, 60 W, and 90 W package. This is some cool stuff.


Honestly, I'd expect the same as would happen if you would feed a 5950X 600W - it'll go boom.


----------



## Searing (Nov 17, 2020)

Vya Domus said:


> That's not true, Anandtech estimates it's really more like 20-24W.
> 
> Let's get down back to earth people.
> 
> ...



I see a lot of silly people quoting Anandtech to support their false claims today; you are insulting Anandtech by lying about what they said. Anandtech's review specifically says "wall power to the PSU". That's the entire system, before efficiency losses in the power supply, and it includes the GPU and, most importantly, the RAM. The old Mac Mini easily used 85 W. So this one uses 1/3 of the old Mini's power and is way faster, with a GPU up to GTX 1650 level, and you are trying to say the opposite. How much do the comparable CPU cores use vs. Ryzen and Intel? Much less.


----------



## Vya Domus (Nov 17, 2020)

Searing said:


> I see a lot silly people quoting Anandtech to support their false claims today, you are insulting Anandtech by lying about what they said. Anandtech's review specifically says "wall power to the PSU". That's the entire system and before efficiency loss from the power supply. Including the GPU and ram most importantly. The old Mac Mini easily used 85W. So it uses 1/3 of the old Mini and is way faster, including having an up to GTX 1650 GPU, and you are trying to say the opposite. How much do the comparable CPU cores use vs Ryzen and Intel? Much less.











The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To The Test (www.anandtech.com)
				




Read the damn thing before talking.



> These figures are generally what you’d like to compare to “TDPs” of other platforms, *although again to get an apples-to-apples comparison you’d need to further subtract some of the overhead as measured on the Mac mini here – my best guess would be a 20 to 24W range.*
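The subtraction AnandTech describes can be sketched numerically; the wattages below are illustrative placeholders in the ballpark of the article, not their exact measurements.

```python
# Package-power estimate as described in the quote: wall power under load
# minus platform overhead (RAM, SSD, VRMs, PSU loss). Illustrative numbers.

wall_load_w = 31.0   # Mac mini wall draw under all-core load (assumed)
overhead_w = 7.0     # idle/platform overhead to subtract (assumed)

package_w = wall_load_w - overhead_w
print(f"Estimated SoC package power: ~{package_w:.0f} W")
```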


----------



## Aquinus (Nov 17, 2020)

Vya Domus said:


> That's not true, Anandtech estimates it's really more like 20-24W.
> 
> Let's get down back to earth people.


You mean like how the "45 watt TDP" with the i9 9880H in my laptop actually draws more like 65-68 watts under full load?  

20-24 watts is still a lot less than 65-68 watts. That's like 1/3 of the power for 90% of the performance.
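Taking the quoted "1/3 of the power" and "90% of the performance" at face value, the implied efficiency gap works out like this:

```python
# Relative efficiency implied by "90% of the performance at 1/3 the power".
rel_perf = 0.90    # performance relative to the 45 W-class chip (as claimed)
rel_power = 1 / 3  # power relative to the 45 W-class chip (as claimed)

print(f"Implied perf-per-watt advantage: ~{rel_perf / rel_power:.1f}x")  # ~2.7x
```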


----------



## Vya Domus (Nov 17, 2020)

Aquinus said:


> You mean like how the "45 watt TDP" with the i9 9880H in my laptop actually draws more like 65-68 watts under full load?
> 
> 20-24 watts is still a lot less than 65-68 watts.



What I meant had to do with power density: 25 W on TSMC's 5 nm is very different from 25 W on Intel's 14 nm.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Aquinus said:


> You mean like how the "45 watt TDP" with the i9 9880H in my laptop actually draws more like 65-68 watts under full load?
> 
> 20-24 watts is still a lot less than 65-68 watts. That's like 1/3 of the power for 90% of the performance.


It's not the 10 watts constantly touted, either, is it?


----------



## Aquinus (Nov 17, 2020)

theoneandonlymrk said:


> It is not the 10 Watts constantly touted either, is it.


Sure, if you pay attention to just that single piece of information in a vacuum.


Vya Domus said:


> What I meant had to do with power density,  25W on TSMC's 5nm is very different to 25W on Intel's 14nm.


Doesn't power density become a bigger problem with smaller nodes? I'm not exactly sure where you're going with this.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Aquinus said:


> Sure, if you pay attention to just that single piece of information in a vacuum.
> 
> Doesn't power density become a bigger problem with smaller nodes? I'm not exactly sure where you're going with this.


I'm not the deluded "Apple have worked a miracle" guy here though, am I?


----------



## Aquinus (Nov 17, 2020)

theoneandonlymrk said:


> I'm not the deluded "Apple have worked a miracle" guy here though, am I?


What I see is Apple releasing a chip that does almost as well as what I have but with a 1/3 of the power. That's an accomplishment whether you want to accept that or not.


----------



## TheoneandonlyMrK (Nov 17, 2020)

Aquinus said:


> What I see is Apple releasing a chip that does almost as well as what I have but with a 1/3 of the power. That's an accomplishment whether you want to accept that or not.


I am happy to say this is a good chip and very capable; it'll be put into something I won't ever use, though.
But I am admittedly an Apple hater, so there's that.
This CPU is good, but let's be honest and realistic about its capabilities, that's all.
Eh, I'm honest about the hate at least.

And let's see how well it does in your actual use before lauding it too much, that's all.


----------



## R0H1T (Nov 17, 2020)

Aquinus said:


> What I see is Apple releasing a chip that does almost as well as what I have but with a 1/3 of the power. That's an accomplishment whether you want to accept that or not.


It is an achievement, but again, as AT stated, an *apples-to-apples comparison* is nearly impossible here; what you're basically comparing is a Mac (ecosystem) with a competing performant one. In that light, I'd argue Zen 3 mobile APUs would get damn close to that efficiency even with a node disadvantage. Chips at higher TDPs & clocks are actually less efficient; Zen has always had much better efficiency under 3 GHz, and the same goes for Intel.

Regardless of the opinions about Apple, though, it is something to celebrate in & of itself. If this doesn't wake Intel up & kick them where it really hurts, you can bet Intel is going the way of IBM.


----------



## Valantar (Nov 17, 2020)

Punkenjoy said:


> The performance of the M1 in 5 nanometer is where i would expect it to be. I don't find it game breaking and they already implemented most of the tricks that x86 use. The architecture is getting mature and they aren't really way more performance than x86.
> 
> modern processors are just so much more complex than just the instruction set that in the end it do not really matter. at least for pure performance. for low power, x86 still seem to have a bit higher overhead...
> 
> ...


I have to say you must be the most optimistic person I've seen towards ARM, if this is in line with your expectations. Up until now even large, high-powered server ARM chips have only competed with x86 on absolute performance in scenarios where they have had a core count advantage and the workload is heavily multithreaded, and that includes 7nm. So if your expectation from 5nm was for an ARM SoC to suddenly take the lead in single-threaded performance, that's quite the jump you were expecting!

This obviously isn't the most revolutionary thing ever, but it's a much bigger achievement than you're giving them credit for.


Vya Domus said:


> Clock speed and voltage, it's that simple. L1 caches basically have to run close to the clock speed of the core so their power scales right along with it (badly), that being said at 3 Ghz it's kept in check, for now. It's still inefficient though considering the performance gains from having up to 6 times more memory, you can bet the hit rate didn't go up by 600%. That's why L3 caches grew so much larger over the years because they don't have to scale along side with the cores themselves and why large L1 caches are avoided like the plague even outside x86.


There's definitely an open question of whether such an architecture can scale to higher clocks and power levels at all - I'm rather skeptical of that, at least for this design, though I'd be surprised if whatever they whip up for the Mac Pro doesn't hit ~4GHz at least in low-threaded boosts - there's definitely power and cooling to spare for that in those cases. As for L3 caches, AT reports the LLC on the M1 as 16MB, so that's half the size of Zen 3 on the desktop, though also 4x the size of a Renoir CCX. The more interesting thing is how Apple shares their L2 cache between cores, making comparisons difficult of course. (Not to mention the LLC being shared across all parts of the SoC further making comparisons to current X86 SoCs and CPUs difficult.) You're likely entirely right that the increase in cache hits from the 6x increase in L1 cache is nowhere near 1:1, but it's obviously still worth it in enough workloads for Apple to be willing to go that route, and also clearly efficient enough to not hurt them.





Vya Domus said:


> Of course because again, Intel and AMD design their cores with the goal of being able to fit as many as possible on a single package, hence smaller but higher clocked cores which are also less power efficient.
> 
> But there is another company which proves my point more than anyone else, ARM themselves, they're also close to extracting similar performance out of cores that are even smaller and consume way less power and area than Apple's. I'll maintain my opinion that Apple's approach is the wrong one long term.


ARM is nowhere close to the performance of this, or even the mobile A14. Even the X1 cores will be way, way behind. Sure, AT's comparison numbers are just from A77 cores, but look at those performance differences! Sure, peak power is higher, but we've seen plenty of examples of how poorly A-series ARM cores scale upwards in power in various poorly optimized phones ("gaming" phones with high-clocked SoCs etc.). I'm optimistic that the X1 will be a first step towards getting non-Apple ARM cores that are at least in the same ballpark as Apple's cores, but current options are nowhere close to what Apple delivers. Also, the X1 is supposedly a much bigger core than A-series cores.


TheLostSwede said:


> Let's also not forget that the memory is shared between all the parts inside the SoC, which might affect performance negatively in some scenarios as well.
> 
> Why would Apple ever go as high as 90W? We might see some 15-25W parts next, but I doubt we'll ever see anything in the 90W range from Apple.


Shared memory is definitely going to be a severe bottleneck, as is the measly iGPU bandwidth. That might explain a lot of the delta between synthetic/compute workloads (and very light gaming like 3DMark Ice Storm) vs. real world gaming such as SotTR in AT's numbers. It'll definitely be interesting to see how their chips for MBP 16" and iMacs look in this regard - will they go with some sort of dual memory interface? Will they go stupid wide LPDDR4X? DDR4 would frankly shock me at this point.

As for seeing 15-25W parts next ... isn't that what this is? ~<10W in the MBA, probably ~15-20W in the MBP, ~20-24W in the MM. And, seemingly at 3.0/3.1/3.2GHz, which doesn't bode well for the frequency scaling for this design, though of course we don't have low enough level access to actually know for sure. I would expect the next part to be for the MBP 16", in the 30-50W range, and likely with a _much_ bigger GPU. 8 big cores, 4 small ones, 16-64GB of RAM and ~32 GPU cores?


Aquinus said:


> I think people are forgetting that this 10w chip is competing with chips that have TDPs as high as 35-45 watts. Come on, let's gain a little bit of perspective here. It's not the best, but it's pretty damn good for what it is. If this is what Apple can do with a 10w power budget, imagine what they can do with 45 watts.


~20-25W, not 10W.


----------



## Punkenjoy (Nov 17, 2020)

Valantar said:


> I have to say you must be the most optimistic person I've seen towards ARM, if this is in line with your expectations. Up until now even large, high-powered server ARM chips have only competed with x86 on absolute performance in scenarios where they have had a core count advantage and the workload is heavily multithreaded, and that includes 7nm. So if your expectation from 5nm was for an ARM SoC to suddenly take the lead in single-threaded performance, that's quite the jump you were expecting!
> 
> This obviously isn't the most revolutionary thing ever, but it's a much bigger achievement than you're giving them credit for.



Well, you overthink this. CPU architectures are all about compromises and design choices. High-core-count ARM CPUs were made for the datacenter, to offer as many cores as possible to hyperscalers, not necessarily to compete on single-thread performance. Design choices were made to get more cores, even if that implies less single-core performance.

Apple, with the M1 and their other CPUs, have a different focus: they are looking for single-core performance (and not so much multithreaded performance), as they think that's what serves them best. It does not mean that a 16-core M1 would 1. be doable commercially and 2. beat the 5950X.

It also does not mean that AMD or Intel can't do better single-core performance; when you have a limited amount of power and a limited number of transistors, it's all a matter of choice.

AMD and Intel use mostly the same architecture from laptops up to the datacenter, with some focus on the datacenter. Right now, Apple designed its M1 for consumer devices, and they made their design choices accordingly.

There is also the process node difference; I think an AMD or Intel CPU on 5 nm would fare much better than right now. The CPU instruction set is now more a religion, or something to cheer for, than a real factor when it comes to end performance. Both ARM and x86 have a front end to decode instructions, both have back-end execution units, both use SIMD.

In the end it's a matter of how well you use your transistors and what your end goal is. For Apple, it's integration plus single-thread performance. For AMD, it's flexibility (chiplets) and maximum performance.

For Intel, well, I am not sure even Intel knows right now, but that is another subject...


----------



## Smartcom5 (Nov 17, 2020)

dyonoctis said:


> It's a design philosophy that works well for those machine, but AMD already said that they are not interested in an Hybrid design.


They recently patented technological approaches and algorithms to shift compute threads in a heterogeneous environment (aka _hybrid architectures_) according to their needed instruction-set extensions – arranging the threads across the given big.LITTLE cores, based on their instruction set, fully autonomously in the silicon itself.

Thus, they patented a way to make scheduler fixes needless in any heterogeneous environment – they wouldn't have done that if they didn't plan to also make big.LITTLE designs, like, for real.
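A toy sketch of that mechanism as described (purely illustrative, not AMD's actual patented implementation; the extension names and cluster split are assumptions):

```python
# Hypothetical: route threads to core clusters by the ISA extensions they use,
# so a thread needing "big-core-only" extensions never lands on a little core.

BIG_ONLY = {"AVX2", "AVX-512"}  # assumed: extensions only the big cores implement

def place_thread(used_extensions):
    """Return which cluster a thread should run on, given the extensions it uses."""
    return "big" if used_extensions & BIG_ONLY else "little"

print(place_thread({"SSE2", "AVX2"}))  # big
print(place_thread({"SSE2"}))          # little
```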

Those official statements are just there to appease the competition and cosy up to Intel, nVidia, ARM et al. all along.
Just like how they lulled Intel into that (false) sense of security when they proclaimed their own official capitulation in '11, saying they would strike sail and henceforth, awed before Big Blue™, content themselves with the fallen breadcrumbs – only to strike back even harder out of nowhere like they did with Ryzen.

Seems I'm the only one holding the firm belief that AMD – under the condition that ARM/RISC-V reaches any greater significance and/or broader adoption (read: market saturation) – could rather spontaneously come up with ARM-based (or RISC-V) designs of their own _pretty quick_, likely even following such a hybrid nomenclature.

After all, they never stopped their work on the K12 in the first place; it was just postponed indefinitely and put on hold in favour of what we now call _Zen_ – and curiously enough, just a few months ago, AMD's K12 popped up again out of nowhere … Now remember _who_ came to visit AMD to work on the AMD _ARMv8-A_-based _K12_ design … K, jk!
That prospect, it's thrilling already, isn't it? ツ

Smartcom


----------



## InVasMani (Nov 17, 2020)

Chrispy_ said:


> If the M1 does anything, it might finally get Microsoft to hurry the hell up and ditch all the legacy crap that is bogging down x86 Windows.
> 
> With x86 and Windows playing a chicken-and-egg game, it's never going to become a more streamlined architecture until someone takes the first step. AMD and Intel can't afford to cull features in hardware until they are dropped from the OS, and Microsoft is still unwilling to completely let go of 32-bit OS even in this day and age....


AMD and Intel could just do a 32-bit/64-bit big.LITTLE approach: make one set of cores 32-bit only and the other 64-bit only.


----------



## Chrispy_ (Nov 17, 2020)

R0H1T said:


> If this doesn't wake Intel up & kick them where it really hurts you can bet Intel is going the way of IBM.


If intel fails to remain relevant in the consumer PC industry you just know they'll become patent trolls instead.


----------



## Searing (Nov 17, 2020)

Vya Domus said:


> The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To The Test
> 
> 
> 
> ...



Yes, you should read the darn thing before talking. You are trying to mislead people.

Intel's 18 W chips don't draw 18 W running AVX Prime95. You can't compare "TDP vs TDP"; the 20 W estimate is total power draw under max compute. We aren't dumb enough to consider the 10900K a ~100 W part either.

Anandtech are the experts here, and they wrote an entire article about Intel's TDP shenanigans, yet you are using them to attack Apple's power consumption. Oh, the irony.

Take a look at this graph in comparison: the compute power consumption for Tiger Lake, just the package power. Peak power consumption is double the M1's, and the M1 is faster. Sure, nobody knows the exact package power for the M1 (that is a problem with Apple's locked-down approach), but let's not pretend the M1 uses more power than it does. You don't want to see how much power my Tiger Lake laptop uses in comparison; it drops below 2.5 GHz at 18 W and gets crushed by the M1. If you want to be scientific about it, we'll have to wait until we can run the same workloads and record the joules used.
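The energy-to-completion comparison alluded to at the end is just average watts times seconds; a quick sketch with made-up numbers:

```python
# Energy used for a fixed workload: joules = average watts x run time.
def joules(avg_watts, seconds):
    return avg_watts * seconds

# Hypothetical: a 10 W chip taking 600 s still uses less energy than a
# 28 W chip finishing the same work in 300 s, despite running longer.
print(joules(10, 600))  # 6000
print(joules(28, 300))  # 8400
```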


----------



## Bansaku (Nov 17, 2020)

So the M1 scores 7508 points in the multi-core test. Wow, I get close to 14,000 in R23 under macOS Big Sur with my 3700X Hackintosh. Let's hope Apple has something else up their sleeve, as otherwise the iMacs and Mac Pros will be severely gimped in performance compared to equivalent x64 desktops.


----------



## phanbuey (Nov 17, 2020)

Bansaku said:


> So the M1 scores 7508 points in the multi-core test. Wow, I get close to14,000 in R23 in Mac OS Big Sur with my 3700X Hackintosh. Let's hope Apple has something else up their sleeve as the iMacs and Mac Pros will be severely gimped in performance compared to equivalent  X64 desktops.



You mean like putting 8 of these in a mac pro? Pretty sure that will be the plan.  Single thread performance is already good, all they need is more cores:





*(attachment 176052: mini tower)*


----------



## Bansaku (Nov 17, 2020)

phanbuey said:


> You mean like putting 8 of these in a mac pro? Pretty sure that will be the plan.  Single thread performance is already good, all they need is more cores:
> 
> View attachment 176052
> 
> mini tower.



Touché! That is something Apple would do.


----------



## Fourstaff (Nov 17, 2020)

I see a lot of "what if"* comments. The fact that the M1 is not cleanly swept under the rug with no counter-arguments means that Apple is getting within striking range of the x86-64 champions.

*what if = 7 nm vs 5 nm? Equal power draw? This or that benchmark? Mobile/Server/HEDT vs Intel Gen 8/9/10/11 or Zen 1/2/3/4? etc.


----------



## Bansaku (Nov 17, 2020)

Fourstaff said:


> I see a lot of "what if"* comments. The fact that M1 is not cleanly swept under the rug with no arguments means that Apple is getting within striking range of x86-64 champions.
> 
> *what if = 7nm vs 5nm? Equal power draw? This and that benchmark? Mobile/Server/HEDT/? vs Intel Gen8/9/10/11 or Zen1/2/3/4? etc.



It also has a lot to do with macOS itself. I just upgraded my Hackintosh to Big Sur, and even on x64 everything, including Metal performance, jumped 50%+ depending on the app/workload. It's too bad Cinebench auto-updated itself to R23, so my old baseline of R20 5095 multi-core in Catalina is meaningless.


----------



## Fourstaff (Nov 17, 2020)

Bansaku said:


> It also has a lot  to do with mac OS itself. I just upgraded my Hackintosh to Big Sur and even on X64 everything, including Metal performance, jumped %50+ depending on the app/workload. It's too bad that Cinebench auto-updated itself to R23 so my old baseline of R20 5095 Multi-core in Catalina is meaningless.



They have a big advantage in OS and hardware integration, sure. However, we will rarely if ever use one without the other, so M1 + macOS needs to be taken as one package instead of measuring each discretely.


----------



## TheoneandonlyMrK (Nov 17, 2020)

phanbuey said:


> You mean like putting 8 of these in a mac pro? Pretty sure that will be the plan.  Single thread performance is already good, all they need is more cores:
> 
> View attachment 176052
> 
> mini tower.


Well, at least they would start having a legit reason for bonkers builds and absurd pricing. I doubt these chips are cheap; the rest of the BOM is popcorn pricing.


----------



## Searing (Nov 18, 2020)

The MacBook Air base model seems to settle around 10.015 W at 2.65 GHz while doing Cinebench R23 once, then drops to around 8 W and 2.46 GHz after a second round. Where is that guy saying 20+ watts? That should have been obviously ridiculous just from looking at the increases in battery life vs. the Intel models.


----------



## Nordic (Nov 18, 2020)

TheLostSwede said:


> Why would Apple ever go as high as 90W? We might see some 15-25W parts next, but I doubt we'll ever see anything in the 90W range from Apple.


I don't have an answer for that. The hardware geek in me just wants to see it happen. There are many things I want but will not have.



Selaya said:


> Honestly, I'd expect the same as would happen if you would feed a 5950X 600W - it'll go boom.


I don't mean this chip specifically. I would like to see what Apple could achieve if they tried.


----------



## techisfun (Nov 18, 2020)

I bought an M1 Mac mini today and returned it a few hours later. The hardware is revolutionary. Unfortunately, macOS is still a pile of crap. The M1 isn't going to help Apple grow their Mac user base. The OS needs a complete redesign. Big Sur is a coat of paint.

Here are some observations:

It only draws about 28 W from the wall when the CPU is stressed (which is what my PC draws while idle). It draws only 6 W when streaming video.

The M1 doesn't render the desktop smoothly at 144 Hz. I had to set my refresh rate to 120 Hz, and there was still some stuttering when dragging windows.

The x86 versions of Firefox and Chrome failed to stream video from Twitch, but they were otherwise fast enough.

Safari is very fast. Jetstream 2: 202, Octane: 63K, Kraken: 450ms. But I don't like Safari.

Resuming is fast, but boot times are slow because macOS is bloated.


----------



## dicobalt (Nov 18, 2020)

Apple's 5 nm 3 GHz processor beats a 10 nm 1.69 GHz processor; that's not a victory in my book.


----------



## Valantar (Nov 18, 2020)

Punkenjoy said:


> Well, you overthink this. CPU architectures are all about compromises and design choices. High-core-count ARM CPUs were made for the datacenter, to offer as many cores as possible to hyperscalers, not necessarily to compete on single-thread performance. Design choices were made to get more cores, even if that implies less single-core performance.
> 
> Apple, with the M1 and their other CPUs, have a different focus: they are looking for single-core performance (and not so much multithreaded performance), as they think that's what serves them best. It does not mean that a 16-core M1 would 1. be doable commercially and 2. beat the 5950X.
> 
> ...


Sorry, but you're quite mistaken here. Let's take Zen 3 as an example: increasing single-thread performance was the explicit main goal of that design (a whole new architecture, with every part of the core changed from Zen 2), and it did increase ST performance in a very impressive way compared to its predecessors. Yet it still just barely beats the M1. AMD has a 105 W TDP and ~144 W total package power draw to work with; if there were more ST scaling to be found, they have all the headroom they need to exploit it. Yet their cores individually max out at ~20 W. Why? Because the architecture doesn't scale past that; it either grows unstable or overheats. Of course Apple saves a lot of baseline power by basing this on a mobile architecture and not having power-hungry parts like multiple IF links, PCIe controllers and external memory, which gains them a lot of baseline efficiency. But it's undeniable that the cores in the M1 are _massively_ efficient _and_ performant at the same time.

It's obvious that AMD _could_ have made a wider core with tons of transistors, a higher-IPC, lower-clocking design like this. But could they have done so at the same level of efficiency? AnandTech suggests not. Of course, Apple has a major advantage here in being vertically integrated, and as such not caring that much about SoC costs as long as they can preserve their margins. Neither AMD nor Intel can operate that way, pushing them towards smaller and more affordable core designs. But quite frankly, that isn't much of an argument against the M1 being a major achievement; it just shows that Apple's tactics are working. Too bad for us non-Apple users, really.

As for a 16-core M1 being doable? It would definitely be a gargantuan piece of silicon, likely comparable to the Xbox Series X SoC in area, though of course on 5nm and not 7. I don't see Apple having a problem with that, given that it would be - at the low end - for >$2000 laptops and desktops (with very cut down chips at that price, allowing for salvaging a lot of faulty chips), scaling to well above $5000 for top configurations. The margins are more than there to pay for a big chip. As for performance scaling: they'll of course need to change their memory architecture and design an interconnect that works for that many cores. But that isn't _that _hard. In terms of pure performance, if the M1 nearly matches the 5950X at <1/4 the per-core power, there's little reason why a bigger chip wouldn't keep that performance at a minimum. Heat density will definitely be an issue, but one that can be solved by spreading core clusters out across the SoC or adding a vapor chamber.

As for AMD or Intel on 5nm being more competitive: well, obviously to some degree, but I wouldn't expect current TSMC 5nm to clock even close to as high as current TSMC 7nm, so that move might actually lose them performance unless it's also a wider architecture. Would it allow them to catch up in perf/W? Not even close. A single node change doesn't get you 75% power savings.
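A quick sanity check on that last point: foundries typically advertise somewhere around a 30% power reduction at iso-performance per full node step (the exact figure varies by node and is assumed here, not sourced). Even compounding two such steps falls well short of 75%:

```python
# Sanity check: per-node power savings compound multiplicatively.
# The ~30% iso-performance saving per full node is a rough assumption
# based on typical foundry marketing claims, not a measured figure.
per_node_saving = 0.30

one_node = 1 - (1 - per_node_saving)        # 30% saved after one shrink
two_nodes = 1 - (1 - per_node_saving) ** 2  # 51% saved after two shrinks

print(f"one shrink: {one_node:.0%}, two shrinks: {two_nodes:.0%}")
```

So even a double node jump leaves x86 short of a 75% power reduction at the same performance, which is the rough gap being discussed here.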


Smartcom5 said:


> They recently patented technological approaches and algorithms to shift compute threads in a heterogeneous environment (aka _hybrid architectures_) according to their needed instruction-set extensions – by arranging the threads across the given big.LITTLE cores, based on their instruction set, fully autonomously in the silicon itself.
> 
> Thus, they patented a way to make scheduler fixes needless in any heterogeneous environment – they wouldn't have done that if they didn't plan to also make big.LITTLE designs, like for real.
> 
> ...


Wasn't AMD's quote about not wanting hybrid architectures on the desktop, referring to Alder Lake? Hybrid for mobile makes perfect sense, and I don't doubt AMD could scale down Zen to a low power design for that use quite easily. That being said, that patent doesn't describe a method for entirely obviating the need for an architecture-aware scheduler; allocating threads based on instruction set only works for workloads where just one set of cores supports that instruction set, such as power hungry AVX loads. You'll still need the scheduler to know to move high performance threads to high performance cores even if they use instruction sets common to the two clusters.
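To illustrate the gap such a patent leaves open, here's a toy placement function (entirely hypothetical, not based on the actual patent text): ISA-based routing decides cleanly only when a thread needs an extension one cluster lacks; for everything else you still need a performance-aware policy.

```python
# Toy big.LITTLE thread placement, illustrating why ISA-based routing
# alone can't replace a performance-aware scheduler. All names and
# ISA sets here are hypothetical.

BIG_ISA = {"base", "avx512"}
LITTLE_ISA = {"base"}

def place_thread(required_isa, demand_high):
    """Return 'big' or 'little' for a thread."""
    if not required_isa <= LITTLE_ISA:
        return "big"  # only the big cluster can execute it at all
    # The ISA gives no answer here: both clusters can run the thread,
    # so the scheduler still has to weigh performance demand.
    return "big" if demand_high else "little"

print(place_thread({"base", "avx512"}, demand_high=False))  # 'big' (forced)
print(place_thread({"base"}, demand_high=True))             # 'big' (policy)
print(place_thread({"base"}, demand_high=False))            # 'little'
```

Only the first case is decided by the instruction set; the other two are exactly the scheduling decisions the patent doesn't eliminate.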


InVasMani said:


> AMD and Intel could just do a 32-bit/64-bit bigLITTLE approach make one chip 32-bit only and the other 64-bit only.


Oh dear lord no. The majority of applications today are 64-bit. That would mean the "little" chip _couldn't run them at all_. Windows 10 is AFAIK 64-bit _only_, so it couldn't even run the OS! No. Just no.


Bansaku said:


> So the M1 scores 7508 points in the multi-core test. Wow, I get close to 14,000 in R23 in Mac OS Big Sur with my 3700X Hackintosh. Let's hope Apple has something else up their sleeve, as the iMacs and Mac Pros will be severely gimped in performance compared to equivalent x64 desktops.


You really shouldn't be surprised that a ~20-24W hybrid 4c (big) + 4c (little) CPU lags significantly behind an 8c16t all-big-core 65W (88W under all-core loads) CPU. What makes this impressive is that they're managing half your score with less than a third of the power, half the high performance cores, and no SMT.
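Back-of-the-envelope, using the figures quoted in this thread (7508 pts at a ~22 W SoC estimate for the M1 Mac Mini vs ~14,000 pts at ~88 W package power for the 3700X; both are rough forum/review estimates, not measurements taken under identical conditions):

```python
# Rough perf/W comparison using numbers quoted in this thread.
# Both wattage figures are approximate estimates, not lab measurements.
m1_score, m1_watts = 7508, 22            # M1 Mac Mini, SoC estimate
r3700x_score, r3700x_watts = 14000, 88   # 3700X, all-core package power

m1_eff = m1_score / m1_watts             # ~341 pts/W
x86_eff = r3700x_score / r3700x_watts    # ~159 pts/W

print(f"M1: {m1_eff:.0f} pts/W, 3700X: {x86_eff:.0f} pts/W, "
      f"advantage: {m1_eff / x86_eff:.1f}x")
```

Even with generous error bars on both power figures, the M1 comes out roughly twice as efficient per point in this test.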


Searing said:


> Macbook Air base model seems to settle around 10.015 W at 2.650 GHz while doing Cinebench R23 once, then drops to around 8 W and 2.460 GHz after a second round. Where is that guy saying 20+ watts, which should have been obviously ridiculous just by looking at the increases in battery life vs the Intel ones.


20-24W is for the Mac Mini, not the Macbook Air.


----------



## Vya Domus (Nov 18, 2020)

Searing said:


> Where is that guy saying 20+ watts, which should have been obviously ridiculous just by looking at the increases in battery life vs the Intel ones.



This confirms you are an arrogant fanboy, who despite knowing nothing at all, keeps taking jabs at me even though you have been proven wrong and this is beginning to look really pathetic on your end.

*Anandtech reviewed the Mac Mini, not the Macbook Air, and estimated 20 W power draw from the SoC at full load.*

For your sake I hope you are just a massive fanboy and are not actually this dumb. Regardless I am not sticking around to find out, off to the ignore list you go.


----------



## dyonoctis (Nov 18, 2020)

Valantar said:


> Sorry, but you're quite mistaken here. Let's take Zen 3 as an example: increasing single thread performance was the explicit main goal of that design - which is a whole new architecture with every part of the core changed from Zen 2 - and it also did increase ST performance in a very impressive way compared to its predecessors. Yet still just barely beats the M1. AMD has a 105W TDP and ~144W total package power draw to work with. If there was more ST scaling to be found, they have all the headroom they need to exploit it. Yet their cores individually max out at ~20W. Why? Because the architecture doesn't scale past that, it either grows unstable or overheats. Of course Apple saves a lot on their baseline power from basing this on a mobile architecture and not having something power hungry like multiple IF links, PCIe controllers and external memory, which helps them gain a lot of baseline efficiency. But it's undeniable that the cores in the M1 are _massively _efficient _and_ performant at the same time.
> 
> It's obvious that AMD _could_ have made a wider core with tons of transistors and made a higher IPC, lower clocking design like this. But could they have done so at the same level of efficiency? AnandTech suggests no. Of course Apple has a major advantage here in being vertically integrated and as such not caring that much about SoC costs as long as they can preserve their margins. Neither AMD nor Intel can operate that way, pushing them towards smaller and more affordable core designs. But quite frankly, that isn't much of an argument against the M1 being a major achievement, it just shows that Apple's tactics are working. Too bad for us non-Apple users, really.
> 
> ...


IIRC, that's also why Qualcomm isn't expected to ever come close to the A-series chips: they can't afford to make a chip that expensive.

The 15W Zen 2 parts are not too shabby for x86; the single-core perf didn't suffer that much. We have yet to see if low-power Zen 3 can do the same.


----------



## Valantar (Nov 18, 2020)

dyonoctis said:


> IIRC, that's also why Qualcomm isn't expected to ever come close to the A-series chips: they can't afford to make a chip that expensive.
> 
> The 15W Zen 2 parts are not too shabby for x86; the single-core perf didn't suffer that much. We have yet to see if low-power Zen 3 can do the same.
> View attachment 176119View attachment 176120


You're right, the difference in ST perf is very small - as I said, even a 5950X barely exceeds 20W in 1-core power (which drops to 6-7W at ~3.8GHz at a full 16c load), so we should expect similar ST perf (at least in 25W configurations) for mobile Zen 3 too. Still, the relatively low ST power is a clear indication that there isn't anything "left in the tank" in terms of ST performance for Zen 3 - to increase that they need either much higher clocks (not feasible without exotic cooling and tons of power) or a more fundamental reworking of the architecture for IPC increases. They have a 144W power limit to work with after all, so if pushing more power into a single core made a difference, they would obviously be doing it. If a full, top-to-bottom redesign, changing every single part of the core from Zen 2, netted them a ~19% increase, they're going to need drastic changes to come close to the IPC Apple is demonstrating here. Of course it's entirely possible to compensate for IPC by increasing clocks - which is what AMD and Intel are both doing when compared to the M1 - and each approach has clear limitations. It's highly unlikely Apple can scale this design past 4GHz, and even that might require _a lot_ of power. But nonetheless, the competitive path forward for AMD and Intel both became much more difficult with the launch of this.

You're probably right about QC not daring to make anything that expensive though. IIRC the cost of a high end smartphone chip is typically in the $50 range (or at least it was a few years back). Even doubling that would likely be _very_ close to the pure silicon (no processing, packaging, binning, etc.) cost of the M1. Which would leave QC taking a net loss on every part they sell. And given how slim margins are in the mobile industry (for everyone but Apple, that is), going higher in chip prices for a non-integrated company is likely not feasible. Though mixed designs with 1-2 X1 cores might serve as a good middle ground.


----------



## Searing (Nov 18, 2020)

Vya Domus said:


> This confirms you are an arrogant fanboy, who despite knowing nothing at all, keeps taking jabs at me even though you have been proven wrong and this is beginning to look really pathetic on your end.
> 
> *Anandtech reviewed the Mac Mini not the Macbook Air and he estimated 20W power draw from the SOC at full load.*
> 
> For your sake I hope you are just a massive fanboy and are not actually this dumb. Regardless I am not sticking around to find out, off to the ignore list you go.



And he didn't show the terminal power draw. I did. I'll show you the mini results later. Keep up the lies and FUD. It sure makes sense that Apple has the longest battery life while you pretend the M1 uses more power than Intel. /s


----------



## Valantar (Nov 18, 2020)

Searing said:


> And he didn't show the terminal power draw. I did. I'll show you the mini results later. Keep up the lies and FUD. It sure makes sense that Apple has the longest battery life while you pretend the M1 uses more power than Intel. /s


The MBA you showed data from is a passively cooled laptop. The MM AT tested is an actively cooled desktop with a higher clock speed spec. It stands to reason that the latter consumes more power. More than Intel, though? Not really, given that Intel has stopped specifying TDP beyond a range, which is 10-25W for their latest mobile parts. And 25W configurations are quite common in premium thin-and-lights like the XPS 13 - but those are again actively cooled. 10W Intel configurations are passively cooled too, but cut their base clocks a lot compared to 15 or 25W configurations.


----------



## seth1911 (Nov 19, 2020)

Maybe it would be faster with macOS and the Cinebench build for that OS, but it can't run any Windows or Linux distro.

It's a consumer thing: do you want a golden cage?
Yes
No

If you buy software in the App Store for a few hundred or a few thousand dollars, you can't switch back to Windows or Linux without buying the software again.
In a few years Apple can set its prices higher and higher, because you can't get out of the Apple ecosystem.


----------



## Nygma (Nov 19, 2020)

seth1911 said:


> Maybe it would be faster with macOS and the Cinebench build for that OS, but it can't run any Windows or Linux distro.
> 
> It's a consumer thing: do you want a golden cage?
> Yes
> ...



Runs Linux just fine.


----------



## dragontamer5788 (Nov 19, 2020)

Valantar said:


> Nope, not the only one noticing that, and there's no doubt that these chips are really expensive compared to the competition. Sizeable silicon, high transistor counts, and a very expensive node should make for quite a high silicon cost. There's another factor to the equation though: Intel (not really, but not _that_ far behind) and AMD are delivering the same performance with much smaller cores and on a bigger node _but with several times the power consumption_. That's a notable difference. Of course comparing a ~25W mobile chip to a 105W desktop chip is an unfair metric of efficiency, but even if Renoir is really efficient for X86, this still beats it.



This M1 core is basically twice the size of an x86 core, be it Intel Skylake or AMD Zen: twice the reorder buffer, twice the L1 cache, twice the execution ports, twice the decode width. 8 instructions per clock tick vs 4.

Apple then downclocked the design to 2.8GHz. There's no mystery here: Apple designed the *widest* core in all of computing history (erm... maybe 2nd widest; POWER9 in SMT8 mode was 12 instructions/clock wide, but the M1 is the widest chip ever on the consumer market).

The "secret" to silicon power consumption is quite simple: power consumption = O(voltage^2 * frequency), and guess what? Frequency is related to voltage, so that's really O(voltage^3). Smaller transistors can operate at lower voltage (the 5nm advantage). Lower frequency means lower voltage, and lower voltage means far less power. By making a wide (but low-frequency) core, Apple beats everyone in single-threaded performance and power-efficiency simultaneously.
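To put rough numbers on that relationship, here's a toy sketch of the classic switching-power model (the capacitance constant and the voltage/frequency pairs are made-up illustrative values, not measured M1 or x86 figures):

```python
# Toy model of dynamic CPU power: P = C * V^2 * f.
# All constants are illustrative, not measured silicon values.

def dynamic_power(c_eff, voltage, freq_ghz):
    """Classic switching-power model: P ~ C * V^2 * f."""
    return c_eff * voltage**2 * freq_ghz

C = 10.0  # arbitrary effective-capacitance constant

# "Fast and narrow" design: a high clock needs a high voltage.
narrow = dynamic_power(C, voltage=1.4, freq_ghz=4.8)

# "Wide and slow" design: a lower clock allows a much lower voltage.
wide = dynamic_power(C, voltage=0.9, freq_ghz=3.0)

print(f"narrow: {narrow:.0f}, wide: {wide:.0f}, ratio: {narrow/wide:.1f}x")
```

Because frequency itself tracks voltage, dropping the clock cuts power far faster than linearly, and a wider core can claw the lost frequency back through IPC.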

Same thing with AVX512 (though different): Intel is willing to downclock their processor to perform 16 × 32-bit operations simultaneously, because low frequency / low voltage but lots of work just scales better.

Anyway, you downclock and lower the voltage for power efficiency, then increase the width of the core to compensate for speed (and make things faster). Your processor needs to search harder for instruction-level parallelism, but that's a well-known trick at this point (aka out-of-order execution).

That's all there is to it. Everything after that is hypotheticals about the future. Can Intel / AMD make an 8-wide decoder? Or would the x86's variable-length instructions make such parallelism hard to accomplish?

-------------

Vya Domus has a point: traditionally, you'd just make a 2nd core instead of doubling the core size (or you go wide SIMD, which does scale well). What Apple is doing here is betting that customers do want that faster single-core performance instead of shuffling data to a 2nd or 3rd core.

I agree with Apple, however. I think the typical consumer would prefer a slightly faster (and slightly more efficient) single core rather than more cores. Given the choice between 8 cores (all 8-wide decode/execute) or 16 cores (4-wide decode/execute), the typical consumer probably wants the 8-core x 8-wide option. Note: 8-wide execution is NOT 2x faster than 4-wide execution.
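One way to quantify that tradeoff is Pollack's rule of thumb (single-thread performance scales roughly with the square root of core "size"); treating decode width as a crude stand-in for size, a toy comparison of the two configurations above:

```python
import math

# Pollack's rule of thumb: ST perf ~ sqrt(core "size").
# Decode width is used as a crude proxy for size; purely illustrative.

def st_perf(width):
    return math.sqrt(width)

# Same silicon budget spent two ways: 8 wide cores vs 16 narrow cores.
eight_wide = {"cores": 8, "st": st_perf(8)}       # 8 cores x 8-wide
sixteen_narrow = {"cores": 16, "st": st_perf(4)}  # 16 cores x 4-wide

for cfg in (eight_wide, sixteen_narrow):
    cfg["mt"] = cfg["cores"] * cfg["st"]  # idealized multi-thread scaling

# 8-wide wins single-thread (~1.4x), 16-core wins multi-thread (~1.4x).
print(eight_wide, sixteen_narrow)
```

Under this (very rough) model, 8 wide cores trade ~1.4x multi-thread throughput for ~1.4x single-thread speed, which matches the "8-wide is not 2x faster than 4-wide" point above.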


----------



## Aquinus (Nov 19, 2020)

Nygma said:


> Runs Linux just fine.


I was unaware that Linux had support for the M1 chip.  I'm going to cry foul on that one. I haven't heard of anyone having cracked the bootloader to run Linux yet on these new machines.


----------



## Valantar (Nov 19, 2020)

Aquinus said:


> I was unaware that Linux had support for the M1 chip.  I'm going to cry foul on that one. I haven't heard of anyone having cracked the bootloader to run Linux yet on these new machines.


I guess it theoretically does, but there's no way on earth Apple is letting anyone boot an M1-equipped machine into Linux.


----------



## Aquinus (Nov 19, 2020)

Valantar said:


> I guess it theoretically does, but there's no way on earth Apple is letting anyone boot an M1-equipped machine into Linux.


I've read that it's already being worked on. I honestly don't think it's impossible, although there is no way that it's going to be as clean of an experience as just using OS X.


----------



## Steevo (Nov 19, 2020)

Aquinus said:


> I think people are forgetting that this 10w chip is competing with chips that have TDPs as high as 35-45 watts. Come on, let's gain a little bit of perspective here. It's not the best, but it's pretty damn good for what it is. If this is what Apple can do with a 10w power budget, imagine what they can do with 45 watts.




So, 20 W of fixed-function (the essential definition of a RISC Arm architecture) hardware on a 5nm node. It's impressive, but not groundbreaking. If we go by the rule of thumb we are seeing with TSMC's node shrinks, it falls in line with a good design on a good process, with thermals possibly being the largest limiting factor.

And on your other comment about power density: yes, W/cm² is an issue for low-power designs, which is why TSMC 7nm can run 5 GHz at 1.5 V or just 2.5 GHz at 1 V. It's about the design: putting space between the hottest parts so they can flux heat into cooler areas without letting the magic smoke out.


----------



## Aquinus (Nov 19, 2020)

Steevo said:


> So, 20 W of fixed-function (the essential definition of a RISC Arm architecture) hardware on a 5nm node. It's impressive, but not groundbreaking.


...but:


Searing said:


> Macbook Air base model seems to settle around 10.015 Watts at 2.650 Ghz while doing Cinebench R23 once. Then drops to around 8 Watts and 2.460 Ghz after a second round. Where is that guy saying 20+ watts, which should have been obviously ridiculous just by looking at the increases in battery life vs the Intel ones.
> 
> View attachment 176098


----------



## seth1911 (Nov 19, 2020)

I think there is something not right with the AnandTech test, because the Apple iGPU supposedly performs nearly like a mobile GTX 1650.

Nvidia needs 50 W for 2.4 TFLOPS FP32 incl. 4 GB GDDR5, and now Apple will do similar with 7 W on its iGPU.


Everyone else is stupid:
Nvidia,
Arm with Mali,
Qualcomm with Adreno,
AMD with RDNA.

Only Apple got the holy grail, with their iGPU performing nearly *8 times* better than anyone above.


----------



## dyonoctis (Nov 20, 2020)

seth1911 said:


> I think there is something not right with the AnandTech test, because the Apple iGPU supposedly performs nearly like a mobile GTX 1650.
> 
> Nvidia needs 50 W for 2.4 TFLOPS FP32 incl. 4 GB GDDR5, and now Apple will do similar with 7 W on its iGPU.
> 
> ...


AMD is still using good ol' Vega for their iGPU, lack of competition yada yada... now they might have the incentive to be more bleeding edge.
As for Nvidia, they don't develop for low power first; their small GPUs are just heavily cut-down big GPUs. Apple is doing the reverse: they started from something made for efficiency and are making it bigger and bigger.

Their best low-power SoC is still the Tegra X1 from 2015, which is still based on Maxwell. Them owning Arm won't have any effect until several years from now, but who knows? Nvidia has been marketing their mobile and desktop GPUs as the holy grail of content creation so much that the revival of the Mac might make them try harder.


----------

