Tuesday, August 18th 2020

Apple A14X Bionic Rumored To Match Intel Core i9-9880H

The Apple A14X Bionic is an upcoming processor from Apple which is expected to feature in the next iPad Pro models and should be manufactured on TSMC's 5 nm node. Tech YouTuber Luke Miani has recently provided a performance graph for the A14X chip based on "leaked/suspected A14 info + average performance gains from previous X chips". In this graph, the Apple A14X can be seen matching the Intel Core i9-9880H in Geekbench 5 with a score of 7480. The Intel Core i9-9880H is a 45 W eight-core mobile CPU found in high-end notebooks such as the 2019 16-inch MacBook Pro, and it requires significant cooling to keep thermals under control.

If these performance estimates are correct, or even close, then Apple will have a serious productivity device, and the A14X will serve as a strong basis for Apple's transition to custom CPUs for its MacBooks in 2021. According to Luke Miani, Apple may use a custom version of the A14X with slightly higher clocks in its upcoming ARM MacBooks. These results are estimations at best, so take them with a pinch of salt until Apple officially unveils the chip.
Source: @LukeMiani

85 Comments on Apple A14X Bionic Rumored To Match Intel Core i9-9880H

#51
Darmok N Jalad
I don’t really see how it’s that far of a stretch at this point. This A14X is Apple’s latest and greatest on a 5nm node, getting to scale up in power, versus a tired, Skylake-based chip on a very old 14nm node and crammed down to its thermal minimum. Now if the A14X matched the 9900K, that would be much harder to believe. I don’t think Apple would make this move unless they had something solid lined up. I guess it won’t be too much longer before we find out, and benchmarks beyond Geekbench will be available on ARMacOS.
Posted on Reply
#52
dragontamer5788
theoneandonlymrkYou realise all modern processors are designed for Turbo, and dash-to-rest operation.
And I realize that this is a useful feature for rendering webpages on cellphones. And therefore, a benchmark that measures this behavior (especially since rendering webpages is probably 90% of what cellphones and laptops do all day) is a critical measurement of performance for the modern consumer.
Any bench shorter than the Tau value isn't worth shit regardless IMHO
Explain. Why? The situation is clearly common. I can look at "top" or Windows Task Manager and see that my utilization is damn near 0% most of the time.

We hardware nerds love to pretend that we're running our systems at high utilization with high efficiency, as if we were bitcoin miners or Fold@Home geeks all the time. But that's just not the reality of the day-to-day. Even programming at work has started to get offloaded to dedicated "build servers" and continuous integration facilities, off of the desktop / workstation at my workdesk.

Browsing HTML documentation for programming is hugely important, and a lot of that is "TURBO race to idle" kind of workloads. The chip idles, then suddenly the HTML5 DOM shows up, maybe with a bit of Javascript to run before reaching its final form. But take this webpage for instance: this forum page is ~18.8 KB of HTML5, which is small enough to fit inside the L1 cache of all of our machines.
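
As a toy illustration of that burst pattern, here is a Python sketch; the synthetic page below stands in for this forum's actual HTML, so treat it as an illustration rather than a benchmark:

```python
# Toy illustration of a "race to idle" burst: parsing a forum-page-sized
# HTML document is milliseconds of work, after which the core can sleep.
import time
from html.parser import HTMLParser

# synthetic page, ~18.8 KB like the page size quoted above
page = "<html><body>" + "<div class='post'><p>text</p></div>" * 550 + "</body></html>"

class TagCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = 0
    def handle_starttag(self, tag, attrs):
        self.tags += 1

t0 = time.perf_counter()
parser = TagCounter()
parser.feed(page)
parser.close()
print(f"~{len(page) / 1024:.1f} KB, {parser.tags} tags, "
      f"{(time.perf_counter() - t0) * 1000:.2f} ms")
```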

That's the stuff Geekbench is measuring: opening PDF documents, browsing HTML5, parsing DOM, interpreting JPEG images. It seems somewhat realistic to me, with various internal webpages constantly open on my work computer and internal wikis I'm always updating.

---------

I don't even like Apple. I don't own a single Apple product. But you're going to have to explain the flaws of Geekbench if you really want to support your discussion points.
Posted on Reply
#53
TheoneandonlyMrK
dragontamer5788And I realize that this is a useful feature for rendering webpages on cellphones. And therefore, a benchmark that measures this behavior (especially since rendering webpages is probably 90% of what cellphones and laptops do all day) is a critical measurement of performance for the modern consumer.



Explain. Why? The situation is clearly common. I can look at "top" or Windows Task Manager and see that my utilization is damn near 0% most of the time.

We hardware nerds love to pretend that we're running our systems at high utilization with high efficiency, as if we were bitcoin miners or Fold@Home geeks all the time. But that's just not the reality of the day-to-day. Even programming at work has started to get offloaded to dedicated "build servers" and continuous integration facilities, off of the desktop / workstation at my workdesk.

Browsing HTML documentation for programming is hugely important, and a lot of that is "TURBO race to idle" kind of workloads. The chip idles, then suddenly the HTML5 DOM shows up, maybe with a bit of Javascript to run before reaching its final form. But take this webpage for instance: this forum page is ~18.8 KB of HTML5, which is small enough to fit inside the L1 cache of all of our machines.

That's the stuff Geekbench is measuring: opening PDF documents, browsing HTML5, parsing DOM, interpreting JPEG images. It seems somewhat realistic to me, with various internal webpages constantly open on my work computer and internal wikis I'm always updating.

---------

I don't even like Apple. I don't own a single Apple product. But you're going to have to explain the flaws of Geekbench if you really want to support your discussion points.
You ignored my point because it doesn't fit your perspective, but my point is quite simple and is already explained adequately.

I also said it's my opinion sooo.

And my PC has said 80-100% load for years now, as I also said.

You know what gets hardly any abuse, my phone, it's surfed on and fits your skewed perspective.

Geekbench on it makes sense.
Posted on Reply
#54
Vayra86
SearingEvery forum is full of ignorant people like you just ignoring every benchmark (it is Geekbench 5 now, and there are many other benchmarks you can use), ignoring every expert Anandtech included, ignoring actual real world results. World's fastest computer is ARM based? Ignore it. Amazon offering ARM server instances? Ignore it. This is why the world passes some people by. They just can't accept that something has changed. There is an interesting question about psychology here, why does ARM being fast bother you? Why do you not accept basic reality? ARM is just an ISA, 68000 was fast, PowerPC was fast, x86 was fast, ARM was fast, it is just an ISA.

"Come on. Sustained means nothing, right, the one thing that you know Apple's chips are horrible at in terms of scalability means nothing. Got it." Any chip can run with sustained performance with a bit more cooling and power, yes it means nothing. We are comparing the CPUs, not the form factor.
You're saying it yourself and others have said it too, you just fail to realize it.

'PowerPC was fast'... it was even used in part in a PlayStation, yet today you don't see a single one in any gaming or consumer machine. In enterprise though? Yep. It's a tool that works best in that setting.

'ARM is fast'... correct. We have ThunderX chips that offer shitloads of cores and can use lots of RAM. They're not the ones we see in a phone though. We also have Apple's low-core-count, single-task optimized mobile chips. You won't see those in an enterprise environment. That's not 'ignoring it', it is separating A from B correctly.

Sustained means nothing in THE USE CASE Apple has selected for these chips. That is where all chips are going: more specialized, more specific to optimal performance in a desired setting. Even Intel's own range of CPUs, even in all those years they were 'sleeping', has pursued that goal. They are still re-using the same core design in a myriad of power envelopes and make it work top to bottom, in enterprise and in laptop, and they've been trying to get there on mobile. The latter is the ONE AREA where they cannot seem to succeed, a bit similar to Nvidia's Tegra designs, which perform well but are always somewhat too power-hungry and bulky to be as lean as ARM is under 5 W. End result: Nvidia still didn't get traction with its ARM CPUs for any mobile device.

In the meantime, Apple sees Qualcomm and others develop chips along the 'x86 route': higher core counts, more and more hardware thrown at ever less efficient software, expanded functions. That is where the direction of ARM departs from Apple's overall strategy; they want optimized hardware-and-software systems. You seem to fail to make that distinction, thinking Apple's ARM approach is 'The ARM approach'. It's not; the ISA is young enough to allow fundamental design decisions.

Like @theoneandonlymrk said eloquently: Phillips head? Phillips screwdriver.
dragontamer5788And I realize that this is a useful feature for rendering webpages on cellphones. And therefore, a benchmark that measures this behavior (especially since rendering webpages is probably 90% of what cellphones and laptops do all day) is a critical measurement of performance for the modern consumer.



Explain. Why? The situation is clearly common. I can look at "top" or Windows Task Manager and see that my utilization is damn near 0% most of the time.

We hardware nerds love to pretend that we're running our systems at high utilization with high efficiency, as if we were bitcoin miners or Fold@Home geeks all the time. But that's just not the reality of the day-to-day. Even programming at work has started to get offloaded to dedicated "build servers" and continuous integration facilities, off of the desktop / workstation at my workdesk.

Browsing HTML documentation for programming is hugely important, and a lot of that is "TURBO race to idle" kind of workloads. The chip idles, then suddenly the HTML5 DOM shows up, maybe with a bit of Javascript to run before reaching its final form. But take this webpage for instance: this forum page is ~18.8 KB of HTML5, which is small enough to fit inside the L1 cache of all of our machines.

That's the stuff Geekbench is measuring: opening PDF documents, browsing HTML5, parsing DOM, interpreting JPEG images. It seems somewhat realistic to me, with various internal webpages constantly open on my work computer and internal wikis I'm always updating.

---------

I don't even like Apple. I don't own a single Apple product. But you're going to have to explain the flaws of Geekbench if you really want to support your discussion points.
There you go and that is why I said, Apple is going to offer you terminals, not truly powerful devices. Intel laptop CPUs are not much different, very bursty and slow as shit under prolonged loads. I haven't seen a single one that doesn't throttle like mad after a few minutes. They do it decently... but sustained performance isn't really there.

I will underline this again
Apple found a way to use ARM to guarantee their intended user experience.
This is NOT a performance guarantee. It's an experience guarantee.

You need to place this in the perspective of how Apple phones didn't really have true multitasking while Android did. Apple manages its scheduler in such a way that it gets the performance when the user demands it. They map out what a user will be looking at and make sure they show him something that doesn't feel like waiting to load. A smooth animation (that takes almost a second), for example, is also a way to remove the perception of latency or lacking performance. CPU won't burst? Open the app and show an empty screen, fill data points later. It's nothing new. Websites do it too, irrespective of architecture; the newer frameworks especially are full of this crap. Very low information density and large sections of plain colors are not just a design style, it's a way to cater to mobile limitations.
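
A minimal sketch of that "show something now, fill the data in later" pattern, with Python's asyncio standing in for a UI framework's event loop (the one-second delay is invented):

```python
# Sketch of the perceived-performance trick described above: render a
# skeleton instantly, do the real work afterwards.
import asyncio

async def fetch_posts():
    await asyncio.sleep(1.0)              # pretend network / CPU work
    return ["post 1", "post 2", "post 3"]

async def render_page():
    # the skeleton appears immediately, so the app never "feels" slow
    print("[skeleton screen + smooth animation]")
    posts = await fetch_posts()           # real work happens behind the animation
    print("\n".join(posts))

asyncio.run(render_page())
```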

If you use Apple devices for a while, take a long look at this and monitor load and you can see how it works pretty quickly. There is no magic sauce.
Posted on Reply
#55
Vya Domus
Vayra86Apple found a way to use ARM to guarantee their intended user experience.
This is NOT a performance guarantee. It's an experience guarantee.
Indeed, transistor-budget-wise, a considerable chunk of their SoCs is just dedicated signal processors for different purposes. To put things into perspective, it has 8.5 billion transistors, that's as much as an average GPU... it had better be fast.

Not to mention that they don't just put in large L1 caches; everything related to on-chip memory is colossal in size. And again, everyone can do that, it's not down to being an ARM design or not.
Posted on Reply
#56
dragontamer5788
theoneandonlymrkYou ignored my point because it doesn't fit your perspective
Vayra86There you go and that is why I said, Apple is going to offer you terminals, not truly powerful devices. Intel laptop CPUs are not much different, very bursty and slow as shit under prolonged loads. I haven't seen a single one that doesn't throttle like mad after a few minutes. They do it decently... but sustained performance isn't really there.
I'm not sure if you guys know what I know.

Let's take a look at a truly difficult benchmark. One that takes over 20 seconds, so that "Turbo" isn't a major factor.

www.cs.utexas.edu/~bornholt/post/z3-iphone.html

Though a 20-second run is still short, the blogpost indicates that they continuously ran the 3-SAT solver over and over again, so the iPhone was operating at its thermal limits. The Z3 solver tackles 3-SAT, an NP-complete problem. At this point, it has been demonstrated that Apple's A12 has faster L1, L2, and memory performance than even Intel's chips in a very difficult, single-threaded task.
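
For the shape of that kind of test, here is a minimal sketch using Z3's Python bindings (the z3-solver package); the random 3-SAT instance is invented for illustration and is not the blogpost's actual workload:

```python
# Minimal sketch of a repeated SAT-solving benchmark with Z3's Python
# bindings (pip install z3-solver). The random 3-SAT instance is a toy.
import random
import time
from z3 import Bool, Not, Or, Solver

def random_3sat(num_vars=200, num_clauses=850, seed=42):
    rng = random.Random(seed)
    xs = [Bool(f"x{i}") for i in range(num_vars)]
    clauses = []
    for _ in range(num_clauses):
        picked = rng.sample(range(num_vars), 3)
        clauses.append(Or([xs[i] if rng.random() < 0.5 else Not(xs[i]) for i in picked]))
    return clauses

# loop the solve so the chip runs past its turbo window into sustained clocks
for run in range(10):
    solver = Solver()
    solver.add(*random_3sat())
    t0 = time.perf_counter()
    result = solver.check()
    print(f"run {run}: {result} in {time.perf_counter() - t0:.2f} s")
```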

----------

Apple's chip team has demonstrated that its small 5 W chip is in fact pretty good at some very difficult benchmarks. It shouldn't be assumed that iPhones are slower anymore. They're within striking distance of desktops in single-core performance in some of the densest compute problems.
Posted on Reply
#57
TheoneandonlyMrK
dragontamer5788I'm not sure if you guys know what I know.

Let's take a look at a truly difficult benchmark. One that takes over 20 seconds, so that "Turbo" isn't a major factor.

www.cs.utexas.edu/~bornholt/post/z3-iphone.html

Though a 20-second run is still short, the blogpost indicates that they continuously ran the 3-SAT solver over and over again, so the iPhone was operating at its thermal limits. The Z3 solver tackles 3-SAT, an NP-complete problem. At this point, it has been demonstrated that Apple's A12 has faster L1, L2, and memory performance than even Intel's chips in a very difficult, single-threaded task.

----------

Apple's chip team has demonstrated that its small 5 W chip is in fact pretty good at some very difficult benchmarks. It shouldn't be assumed that iPhones are slower anymore. They're within striking distance of desktops in single-core performance in some of the densest compute problems.
20 seconds, nuff said, I'm out. I'll leave you to stroke Apple's ego.

Another great example of a waste of time.

A true, great benchmark, 20 seconds.

Single core, dense compute problems, lmfao.
Posted on Reply
#58
dragontamer5788
SMT solvers, like Z3, solve a class of NP-complete problems with pretty dense compute characteristics. Or are you unaware of what Z3 is?

Or are you unaware what "dense compute" means? HPCG is sparse compute (memory intensive), while Linpack is dense (CPU intensive). Z3 is probably in the middle, more dense than HPCG but not as dense as Linpack (I don't know for sure. Someone else may correct me on that).

--------

When comparing CPUs, it's important to choose denser compute problems, or else you're just testing the memory interface (e.g. the STREAM benchmark is as sparse as you can get, and doesn't really test anything aside from your DDR4 clockrate). I'd assume Z3 satisfies the requirements of dense compute for the purposes of making a valid comparison between architectures. But if we go too dense, then GPUs win (which is also unrealistic. Linpack is too dense and doesn't match anyone's typical computer use. Heck, it's too dense to be practical for supercomputers).
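
A rough sketch of that sparse-versus-dense distinction, using NumPy as a stand-in (the sizes are arbitrary; this is an illustration, not STREAM or Linpack themselves):

```python
# A STREAM-style triad is bandwidth-bound (almost no math per byte moved),
# while a dense matrix multiply is compute-bound.
import time
import numpy as np

n = 20_000_000
a = np.empty(n)
b = np.random.rand(n)
c = np.random.rand(n)

t0 = time.perf_counter()
a[:] = b + 3.0 * c            # STREAM "triad": ~2 flops per 24 bytes touched
triad_s = time.perf_counter() - t0

m = np.random.rand(2000, 2000)
t0 = time.perf_counter()
_ = m @ m                     # dense GEMM: ~2 * 2000**3 flops on 32 MB operands
gemm_s = time.perf_counter() - t0

print(f"triad: {triad_s:.3f} s (memory-bound), matmul: {gemm_s:.3f} s (compute-bound)")
```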
Posted on Reply
#59
TheoneandonlyMrK
dragontamer5788SMT solvers, like Z3, solve a class of NP-complete problems with pretty dense compute characteristics. Or are you unaware of what Z3 is?

Or are you unaware what "dense compute" means? HPCG is sparse compute (memory intensive), while Linpack is dense (CPU intensive). Z3 is probably in the middle, more dense than HPCG but not as dense as Linpack (I don't know for sure. Someone else may correct me on that).

--------

When comparing CPUs, it's important to choose denser compute problems, or else you're just testing the memory interface (e.g. the STREAM benchmark is as sparse as you can get, and doesn't really test anything aside from your DDR4 clockrate). I'd assume Z3 satisfies the requirements of dense compute for the purposes of making a valid comparison between architectures. But if we go too dense, then GPUs win (which is also unrealistic. Linpack is too dense and doesn't match anyone's typical computer use. Heck, it's too dense to be practical for supercomputers).
When comparing CPUs, it's important to stick to the same CPU you're comparing, and not to then revert to a two-generation-older CPU when your argument is failing; he tested versus a 7700K.

In the notes:

"This benchmark is in the QF_BV fragment of SMT, so Z3 discharges it using bit-blasting and SAT solving.
This result holds up pretty well even if the benchmark runs in a loop 10 times—the iPhone can sustain this performance and doesn’t seem thermally limited.1 That said, the benchmark is still pretty short.
Several folks asked me if this is down to non-determinism—perhaps the solver takes different paths on the different platforms, due to use of random numbers or otherwise—but I checked fairly thoroughly using Z3’s verbose output and that doesn’t seem to be the case.
Both systems ran Z3 4.8.1, compiled by me using Clang with the same optimization settings. I also tested on the i7-7700K using Z3’s prebuilt binaries (which use GCC), but those were actually slower.
What’s going on?
How could this be possible? The i7-7700K is a desktop CPU; when running a single-threaded workload, it draws around 45 watts of power and clocks at 4.5 GHz. In contrast, the iPhone was unplugged, probably doesn’t draw 10% of that power, and runs (we believe) somewhere in the 2 GHz range. Indeed, after benchmarking I checked the iPhone’s battery usage report, which said Slack had used 4 times more energy than the Z3 app despite less time on screen.

Apple doesn’t expose enough information to understand Z3’s performance on the iPhone,

This result holds up pretty well even if the benchmark runs in a loop 10 times—the iPhone can sustain this performance and doesn’t seem thermally limited.1 That said, the benchmark is still pretty short.

He said earlier it uses only one core on the Apple chip, really leveraging what's there, eh; or the light load might sustain a boost better because of that single-core use, but this leads to my point B.

He doesn't know how it's actually running on the Apple, so he can't know if it is leveraging accelerators to hit that target.

Still, 7700K versus A12, 14 nm+ (only one plus, not mine) versus 5 nm.

That's not telling anyone much about how the A14 would compare to a CPU out today, not EOL, never mind the next-generation Ryzen and Cove cores it would face.

All in fail.

Do I know what Z3 is? Wtaf does it matter.

We are discussing CPU performance, not coder pawn.

I don't use it, and 99.9% of users also don't. I am aware of it, though, and aware of the fact that it too is irrelevant, like Geekbench.
Posted on Reply
#60
dragontamer5788
theoneandonlymrkHe doesn't know how it's actually running on the Apple, so he can't know if it is leveraging accelerators to hit that target.
Uh huh.
Do I know what Z3 is? Wtaf does it matter.
Well, given your statement above, I'm pretty sure you don't know what Z3 is. Z3 solves NP-complete optimization problems. Knapsack, Traveling Salesman, etc. etc. There's no "accelerator" chip for this kind of problem, not yet anyway. So you're welcome to take your foot out of your mouth now.

Z3 wasn't even made by Apple. It's a Microsoft Research AI project that happens to run really, really well on iPhones. (Open source too. Feel free to point out where in the GitHub code this "accelerator chip" is being used. Hint: you won't find it, it's a pure C++ project with some Python bits.)
We are discussing CPU performance, not coder pawn.
The CPU performance on the iPhone is surprisingly good in Z3. I'd like to see you explain why this is the case. There's a whole bunch of other benchmarks too that the iPhone does well on. But unlike Geekbench, something like Z3 actually has coder ethos as a premier AI project. So you're not going to be able to dismiss Z3 as easily as Geekbench.
Posted on Reply
#61
TheoneandonlyMrK
dragontamer5788Uh huh.



Well, given your statement above, I'm pretty sure you don't know what Z3 is. Z3 solves NP-complete optimization problems. Knapsack, Traveling Salesman, etc. etc. There's no "accelerator" chip for this kind of problem, not yet anyway. So you're welcome to take your foot out of your mouth now.

Z3 wasn't even made by Apple. It's a Microsoft Research AI project that happens to run really, really well on iPhones. (Open source too. Feel free to point out where in the GitHub code this "accelerator chip" is being used. Hint: you won't find it, it's a pure C++ project with some Python bits.)



The CPU performance on the iPhone is surprisingly good in Z3. I'd like to see you explain why this is the case. There's a whole bunch of other benchmarks too that the iPhone does well on. But unlike Geekbench, something like Z3 actually has coder ethos as a premier AI project. So you're not going to be able to dismiss Z3 as easily as Geekbench.
apples to oranges.

Can you explain where I said I was an expert in your coding speciality to put my foot in my mouth!?

I'm aware of f£#@£g Bigfoot, but do I know many facts on him?


So getting back to the 9900K: it's 10-20% better than the 7700K you're on about now.

And the Cove and Zen 3 cores are a good 17% better again (partly alleged).
That's at least 30% on the 7700K, and Intel especially work hard to optimise for some code types.

And you are still at it with benches timed in seconds.

If I do something on a computer that takes time to run but it's a one off and a few seconds, I wouldn't even count it as a workload, at all.

Possibly a step in a process, but not a workload.

I have a workload or two and they don't Finish in 20 seconds.

Recent changes to GPU architecture and boost algorithms put most GPU benchmarks, and some people's benchmarking, in the same light to me. Now, tests have to be sustained for a few minutes minimum or they're not good enough for me.

I'm happy to just start calling each other names if you want, but best do it by PM, the mods don't like it.
Posted on Reply
#62
Vya Domus
theoneandonlymrkHow could this be possible? The i7-7700K is a desktop CPU; when running a single-threaded workload, it draws around 45 watts of power and clocks at 4.5 GHz.
Compilers, even with the same flags, can generate differently optimized code for each ISA. Also, CPUs have many quirks: for instance, while 64-bit floating point multiplies might be faster on some architecture, divides might be painfully slower compared to others (I give this example because divides are notoriously slow). That benchmark looks to be intensive in terms of integer arithmetic; it's well known Apple has a wide integer unit, and the cache advantage has been talked about to death already. There is nothing impressive about that; you could run some vector linear algebra with 10^4 x 10^4 matrices and I am sure the Intel CPU would then be faster by quite a margin.
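
A quick toy illustration of that multiply-versus-divide quirk (NumPy, fp64; the exact gap varies by microarchitecture and array size, so treat it as a sketch, not a benchmark):

```python
# On most cores fp64 divide has far lower throughput than multiply,
# so the divide pass below typically takes noticeably longer.
import time
import numpy as np

a = np.random.rand(10_000_000) + 1.0
b = np.random.rand(10_000_000) + 1.0

t0 = time.perf_counter()
_ = a * b
mul_s = time.perf_counter() - t0

t0 = time.perf_counter()
_ = a / b
div_s = time.perf_counter() - t0

print(f"multiply: {mul_s:.3f} s, divide: {div_s:.3f} s")
```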

It's a waste of time to look at these independent problems, because that's not what is found in the real world. We don't solve SMT, we don't run linear algebra all the time; most user software is mixed in terms of workloads and doesn't follow patterns regular enough for caches to be very effective.
Posted on Reply
#63
TheoneandonlyMrK
Vya DomusCompilers, even with the same flags, can generate differently optimized code for each ISA. Also, CPUs have many quirks: for instance, while 64-bit floating point multiplies might be faster on some architecture, divides might be painfully slower compared to others (I give this example because divides are notoriously slow). That benchmark looks to be intensive in terms of integer arithmetic; it's well known Apple has a wide integer unit, and the cache advantage has been talked about to death already. There is nothing impressive about that; you could run some vector linear algebra with 10^4 x 10^4 matrices and I am sure the Intel CPU would then be faster by quite a margin.

It's a waste of time to look at these independent problems, because that's not what is found in the real world. We don't solve SMT, we don't run linear algebra all the time; most user software is mixed in terms of workloads and doesn't follow patterns regular enough for caches to be very effective.
That was part of the article he linked, it was in quotations. I agree with you though.
And that's similar to my point: one 20-second workload is no workload at all.

And the comparisons are all over the show.
I would argue the performance of a 7700K is hardcore irrelevant; few are buying quad cores to game on now, even Intel stopped pushing quads. A12 (1 core) = 7700K = 9900K = 11900 (shrug), apparently. Shrug.
Posted on Reply
#64
Searing
Darmok N JaladI don’t really see how it’s that far of a stretch at this point. This A14X is Apple’s latest and greatest on a 5nm node, getting to scale up in power, versus a tired, Skylake-based chip on a very old 14nm node and crammed down to its thermal minimum. Now if the A14X matched the 9900K, that would be much harder to believe. I don’t think Apple would make this move unless they had something solid lined up. I guess it won’t be too much longer before we find out, and benchmarks beyond Geekbench will be available on ARMacOS.
Yeah, the anti-ARM people are basically like "it isn't fast, because I say it isn't"... I'm done arguing, wait for MacOS ARM and they'll see. They'll probably find a way to ignore the 100 other ways it is fast and focus on their hobby horse of dissing a benchmark or two. Even though they are perfectly able to get an iPad Pro and see it do all sorts of tasks at high speed. My PC hasn't gotten faster in almost 5 years now... basically I don't use more than 8 threads for the most part, so the 6700K is about the same as my 10900 computer (you can watch a lot of ~30 percent utilization with a 10900). I'm happy to see Apple actually speed things up.
Posted on Reply
#65
TheoneandonlyMrK
SearingYeah, the anti-ARM people are basically like "it isn't fast, because I say it isn't"... I'm done arguing, wait for MacOS ARM and they'll see. They'll probably find a way to ignore the 100 other ways it is fast and focus on their hobby horse of dissing a benchmark or two. Even though they are perfectly able to get an iPad Pro and see it do all sorts of tasks at high speed. My PC hasn't gotten faster in almost 5 years now... basically I don't use more than 8 threads for the most part, so the 6700K is about the same as my 10900 computer (you can watch a lot of ~30 percent utilization with a 10900). I'm happy to see Apple actually speed things up.

That's daft, point to someone other than you that's said ARM is not fast.


That's exactly the point, some of us can use an iPad Pro too, many have, and put it back down, hence the opinions.

You pointed to two short burst benchmarks as your hidden, unseen-by-plebs truth?! One of which was so obscure you think you got one over on me; yeah, 98% of nerds haven't run that bench, ffs. Yeah guy, you told me. You think I couldn't show a few benches with PCs beating iPhone chips?

I wouldn't waste my time. I have said before I have no doubt these have their place and will sell well, but they are not taking the performance crown, and real work will stay on x86 or PowerPC.
Posted on Reply
#66
dragontamer5788
Vya DomusCompilers, even with the same flags, can generate differently optimized code for each ISA. Also, CPUs have many quirks: for instance, while 64-bit floating point multiplies might be faster on some architecture, divides might be painfully slower compared to others (I give this example because divides are notoriously slow). That benchmark looks to be intensive in terms of integer arithmetic; it's well known Apple has a wide integer unit, and the cache advantage has been talked about to death already. There is nothing impressive about that; you could run some vector linear algebra with 10^4 x 10^4 matrices and I am sure the Intel CPU would then be faster by quite a margin.
I agree with everything you said above.
It's a waste of time to look at these independent problems, because that's not what is found in the real world. We don't solve SMT, we don't run linear algebra all the time; most user software is mixed in terms of workloads and doesn't follow patterns regular enough for caches to be very effective.
I disagree with your conclusion however.

A programmer working on FEA (e.g. simulated car crashes), weather modeling, or neural networks will constantly run large matrix-multiplication problems. Over, and over, and over again, for days or months. In these cases, GPUs and wide SIMD (like 512-bit AVX on Intel or A64FX) will be a huge advantage. If GPUs are a major player, you still need a CPU with high I/O (i.e. EPYC or POWER9 / OpenCAPI) to service the GPUs fast enough.

A programmer working on CPU design will constantly run verification / RTL proofs, which are coded very similarly to Z3 or other automated solvers. (And unlike matrix multiplication, Z3 and other automated-logic code is highly divergent and irregular. It's very difficult to write multithreaded code and load-balance the work between multiple cores. There's a lot of effort in this area, but from my understanding, CPUs are still > GPUs in this field.) Strangely enough, the A12 is one of the best chips here, despite being a tiny 5 W processor.

A programmer working on web servers will run RAM-constrained benchmarks, like Redis or PostgreSQL. (And thus POWER9 / POWER10 will probably be the best chip: big L3 cache and huge RAM bandwidth.)

--------

We have many computers to choose from. We should pick the computer that best matches our personal needs. Furthermore, looking at specific problems (like Z3 in this case) gives us an idea of why the Apple A12 performs the way it does. Clearly the large 128 kB L1 cache plays to the A12's advantage in Z3.
Posted on Reply
#67
TheoneandonlyMrK
dragontamer5788I agree with everything you said above.



I disagree with your conclusion however.

A programmer working on FEA (e.g. simulated car crashes), weather modeling, or neural networks will constantly run large matrix-multiplication problems. Over, and over, and over again, for days or months. In these cases, GPUs and wide SIMD (like 512-bit AVX on Intel or A64FX) will be a huge advantage. If GPUs are a major player, you still need a CPU with high I/O (i.e. EPYC or POWER9 / OpenCAPI) to service the GPUs fast enough.

A programmer working on CPU design will constantly run verification / RTL proofs, which are coded very similarly to Z3 or other automated solvers. (And unlike matrix multiplication, Z3 and other automated-logic code is highly divergent and irregular. It's very difficult to write multithreaded code and load-balance the work between multiple cores. There's a lot of effort in this area, but from my understanding, CPUs are still > GPUs in this field.) Strangely enough, the A12 is one of the best chips here, despite being a tiny 5 W processor.

A programmer working on web servers will run RAM-constrained benchmarks, like Redis or PostgreSQL. (And thus POWER9 / POWER10 will probably be the best chip: big L3 cache and huge RAM bandwidth.)

--------

We have many computers to choose from. We should pick the computer that best matches our personal needs. Furthermore, looking at specific problems (like Z3 in this case) gives us an idea of why the Apple A12 performs the way it does. Clearly the large 128 kB L1 cache plays to the A12's advantage in Z3.
So of all those examples, the mighty A12 can almost keep up with a two-year-old Intel quad in one, while any of the other brands of CPU you mentioned do them all pretty well (to say the least in some cases), and because of this 20 seconds of greatness this proves Apple are nearly there beating Intel...

I don't see it.

Damn the irony

"We have many computers to choose from. We should pick the computer that best matches our personal needs. "

Phillips head = Phillips screwy.

How many devs are on this? What proportion of computing device users do they make up, 0.00021%? Is it mostly just a niche?
Posted on Reply
#68
dragontamer5788
theoneandonlymrkHow many devs are on this? What proportion of computing device users do they make up, 0.00021%? Is it mostly just a niche?
I mean, if we're talking about applicability to the largest audience, Geekbench is testing HTML5 DOM traversals and Javascript code. (Stuff that SPECint, Dhrystone, and other benchmarks fail to test for.)

Wanna go back to discussing Geekbench Javascript / Web tests? That's surely the highest proportion of users. In fact, everyone browsing this forum is probably performing AES-encryption/decryption (for HTTPS), and HTML5 DOM rendering. Please explain to me how such an AES-Decryption + HTML5 DOM test is unreasonable or inaccurate.
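
As a toy version of the AES half of that claim, here is a sketch assuming the third-party cryptography package (the payload is random bytes, not a real TLS record):

```python
# Time AES-CTR decryption of an HTTPS-sized payload, the kind of work
# every browser does constantly. Requires: pip install cryptography
import os
import time
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, nonce = os.urandom(32), os.urandom(16)
payload = os.urandom(1_000_000)

enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
blob = enc.update(payload) + enc.finalize()

t0 = time.perf_counter()
dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
plain = dec.update(blob) + dec.finalize()
print(f"decrypted {len(plain):,} bytes in {(time.perf_counter() - t0) * 1000:.2f} ms")

assert plain == payload   # round-trip sanity check
```
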
the mighty A12 can almost keep up with a two-year-old Intel quad in one,
Your hyperbole is misleading. The A12 was 11% faster than the Intel in the said Z3 test. I'm discussing a situation (Z3) where the Apple chip, at 5 W in the iPhone form factor, is outright beating a full-sized 91 W desktop processor.

Yes, it's a comparison between apples and peanuts (and ironically, Apple is the much smaller peanut in this analogy). But Apple is soon to launch a laptop-class A14 chip. Some of us are trying to read the tea leaves for what that means. The A14 is almost assuredly built on top of the same platform as the A12. Learning what the future A14 laptop will be good or bad at will be an important question as A14-based laptops hit the market.
Posted on Reply
#69
Vya Domus
dragontamer5788We have many computers to choose from. We should pick the computer that best matches our personal needs.
Some computers are better than others at specific tasks, but some provide very good general performance across the board. I trust an Intel or AMD processor to be good enough at everything; I don't, however, place the same trust in an Apple mobile chip, because I know it won't be. This is why I said it's a waste of time to look at these very specific scenarios.
Posted on Reply
#70
TheoneandonlyMrK
dragontamer5788I mean, if we're talking about applicability to the largest audience, Geekbench is testing HTML5 DOM traversals and Javascript code. (Stuff that SPECint, Dhrystone, and other benchmarks fail to test for.)

Wanna go back to discussing Geekbench Javascript / Web tests? That's surely the highest proportion of users. In fact, everyone browsing this forum is probably performing AES-encryption/decryption (for HTTPS), and HTML5 DOM rendering. Please explain to me how such an AES-Decryption + HTML5 DOM test is unreasonable or inaccurate.



Your hyperbole is misleading. The A12 was 11% faster than the Intel in the said Z3 test. I'm discussing a situation (Z3) where the Apple chip, at 5 W in the iPhone form factor, is outright beating a full-sized 91 W desktop processor.

Yes, it's a comparison between apples and peanuts (and ironically, Apple is the much smaller peanut in this analogy). But Apple is soon to launch a laptop-class A14 chip. Some of us are trying to read the tea leaves for what that means. The A14 is almost assuredly built on top of the same platform as the A12. Learning what the future A14 laptop will be good or bad at will be an important question as A14-based laptops hit the market.
Your hyperbole is pure bullshit, 1 core being used on even the 7700K isn't 75 watts.
People use such to browse the web, yes, we agree there, a light use case most do on their phones.

You have failed to address the point that there are three newer generations of Intel chip, with an architectural change on the way.

And none of you have explained how they can stay in the latest low-power mode yet somehow mythically clock as high as a high-power device.

All while arguing against your test being lightweight and irrelevant, which you just agreed its use case typically is.

Look at what silicon they will make it on and the price they want to hit, and you get a CX8 8-core 4/4 hybrid.

Great.
Posted on Reply
#71
dragontamer5788
theoneandonlymrkYour hyperbole is pure bullshit
Couldn't have said it better myself.

My facts are facts. Your "discussion" is bullshit hyperbole written as a spewing pile of one-sentence-long paragraphs.

--------

Since you're clearly unable to take down my side of the discussion, I'll "attack myself" on behalf of you.

SIMD units are incredibly important to modern consumer workloads. From Photoshop, to Video Editing, to audio encoding, multimedia is very commonly consumed AND produced even by the most casual of users. With only 3 FP/Vector pipelines of 128-bit width, the Apple A12 (and future chips) will simply be handicapped in this entire class of important benchmarks. Even worse: these 128-bit SIMD units are hampered by longer latencies (2-clocks) compared to Intel's 512-bit wide x 1-clock latency (or AMD Zen2's 256-bit wide by 1-clock latency).
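
A back-of-the-envelope check on those widths, assuming fp32 lanes and one issue per pipe per clock (a simplification that ignores FMA and clock speed): 3 x 128/32 = 12 lanes per clock across the A12's FP pipes, versus 512/32 = 16 lanes from a single AVX-512 pipe, and some Intel cores ship more than one such pipe.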

Future users expecting strong multimedia performance, be it Photoshop filters, messing with electronic drums or other audio processing, or simple video editing, will find these chips simply unable to compete against current-generation x86 systems, let alone next-gen Zen 3 or Ice Lake.
Posted on Reply
#72
TheoneandonlyMrK
dragontamer5788Couldn't have said it better myself.

My facts are facts. Your "discussion" is bullshit hyperbole written as a spewing pile of one-sentence-long paragraphs.

--------

Since you're clearly unable to take down my side of the discussion, I'll "attack myself" on behalf of you.

SIMD units are incredibly important to modern consumer workloads. From Photoshop, to Video Editing, to audio encoding, multimedia is very commonly consumed AND produced even by the most casual of users. With only 3 FP/Vector pipelines of 128-bit width, the Apple A12 (and future chips) will simply be handicapped in this entire class of important benchmarks. Even worse: these 128-bit SIMD units are hampered by longer latencies (2-clocks) compared to Intel's 512-bit wide x 1-clock latency (or AMD Zen2's 256-bit wide by 1-clock latency).

Future users expecting strong multimedia performance, be it Photoshop filters, messing with electronic drums or other audio processing, or simple video editing, will find these chips simply unable to compete against current-generation x86 systems, let alone next-gen Zen 3 or Ice Lake.
I rebutted many of your points, you're blind to it: 7700K/10700K performance increases for one, next generation for two.

The fact that the main use case you posited for your benchmark's viability was web browser action!

Getting technical, Intel have Foveros and FPGA tech; as soon as they have a desktop CPU with an FPGA and HBM, it could be game over so far as benches go, against anything on anything, enabled by oneAPI.
PowerPC is simply in another league.
AMD will iterate core count way beyond Apple's horizon and incorporate better fabrics and IP.

And despite it all most will still just surf on their phones.

And fewer, not more, people will do real work on iOS.

I'm getting off this roundabout, our opinions differ, let's see where five years gets us.

You don't answer any questions, just epeen your leet dev skills like we should give a shit.
We shouldn't, it's irrelevant to me what they/you like to use, it only matters to me and 99% of the public what we want to do, and our opinions differ on the scale of variability of workload here it seems.

Bye.
Posted on Reply
#73
Darmok N Jalad
dragontamer5788Couldn't have said it better myself.

My facts are facts. Your "discussion" is bullshit hyperbole written as a spewing pile of one-sentence-long paragraphs.

--------

Since you're clearly unable to take down my side of the discussion, I'll "attack myself" on behalf of you.

SIMD units are incredibly important to modern consumer workloads. From Photoshop, to Video Editing, to audio encoding, multimedia is very commonly consumed AND produced even by the most casual of users. With only 3 FP/Vector pipelines of 128-bit width, the Apple A12 (and future chips) will simply be handicapped in this entire class of important benchmarks. Even worse: these 128-bit SIMD units are hampered by longer latencies (2-clocks) compared to Intel's 512-bit wide x 1-clock latency (or AMD Zen2's 256-bit wide by 1-clock latency).

Future users expecting strong multimedia performance, be it Photoshop filters, messing with electronic drums or other audio processing, or simple video editing, will find these chips simply unable to compete against current-generation x86 systems, let alone next-gen Zen 3 or Ice Lake.
Ironically, I’m a hobbyist photographer who does all his edits on an iPad Pro. It’s just as fast as anything I’ve used on a desktop—sliders apply pretty much in real time. That’s even with an occasional 80MP RAW. I’ve also made and exported movies in iMovie on the iPad, and it was seamless and fast on export. Would a pro want to do this? Probably not, but that might be more how iOS isn’t as ideal as a desktop OS for repetitive tasks, so I’m curious to see how Apple’s SOCs will handle even larger RAW files, imports of hundreds of images and batch updates. I guess my point is that the “feel” today isn’t so far off, but I don’t know how well that will translate from iOS to MacOS, where true multitasking is an everyday expectation versus the limited experience that it is on iOS today.
Posted on Reply
#74
squallheart
SearingA14X will run Shadow of the Tomb Raider and any other PS4/Xbox game. Not surprising since it will match a GTX 1060 easily enough, but only needing 10-15 W. The Switch is circa 2014 hardware (Galaxy Note 4 CPU plus half a GTX 750), so imagine a Switch that is 6 years more advanced and there you have it.
I am curious how you came to that conclusion.
I actually looked up GFXBench, which is cross-platform and fairly well regarded, with no known bias to any platform.

gfxbench.com/device.jsp?benchmark=gfx50&os=iOS&api=metal&cpu-arch=ARM&hwtype=iGPU&hwname=Apple%20A12Z%20GPU&did=83490021&D=Apple%20iPad%20Pro%20(12.9-inch)%20(4th%20generation)
gfxbench.com/device.jsp?benchmark=gfx50&os=Windows&api=dx&cpu-arch=x86&hwtype=dGPU&hwname=NVIDIA%20GeForce%20GTX%201060%206GB&did=36085769&D=NVIDIA%20GeForce%20GTX%201060%206GB

I am comparing the A12Z (from the 4th-gen iPad Pro, faster than the A12X) to the 1060.

For Aztec high offscreen, the most demanding test, the A12Z recorded 133.8 vs 291.1 on the GTX 1060.

So my question to you is, are you expecting the A14X to more than double in graphics performance?
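
For reference, 291.1 / 133.8 ≈ 2.18, so matching a GTX 1060 in that test would take roughly a 2.2x jump in a single generation.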
Posted on Reply
#75
dragontamer5788
theoneandonlymrkBye.
Is it really bye? Or are you one of those dudes who don't really mean what they say?
theoneandonlymrkGetting technical Intel have foveros and FPGA tech, as soon as they have a desktop CPU with an FPGA and HBM, it could be game over so far as benches go against anything on anything, enabled by one API.
Power pc simply are in another league.
AMD will iterate core count way beyond apples horizon and incorporate better fabrics and Ip..
Have you ever used an FPGA? Have you ever used Power? Power has anemic SIMD units. The benefit of Power9 / Power10 is memory bandwidth and L3.

FPGAs are damn hard to program for. Not only is Verilog / VHDL rarely taught, synthesis takes a lot longer than compiles. OpenCL / CUDA is far simpler, honest. We'll probably see more from Intel's Xe than from FPGAs.

Not to mention, high-end FPGAs are like $4000+ if we're talking things competitive vs GPUs or CPUs.
Posted on Reply