Tuesday, April 3rd 2018

Apple to End the x86 Mac Era in 2020

One of the biggest tech stories of the 2000s was Apple's transition from the PowerPC architecture to Intel x86, which brought the Mac closer to being the PC it so loathed. The transition wasn't smooth: besides the operating system, practically every third-party software developer (e.g., Adobe) had to rewrite their software for the new architecture, with new APIs and new runtime environments. Apple could be bringing about a similar change before the turn of the decade.

Apple already builds its own application processors for iOS devices, and some of its newer chips, such as the A11 Bionic and A10 Fusion, have already reached the performance levels of entry-level x86 desktop processors. It's only a matter of time before Apple can build its own SoCs for Macs (not just iMac desktops, but also Mac Pro workstations, MacBook, MacBook Air, and MacBook Pro). That timeline is expected to be around 2020. Since these chips are based on the ARM architecture, they will mandate a major transformation of the entire software ecosystem Apple has built over the past decade and a half. Intel shares dropped by as much as 9.2 percent on the first reports of this move.
Source: Bloomberg

48 Comments on Apple to End the x86 Mac Era in 2020

#26
R-T-B
FordGT90Conceptx86 is awesome because it's CISC.
You missed the part where x86 translates instructions to RISC in the micro-ops.

Instruction sets are overrated and mean little... All the big names are more than mature enough now. The microarchitecture behind them matters more.
Posted on Reply
#27
timta2
AssimilatorI have to agree. While ARM remains, quite frankly, s**t, if there's any company with the resources to make it competitive with x86, it's Apple. And they wouldn't be making this announcement unless they've already made significant progress in that regard.

... alternatively, they might have realised that it's not possible and are just trying to squeeze Intel into giving them a better deal on the next few generations of chips. We'll see.



Apple's business model is selling overpriced tat to morons, and that extends to the software for that platform. If you write software for Mac, you can charge a lot more for it than you could if you wrote it for Windows. So developers will pay out of their own pockets to port their code from x86 to ARM for Apple, because they will make more money in the long run.

And for Apple, the beauty is that they don't have to care about any of this. When the people outside your walled garden are shouting to get in because the garden is made of solid gold, you can afford to sit back and let survival of the fittest (or deepest pockets) win.
At least those "morons" know how to spell and proofread.
Posted on Reply
#28
trparky
evernessinceEven if we do assume devs move over, it will not happen instantly. How long are creatives willing to wait while devs migrate their code? Given the complexity of modern Adobe software, I don't expect it would be quick. Sure, you can emulate x86, but the performance is terrible.
Correct me if I'm wrong (because I probably am) but unless Adobe used low-level Assembly code to do some of the hard core image and video manipulation magic that their tools do, wouldn't it be just as simple as recompiling the C/C++ code against a new ISA (Instruction Set Architecture)? I mean, that is the reason why we have compilers right? So that we can write in higher level languages and be able to port it over to other architectures, right?
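For what it's worth, a minimal sketch of the kind of code that does port by recompilation alone (the file name and compiler invocations below are illustrative, not from any real project):

#include <cstddef>
#include <cstdint>

// Blend two 8-bit grayscale buffers. Nothing here assumes x86; the compiler
// emits whatever instructions the target ISA provides, and may auto-vectorize.
void blend(const std::uint8_t* a, const std::uint8_t* b,
           std::uint8_t* out, std::size_t n, float alpha)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>(alpha * a[i] + (1.0f - alpha) * b[i]);
}

// Hypothetical builds of the same file with Apple's clang:
//   clang++ -O2 -arch x86_64 -c blend.cpp    (today's Intel Macs)
//   clang++ -O2 -arch arm64  -c blend.cpp    (an ARM Mac)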
Posted on Reply
#29
Vya Domus
R-T-BYou missed the part where x86 translates instructions to RISC in the micro-ops.

Instruction sets are overrated and mean little... All the big names are more than mature enough now. The microarchitecture behind them matters more.
Doesn't really matter, it's still CISC at a higher level of abstraction and those micro-ops still constitute more robust instructions. For example, ARM needs separate instructions for memory operations, whereas with x86 pretty much all instructions allow for complex addressing modes encoded within the instruction itself. Memory operations are slow, which means that you would have to find ways to keep an ARM core busy more often than an x86 one, and inevitably you won't achieve the same efficiency.

ARM designs will always have inherent disadvantages which just can't be mitigated due to their RISC nature; instruction sets do matter, quite a lot.

That being said, either Apple will be shooting themselves in the foot by attempting to become independent in a way which simply isn't fit for their current product stack, or they will just change said products, aka turning them into glorified iOS devices.
trparkyI mean, that is the reason why we have compilers right? So that we can write in higher level languages and be able to port it over to other architectures, right?
And with a potential overhead, which can be significant in some cases. He is right, x86 software on ARM will be atrocious.
Posted on Reply
#30
trparky
Vya Domusturning them into glorified iOS devices as laptops/desktops
That's what I see eventually happening, not just to the Mac but to all general-purpose computing. If you ask most industry pundits and analysts, who are far smarter than I am, they will tell you that the desktop as we know it today will be dead within the next ten years. The majority of us will be using mobile devices with walled gardens that can become a desktop using cradle-like accessories.
Posted on Reply
#31
R-T-B
Vya DomusDoesn't really matter, it's still CISC at a higher level of abstraction and those micro-ops still constitute more robust instructions. For example, ARM needs separate instructions for memory operations, whereas with x86 pretty much all instructions allow for complex addressing modes encoded within the instruction itself. Memory operations are slow, which means that you would have to find ways to keep an ARM core busy more often than an x86 one, and inevitably you won't achieve the same efficiency.

ARM designs will always have inherent disadvantages which just can't be mitigated due to their RISC nature; instruction sets do matter, quite a lot.
I think I know a bit more about this than you are giving me credit for (I've actually written assembly-level code for several platforms, all the way back to my NES, which in fact lacked a multiplication instruction).

What you just said made no sense. If it's being translated to RISC, how in the world can a CISC instruction run as anything but RISC at final runtime? If it was going to be busy during a memory access, it will be busy. As in, it's all the same at the end game; if anything, it's just easier on the compiler to "think" in CISC.

Illustration: I wrote a multiplication code macro for my NES using a very light derivative of homebrew BASIC someone made for it way back when. It multiplied using the old-school "additive method," adding the first number over and over, the set number of times in the second. It could be called in one line, but it still tied up the CPU for a godawful length of time. Conceptually, this was a "CISC" instruction of sorts, but the backend RISC was holding it up.
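Roughly what that macro boiled down to, sketched here in C++ rather than the original homebrew BASIC (the function name is made up):

// Multiply by repeated addition, the way a CPU without a MUL instruction
// (like the NES's 6502-derived core) has to do it in software.
unsigned additive_mul(unsigned a, unsigned b)
{
    unsigned product = 0;
    for (unsigned i = 0; i < b; ++i)  // one full add per iteration,
        product += a;                 // so the cost grows linearly with b
    return product;
}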
Posted on Reply
#32
FordGT90Concept
"I go fast!1!11!1!"
*cough* en.wikipedia.org/wiki/X86_instruction_listings

RISC is silicon efficient; CISC is process efficient.

Case in point, ARM has no instructions dedicated to virtual machines. I'm pretty sure that Windows 10 natively runs in a virtual machine on systems that support it for security reasons (you can't disable it).
Posted on Reply
#33
Vya Domus
R-T-BWhat you just said made no sense.
All I said is that arithmetic instructions which also perform memory accesses make for better efficiency. You are insinuating, as far as I can tell, that they do not and that nowadays x86 is pretty much interchangeable with ARM because they both use RISC-like micro-ops.
R-T-BIf it was going to be busy during a memory access, it will be busy.
Yes, but you need to fetch more instructions in order to achieve the same thing.
R-T-BConceptually, this was a "CISC" instruction of sorts, but the backend RISC was holding it up.
Of course that would happen, you are emulating CISC behavior on hardware that was not made for it. x86 cores, despite being RISC-like under the hood now, are still designed with the complex instructions in mind and with accompanying microcode optimizations, whereas that NES was clearly not.
Posted on Reply
#34
FordGT90Concept
"I go fast!1!11!1!"
Nothing RISC about x86. Compare the above link with ARM:
infocenter.arm.com/help/topic/com.arm.doc.dui0068b/DUI0068.pdf

When processing SIMD, some x86 instructions hijack the FPUs and ALUs. Sure, ALUs and FPUs only understand a reduced set of instructions, but it's the instruction decoder at the top of the processor that determines RISC/CISC, not the components inside.
Posted on Reply
#35
Bansaku
Bwahahahaha.... Nothing but conjecture! Every few months this story pops up and still NOTHING official from Apple nor its developers. It makes zero sense for Apple to ditch the x86 arch, simply because the whole reason Apple switched to Intel in the first place was for 100% compatibility with Windows software (all while making it easier on developers). I will bet the house that come 2020 yet another "Apple to ditch Intel" story pops up (TPU will jump on the bandwagon for clicks) stating that by 2024 Apple might use their own CPUs in limited capacity for the notebook line.
Posted on Reply
#36
R-T-B
You know, of all the instruction sets I played with, x86 ironically is not one of them. My assembly knowledge may be slightly out of play here, so I'll admit I could be out of my element and completely wrong. I will defer to those more in the know, as my knowledge is only second hand and conceptual.
Posted on Reply
#37
CheapMeat
Never owned anything Apple and probably never will, but still, this is exciting, if only for the fact that it'll shake up the industry a bit.
Posted on Reply
#38
efikkan
trparkyCorrect me if I'm wrong (because I probably am) but unless Adobe used low-level Assembly code to do some of the hard core image and video manipulation magic that their tools do, wouldn't it be just as simple as recompiling the C/C++ code against a new ISA (Instruction Set Architecture)? I mean, that is the reason why we have compilers right? So that we can write in higher level languages and be able to port it over to other architectures, right?
In theory, normal programs can be recompiled like you say. Inline assembly is not that common, but nearly all high-performance programs rely on intrinsics for AVX, SSE, FMA, etc. These intrinsics are low-level macros which map directly to assembly instructions. If these features are implemented as optional features, the developer can of course just disable them and recompile. But in cases such as programs from Adobe, the performance will be terrible. Rewriting the program to use different intrinsics for a new architecture requires some effort, but is not extremely hard.
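As a rough sketch of what such a rewrite looks like (simplified: unaligned loads, n assumed to be a multiple of the vector width, and the vec_add name is made up), here is the same loop written once against x86 SSE intrinsics and once against ARM NEON intrinsics:

#if defined(__SSE__)
#include <xmmintrin.h>                   // x86 SSE intrinsics
void vec_add(const float* a, const float* b, float* c, int n)
{
    for (int i = 0; i < n; i += 4) {     // 4 floats per 128-bit register
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
    }
}
#elif defined(__ARM_NEON)
#include <arm_neon.h>                    // ARM NEON intrinsics
void vec_add(const float* a, const float* b, float* c, int n)
{
    for (int i = 0; i < n; i += 4) {     // also 4 floats per 128-bit register
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(c + i, vaddq_f32(va, vb));
    }
}
#endif

The logic is identical, but every intrinsic, type, and header changes, which is why this is effort rather than a plain recompile.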
FordGT90ConceptNothing RISC about x86. Compare the above link with ARM:

infocenter.arm.com/help/topic/com.arm.doc.dui0068b/DUI0068.pdf
You are still confusing ISA (Instruction Set Architecture) with CPU architectures. x86 in its pure form is CISC, but all current implementations of x86 are RISC implementations which translate x86 instructions into RISC-style micro-operations.
Posted on Reply
#39
FordGT90Concept
"I go fast!1!11!1!"
efikkanYou are still confusing ISA (Instruction Set Architecture) with CPU architectures. x86 in its pure form is CISC, but all current implementations of x86 are RISC implementations which translate x86 instructions into RISC-style micro-operations.
Fundamentally, what separates CISC from RISC is that RISC must load the data into a register, execute an instruction on the registers, and store the result back to memory (load-store). A CISC instruction can address a register or a memory address, and the processor subsystems will pull the necessary data from memory, execute the instruction, and then push the result back to memory. Said differently, CISC memory operations are implicit, whereas in RISC they are explicit.
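A tiny sketch of that difference; the disassembly in the comments is approximate and not from any particular compiler:

// One C++ function, two very different lowerings.
int add_from_memory(int x, const int* p)
{
    return x + *p;
}

// x86-64 (CISC): the load can be folded into the add as a memory operand.
//     add  edi, DWORD PTR [rsi]
//     mov  eax, edi
//
// AArch64 (RISC, load-store): the load must be its own instruction.
//     ldr  w1, [x1]
//     add  w0, w0, w1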
Posted on Reply
#40
trparky
efikkanRewriting the program to use different intrinsics for a new architecture requires some effort, but is not extremely hard.
Wouldn't that be up to the compiler, though? The compiler does the heavy lifting when it comes to optimizing the resulting machine code; humans write the C/C++ code and the compiler does the hard work. One would think that, with the giant leaps ARM has taken over the last couple of years, extensions like AVX, SSE, FMA, etc. are in the pipeline for ARM; it's just a matter of time until they reach the public. It would then be up to the compilers to take advantage of those new ARM extensions.
Posted on Reply
#41
efikkan
trparky
efikkanRewriting the program to use different intrinsics for a new architecture requires some effort, but is not extremely hard.
Wouldn't that be up to the compiler, though? The compiler does the heavy lifting when it comes to optimizing the resulting machine code; humans write the C/C++ code and the compiler does the hard work. One would think that, with the giant leaps ARM has taken over the last couple of years, extensions like AVX, SSE, FMA, etc. are in the pipeline for ARM; it's just a matter of time until they reach the public. It would then be up to the compilers to take advantage of those new ARM extensions.
I think you are misunderstanding how these intrinsics work. Compilers can introduce optimizations themselves at compile time, and this will work fine, but that's not what I'm talking about here.

Most intrinsics (that I'm familiar with, at least) are closely or directly mapped to assembly instructions. If the specific ARM implementation has a comparable extension with matching parameters, then surely the compiler could convert them (in theory), but extensions like AVX are closely linked to how they are implemented on x86 designs; an automatic translation to another vector extension could result in sub-optimal use or even a performance loss vs. normal instructions.

It's important to understand that intrinsics are usually only used in the most performance-critical parts of a program's code. When used properly, the alignment of data in memory is meticulously designed in order to scale well with those specific intrinsics. Switching to another set of intrinsics may require realignment of data structures and code logic to get maximum performance. Vector extensions are especially sensitive, and using them well or not can easily make a >10× difference in performance.
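A sketch of what that realignment means in practice (type names made up): the same three fields laid out object-by-object versus field-by-field.

// Array-of-structures: convenient for object-style code, but A, B and C are
// interleaved in memory, so one vector load can't grab eight A values at once.
struct SampleAoS { float a, b, c; };

// Structure-of-arrays: each field is contiguous, so wide loads/stores and
// vector intrinsics can process several elements per instruction.
struct SamplesSoA {
    float* a;   // a[0..n)
    float* b;   // b[0..n)
    float* c;   // c[0..n)
};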
Posted on Reply
#42
trparky
efikkanI think you are misunderstanding how these intrinsics work.
And I probably don't understand it at all; I've not written any C/C++ code; my experience is in much higher-level languages like C# and VB.NET. I always figured that when you write C/C++ code to do something, the compiler ultimately decides how that job is done when it comes to the machine code that's generated. Multiple optimization passes will of course result in much more optimized machine code, but of course that takes more time to compile.
Posted on Reply
#43
Vya Domus
trparkyWouldn't that be up to the compiler, though? The compiler does the heavy lifting when it comes to optimizing the resulting machine code; humans write the C/C++ code and the compiler does the hard work. One would think that, with the giant leaps ARM has taken over the last couple of years, extensions like AVX, SSE, FMA, etc. are in the pipeline for ARM; it's just a matter of time until they reach the public. It would then be up to the compilers to take advantage of those new ARM extensions.
Most compilers do a pretty bad job at vectorization, due to how convoluted the code that translates to data-level parallelism can be. This was a rather unfortunate comparison; an x86 compiler won't vectorize most workloads, and neither will one that also translates x86 into ARM. And intrinsics are sparingly used anyway.
Posted on Reply
#44
trparky
Vya DomusMost compilers do a pretty bad job at vectorization, due to how convoluted the code that translates to data-level parallelism can be.
SISO, which translates to "shit in, shit out". Yes, badly written C code is going to result in a badly compiled program (duh!). No amount of optimization at the compiler level is going to turn lead (badly written code) into gold. It's up to the human writing the C code to write better code; it has always come down to this very simple thing.
Posted on Reply
#45
efikkan
trparkyAnd I probably don't understand it at all; I've not written any C/C++ code; my experience is in much higher-level languages like C# and VB.NET. I always figured that when you write C/C++ code to do something, the compiler ultimately decides how that job is done when it comes to the machine code that's generated. Multiple optimization passes will of course result in much more optimized machine code, but of course that takes more time to compile.
It's important to understand that the realm of automatic optimizations compilers can actually do is very limited. They do of course help a bit, and in some cases give a nice boost, but they will never compare to writing proper low-level code.

Compilers are, for instance, very good at optimizing small things that come down to syntax, like unrolling small loops, rearranging some accesses, etc. But they can never deal with the "big stuff", like scaling problems resulting from your design choices.
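For example (a sketch; whether the first loop actually gets vectorized depends on the compiler, its version, and flags such as relaxed floating-point math): the first case is the kind of small, syntactic work an optimizer handles well, while the second is a design problem no compiler can fix.

// Easy for the compiler: a dense loop over contiguous data. At -O2/-O3 most
// compilers will unroll this, and many will auto-vectorize it.
float sum_array(const float* v, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += v[i];
    return s;
}

// Not fixable by the compiler: pointer-chasing through a linked list. Every
// load depends on the previous one, so there is nothing to vectorize and the
// CPU mostly waits on memory. Only a different data layout helps.
struct Node { float value; Node* next; };
float sum_list(const Node* head)
{
    float s = 0.0f;
    for (const Node* node = head; node != nullptr; node = node->next)
        s += node->value;
    return s;
}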

If you want to hear some good explanations about how efficient code works, take a look at these:
CppCon 2014: Mike Acton "Data-Oriented Design and C++"
code::dive conference 2014 - Scott Meyers: CPU Caches and Why You Care
Even if you don't grasp all the details, it should still be an eye-opener as to how much the structure of the code matters.
trparkyIt's up to the human writing the C code to write better code; it has always come down to this very simple thing.
Yes, it always comes down to the skills of the coder and the understanding of the problem to be solved.

As Vya Domus mentions, vector instructions exploit data-level parallelism. No compiler can ever optimize your code to create this parallelism; you have to make tightly packed data structures which match the way you are going to process them.

Let's say you have 100 calculations of the form A + B = C. Usually each will be compiled to two instructions fetching A and B into registers, one instruction to do the addition, and then one instruction to copy the sum back to memory. If you want to exploit AVX, you'll first have to lay out your data structures,
not like this: A0 B0 C0 A1 B1 C1 …
But like this:
A0 A1 A2 A3 A4 …
B0 B1 B2 B3 B4 …
C0 C1 C2 C3 C4 …
If you are using AVX2 on 32-bit floats, you can compute 8 additions per cycle. But you can't do this if your data is fragmented, which it might be in a typical OOP structure with data scattered across hundreds of objects.
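A minimal sketch of that loop with AVX intrinsics, assuming the A/B/C arrays are laid out as above and n is a multiple of 8 (a real implementation would also handle the tail and alignment):

#include <immintrin.h>

// C[i] = A[i] + B[i]: eight 32-bit float additions per _mm256_add_ps.
void add_arrays(const float* A, const float* B, float* C, int n)
{
    for (int i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(A + i);              // 8 contiguous A values
        __m256 vb = _mm256_loadu_ps(B + i);              // 8 contiguous B values
        _mm256_storeu_ps(C + i, _mm256_add_ps(va, vb));  // 8 sums stored at once
    }
}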

Applications using intrinsics may only use them in a few functions (typically some "tight" loops), but the data structure might be shared with major parts of the codebase. So the developers usually have to be aware of the constraints even when they are not touching these parts of the code.

I don't know what Vya Domus means by intrinsics being sparingly used. They are used in many applications that matter for productivity, like Adobe programs, (3D) modelers, simulators, encoders, etc., and in essential libraries for compression and the like. They're rarely used in games, and even when used, "never" impact rendering performance. But as I mentioned, even when they're used, it's usually just a small percentage of the code.

To get back on topic: many performance-critical applications can't be recompiled for another architecture and maintain acceptable performance without such optimizations.
Posted on Reply
#46
Assimilator
timta2At least those "morons" know how to spell and proofread.
You talk to yourself often?
Posted on Reply
#47
Tom_
FordGT90Conceptx86 is awesome because it's CISC.
x86 is awesome, although it is CISC.
RISC is better and faster than CISC.
And x86/AMD64 CPUs are RISC nowadays.
Posted on Reply
#48
FordGT90Concept
"I go fast!1!11!1!"
Tom_RISC is better and faster than CISC.
RISC is cheap and low power, CISC is fast.
Tom_And x86/AMD64 CPUs are RISC nowadays.
Already explained why that is patently false.
Posted on Reply