Tuesday, March 8th 2022
Apple Unveils M1 Ultra, the World's Most Powerful Chip For a Personal Computer
Apple today announced M1 Ultra, the next giant leap for Apple silicon and the Mac. Featuring UltraFusion — Apple's innovative packaging architecture that interconnects the die of two M1 Max chips to create a system on a chip (SoC) with unprecedented levels of performance and capabilities — M1 Ultra delivers breathtaking computing power to the new Mac Studio while maintaining industry-leading performance per watt.
The new SoC consists of 114 billion transistors, the most ever in a personal computer chip. M1 Ultra can be configured with up to 128 GB of high-bandwidth, low-latency unified memory that can be accessed by the 20-core CPU, 64-core GPU and 32-core Neural Engine, providing astonishing performance for developers compiling code, artists working in huge 3D environments that were previously impossible to render, and video professionals who can transcode video to ProRes up to 5.6x faster than with a 28-core Mac Pro with Afterburner.
"M1 Ultra is another game changer for Apple silicon that once again will shock the PC industry. By connecting two M1 Max die with our UltraFusion packaging architecture, we're able to scale Apple silicon to unprecedented new heights," said Johny Srouji, Apple's senior vice president of Hardware Technologies. "With its powerful CPU, massive GPU, incredible Neural Engine, ProRes hardware acceleration and huge amount of unified memory, M1 Ultra completes the M1 family as the world's most powerful and capable chip for a personal computer."
Groundbreaking UltraFusion Architecture
The foundation for M1 Ultra is the extremely powerful and power-efficient M1 Max. To build M1 Ultra, the die of two M1 Max are connected using UltraFusion, Apple's custom-built packaging architecture. The most common way to scale performance is to connect two chips through a motherboard, which typically brings significant trade-offs, including increased latency, reduced bandwidth and increased power consumption. However, Apple's innovative UltraFusion uses a silicon interposer that connects the chips across more than 10,000 signals, providing a massive 2.5 TB/s of low-latency, inter-processor bandwidth — more than 4x the bandwidth of the leading multi-chip interconnect technology. This enables M1 Ultra to behave and be recognised by software as one chip, so developers don't need to rewrite code to take advantage of its performance. There's never been anything like it.
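As a rough back-of-the-envelope check on the interconnect figures above, the sketch below divides the quoted 2.5 TB/s by the quoted 10,000 signals. It assumes decimal units (1 TB = 10^12 bytes) and an even split of traffic across signals, neither of which the announcement states, and the function name is purely illustrative. Because the release says "more than 10,000 signals", the result should be read as an upper bound on the average per-signal rate.

```python
def per_signal_gbit_per_s(total_tb_per_s: float, signals: int) -> float:
    """Average data rate per signal in Gbit/s.

    Assumes decimal units (1 TB = 1e12 bytes) and an even split across
    signals; both are illustrative assumptions, not published details.
    """
    total_bits_per_s = total_tb_per_s * 1e12 * 8  # bytes/s -> bits/s
    return total_bits_per_s / signals / 1e9       # bits/s -> Gbit/s per signal


# Figures quoted in the announcement: 2.5 TB/s over (at least) 10,000 signals.
rate = per_signal_gbit_per_s(2.5, 10_000)
print(f"{rate:.1f} Gbit/s per signal (upper bound on the average)")
```

Under these assumptions, each signal carries a fairly modest rate, which is consistent with the idea of a very wide, short link on an interposer rather than a few very fast lanes.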
Unprecedented Performance and Power Efficiency
M1 Ultra features an extraordinarily powerful 20-core CPU with 16 high-performance cores and four high-efficiency cores. It delivers 90 per cent higher multithreaded performance than the fastest available 16-core PC desktop chip in the same power envelope. Additionally, M1 Ultra reaches the PC chip's peak performance using 100 fewer watts. That astounding efficiency means less energy is consumed and fans run quietly, even as apps like Logic Pro rip through demanding workflows, such as processing massive amounts of virtual instruments, audio plug-ins and effects.
For the most graphics-intensive needs, like 3D rendering and complex image processing, M1 Ultra has a 64-core GPU — 8x the size of M1 — delivering faster performance than even the highest-end PC GPU available while using 200 fewer watts of power.
Apple's unified memory architecture has also scaled up with M1 Ultra. Memory bandwidth is increased to 800 GB/s, more than 10x the latest PC desktop chip, and M1 Ultra can be configured with 128 GB of unified memory. Compared with the most powerful PC graphics cards that max out at 48 GB, nothing comes close to M1 Ultra for graphics memory to support enormous GPU-intensive workloads like working with extreme 3D geometry and rendering massive scenes.
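To put the quoted memory figures in perspective, the short sketch below estimates how long a single pass over the full 128 GB would take at the quoted 800 GB/s, and how the capacity compares with the 48 GB graphics cards mentioned above. It assumes the peak bandwidth is sustained for the whole pass, which real workloads will generally not achieve, and the function name is illustrative.

```python
def full_sweep_seconds(capacity_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to stream through all of memory once at a sustained peak rate."""
    return capacity_gb / bandwidth_gb_per_s


# Figures quoted in the announcement.
UNIFIED_MEMORY_GB = 128
BANDWIDTH_GB_PER_S = 800
PC_GPU_MEMORY_GB = 48

sweep = full_sweep_seconds(UNIFIED_MEMORY_GB, BANDWIDTH_GB_PER_S)
capacity_ratio = UNIFIED_MEMORY_GB / PC_GPU_MEMORY_GB

print(f"One full pass over memory: {sweep:.2f} s (idealised)")
print(f"Capacity vs. a 48 GB graphics card: {capacity_ratio:.2f}x")
```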
The 32-core Neural Engine in M1 Ultra runs up to 22 trillion operations per second, speeding through the most challenging machine learning tasks. And, with double the media engine capabilities of M1 Max, M1 Ultra offers unprecedented ProRes video encode and decode throughput. In fact, the new Mac Studio with M1 Ultra can play back up to 18 streams of 8K ProRes 422 video — a feat no other chip can accomplish. M1 Ultra also integrates custom Apple technologies, such as a display engine capable of driving multiple external displays, integrated Thunderbolt 4 controllers and best-in-class security, including Apple's latest Secure Enclave, hardware-verified secure boot and runtime anti-exploitation technologies.
macOS and Apps Scale Up to M1 Ultra
Deep integration between hardware and software has always been at the heart of the Mac experience. macOS Monterey has been designed for Apple silicon, taking advantage of M1 Ultra's huge increases in CPU, GPU and memory bandwidth. Developer technologies like Metal let apps take full advantage of the new chip, and optimisations in Core ML utilise the new 32-core Neural Engine, so machine learning models run faster than ever.
Users have access to the largest collection of apps ever for Mac, including iPhone and iPad apps that can now run on Mac, and Universal apps that unlock the full power of the M1 family of chips. Apps that have not yet been updated to Universal will run seamlessly with Apple's Rosetta 2 technology.
Another Leap Forward in the Transition to Apple Silicon
Apple has introduced Apple silicon to nearly every Mac in the current line-up, and each new chip — M1, M1 Pro, M1 Max and now M1 Ultra — unleashes amazing capabilities for the Mac. M1 Ultra completes the M1 family of chips, powering the all-new Mac Studio, a high-performance desktop system with a re-imagined compact design made possible by the industry-leading performance per watt of Apple silicon.
Apple Silicon and the Environment
The energy efficiency of Apple's custom silicon helps Mac Studio use less power over its lifetime. In fact, while delivering extraordinary performance, Mac Studio consumes up to 1,000 kilowatt-hours less energy than that of a high-end PC desktop over the course of a year.
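The yearly energy figure can be turned into an average power difference with simple arithmetic. The sketch below does this for two usage patterns; the announcement does not state the usage profile behind the 1,000 kWh claim, so the hours-per-day values and the function name are assumptions chosen only for illustration.

```python
def average_power_gap_watts(kwh_per_year: float, hours_per_day: float) -> float:
    """Average power difference implied by a yearly energy difference.

    Assumes the machine runs the same number of hours on each of 365 days.
    """
    hours_per_year = hours_per_day * 365
    return kwh_per_year * 1000 / hours_per_year  # kWh -> Wh, then Wh / h = W


# "Up to 1,000 kilowatt-hours" per year, as quoted in the announcement.
always_on = average_power_gap_watts(1000, 24)  # running around the clock
workday = average_power_gap_watts(1000, 8)     # hypothetical 8-hour daily use

print(f"24 h/day: about {always_on:.0f} W lower on average")
print(f" 8 h/day: about {workday:.0f} W lower on average")
```

The shorter the assumed daily usage, the larger the power gap has to be to add up to the same yearly total, so the claim reads very differently depending on the workload behind it.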
Today, Apple is carbon-neutral for global corporate operations, and by 2030, plans to have net-zero climate impact across the entire business, which includes manufacturing supply chains and all product life cycles. This means that every chip Apple creates, from design to manufacturing, will be 100 per cent carbon-neutral.
Source:
Apple
122 Comments on Apple Unveils M1 Ultra, the World's Most Powerful Chip For a Personal Computer
Of course the real kicker is that Apple with their tight control, massive size, and vertical integration could implement such a system relatively easily. But they would much rather sell you an entire new device.
So yeah, Apple could modularize their hardware more, but at what cost? SD card adoption and replaceable memory are probably not the best choice for the same reason: they would hamper performance and result in bigger devices. That doesn't sound like a win to me.
One cool solution would be a hybrid memory architecture, with X amount of memory on-package (could be HBMx, could be LPDDR) and a desired number of ancillary channels of "second order" RAM for when that fills up. Of course this would be pretty difficult for the OS to manage without performance inconsistencies, but it's definitely doable.
Didn't Samsung try to launch a UFS card standard at some point? I wonder if we'll ever see the (seemingly dead elsewhere) SD Express standard take root in phones - it essentially makes SD cards into PCIe 3.0 x1 SSDs. Of course heat and controller complexity (and die size) would be an issue for this in MicroSD, but I think the standard supports that size. I don't think Apple is ever going to go modular storage for phones, unless they were mandated to do so by some major government. Heck, they don't even support dual SIM.
PCs, though? Mandating that would actually be kind of reasonable (with some exemptions for industrial PCs or other special cases). Soldered storage in a PC just isn't acceptable. Soldered RAM kind of is, but IMO we really need some organized system for access to spare parts, repairs and upgrades. Plus, ideally, something like a mandated period of keeping motherboard designs (for portables) compatible - 3 years? It would be pretty awesome to be able to buy a laptop, then 3 years later either order a new motherboard or send it in to have it upgraded to a new CPU and RAM, with your existing board as a trade-in. Of course this is still more wasteful than upgrading individual components, but such a system would obviously need to be tied into spare parts networks and repair providers (and official refurbishers) to afford re-use of those traded-in boards. Even low power laptops today are so powerful that a decent 3-year-old laptop is more than enough for most users, so getting a system like this running would be immensely beneficial in many ways.
All in all, when you buy Apple, you're buying something that should "just work." That's part of what you're paying for by going into this ecosystem and it's not for everyone. Your typical Mac user doesn't want to screw with the hardware. Your typical enthusiast is going to hate Apple because of how locked down it is, but honestly, this is a case where you can't have your cake and eat it too if you want to control quality from top to bottom.
So tl;dr: I didn't feel that a 10% increase in price was worth it for 2TB over 1TB when I didn't think (at the time) that I needed it. I would have bought it, though, had I been able to foresee my getting interested in photography. Hindsight is always 20/20 though.
Edit: With that said though, I do still use my Linux tower for some games. It's just that the Mac is nice in the sense that it never gives me trouble and it always works when I need it to. That's really important when I'm traveling, working, or both.
As for editing: most apps load low(er) quality previews of images before loading the entire file, so it's likely not an issue. You're not likely to do tons of edits every second either, so the only use case where this becomes an issue is when blasting through your album quickly, in which case the previews are typically more than sufficient. Batch copying/import/export is where the bottleneck will be felt the most - and that's a large part of why many pro cameras are moving to CFexpress.
Yeah, it's definitely a win for them. But I wouldn't call it a lack of standardization - they're using "standard" LPDDR5, after all (though seemingly in custom packages?). The issue is that you just can't reasonably make a standard for replaceable memory at those bandwidths - the pin counts, trace routing, and PCB quality demands would be astronomical, making it impossible in practice. That's why we have soldered standards like HBM and LPDDR.
Yeah, they're definitely avoiding a lot of complexity going this way. I can only imagine the issues of juggling data across two different memory architectures with drastically different speeds and latencies, and making this transparent to apps. In this way, on-board "RAM" in a system like this is likely better implemented as a (huge) L4 cache.
At the time they first launched an 8TB option there weren't any 8TB NVMe options, no. And AFAIK all current ones are relatively slow QLC, but they do exist. Still, it's not like their soldered drives have any special NAND on them, so they could just as easily have made this into an NVMe card - but it might not have fit on a standard m.2 22xx form factor. There are other m.2 form factors they could have used though - they could have gone 30mm wide for more board space and still stayed compliant (and compatible with standard SSDs as long as they aren't too long).
We could also look at the Mac Pro, which used what look like mSATA SSDs (though clearly a proprietary interface), but they are for some reason tied to the T2 security chip and non-replaceable. Apple claims this is a security feature, but ... that isn't reasonable. The data security implications of someone being able to steal or temporarily borrow and clone your SSD are near zero, as they already need physical access to your device, and at that point all bets are off.
Soldered storage is still denser - you can always pack everything more densely on one PCB rather than several + connectors - but the difference isn't really noticeable. They could have made it work if they wanted to. And that's the real problem.
That's true, though again Apple is hardly the worst of the bunch here - at least their phones are reasonably easily disassembled and they consistently use stretch-release adhesive for their batteries, making swaps possible for someone with a few basic tools and a modicum of patience and hand-eye coordination. It's far from perfect, but there are far worse examples out there (looking at you, Samsung).
IMO there are valid arguments for not designing your products around easy modularity - the benefits of non-modularity are very much real. What there aren't valid arguments for is not designing for repairability. There's no reason why a non-modular design can't still be relatively easily repairable. Of course, companies like Fairphone and Framework are demonstrating that you can still make highly repairable and upgradeable phones and laptops without that much of a sacrifice.
Making a device of any kind with a battery that cannot be easily replaced is sheer stupidity, and it's about as environmentally unfriendly as can be. But I digress, we've wandered a bit off topic.