Tuesday, July 30th 2024

Apple Trained its Apple Intelligence Models on Google TPUs, Not NVIDIA GPUs

Apple has disclosed that its newly announced Apple Intelligence features were developed using Google's Tensor Processing Units (TPUs) rather than NVIDIA's widely adopted hardware accelerators such as the H100. This unexpected choice was detailed in an official Apple research paper, which sheds light on the company's approach to AI development. The paper outlines how systems equipped with Google's TPUv4 and TPUv5 chips played a crucial role in creating the Apple Foundation Models (AFMs). These models, including AFM-server and AFM-on-device, are designed to power both the online and offline Apple Intelligence features introduced at WWDC 2024. To train the 6.4 billion parameter AFM-server, Apple's largest language model, the company used 8,192 TPUv4 chips, provisioned as 8 × 1,024-chip slices. Training followed a three-stage approach and processed a total of 7.4 trillion tokens. Meanwhile, the more compact 3 billion parameter AFM-on-device model, optimized for on-device processing, was trained on 2,048 TPUv5p chips.

Apple's training data came from various sources, including the Applebot web crawler and licensed high-quality datasets. The company also incorporated carefully selected code, math, and public datasets to enhance the models' capabilities. Benchmark results shared in the paper suggest that both AFM-server and AFM-on-device excel in areas such as Instruction Following, Tool Use, and Writing, positioning Apple as a strong contender in the AI race despite its relatively late entry. However, Apple's route into the AI market is more complex than that of any other AI competitor. Given Apple's massive user base and the millions of devices compatible with Apple Intelligence, the AFMs have the potential to permanently change how users interact with their devices, especially for everyday tasks. Hence, refining the AI models for these tasks before mass deployment is critical. Another unexpected feature is the transparency from Apple, a company typically known for its secrecy. The AI boom is changing some of Apple's ways, and a look at these inner workings is always interesting.
Source: via Tom's Hardware

28 Comments on Apple Trained its Apple Intelligence Models on Google TPUs, Not NVIDIA GPUs

#26
lexluthermiester
sethmatrix7: Is that why it's not very good?
Nope, it's something else.
#27
bug
This just dawned on me, is it going to be called iAI?
#28
trsttte
bug: This just dawned on me, is it going to be called iAI?
If only :D

No, they are instead redefining AI from Artificial Intelligence to now mean Apple Intelligence. Not terrible from a marketing standpoint: given the critical mass of Apple users (and their average IQ :toast:), it's feasible for a large number of people to start conflating any AI application with some derivation of Apple's work (which is obviously ludicrous).