
NVIDIA's Dominance Challenged as Largest AI Lab Adopts Google TPUs
NVIDIA's AI hardware dominance faces a new challenge: OpenAI, the world's leading AI lab, is tapping Google TPU hardware in a significant move away from single-vendor solutions. In June 2025, OpenAI began leasing Google Cloud's Tensor Processing Units to handle ChatGPT's growing inference workload, the first time it has relied on non-NVIDIA chips in large-scale production.
Until recently, NVIDIA GPUs powered both model training and inference for OpenAI's products. Training large language models on those cards is costly, but it is a periodic process; inference, by contrast, runs continuously and carries its own substantial expense. ChatGPT now serves more than 100 million daily active users, including 25 million paid subscribers, and inference accounts for nearly half of OpenAI's estimated $40 billion annual compute budget. Google's TPUs, such as the v6e "Trillium," are designed specifically for high throughput and low latency, making them a more cost-effective option for steady-state inference.
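To make the steady-state inference point concrete, here is a minimal, hypothetical sketch of a jit-compiled inference step in JAX, the framework at the center of Google's TPU software stack. The toy model, shapes, and parameter names below are illustrative assumptions rather than anything from OpenAI or Google; the point is that the same program compiles for TPU, GPU, or CPU backends without code changes.

```python
# Minimal sketch: a jit-compiled inference step in JAX.
# The two-layer MLP is a hypothetical stand-in for a real model;
# XLA compiles inference_step for whatever backend is available.
import jax
import jax.numpy as jnp

@jax.jit
def inference_step(params, batch):
    """One forward pass over a batch of requests (toy model)."""
    hidden = jax.nn.relu(batch @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    return jax.nn.softmax(logits, axis=-1)

def init_params(key, d_in=512, d_hidden=2048, d_out=512):
    """Random toy weights; shapes are arbitrary for this sketch."""
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (d_in, d_hidden)) * 0.02,
        "b1": jnp.zeros(d_hidden),
        "w2": jax.random.normal(k2, (d_hidden, d_out)) * 0.02,
        "b2": jnp.zeros(d_out),
    }

if __name__ == "__main__":
    # On a Cloud TPU VM this lists TPU devices; elsewhere it falls
    # back to CPU or GPU with no changes to the code above.
    print(jax.devices())
    key = jax.random.PRNGKey(0)
    params = init_params(key)
    batch = jax.random.normal(key, (32, 512))  # batch of 32 requests
    probs = inference_step(params, batch)      # compiled on first call
    print(probs.shape)                         # (32, 512)
```

In a serving setting, the compiled step is invoked continuously on incoming batches, which is exactly the always-on workload profile that makes per-query hardware cost matter so much.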
Beyond cost savings, the decision reflects OpenAI's desire to reduce its reliance on any single vendor. Microsoft Azure has been OpenAI's primary cloud provider since Microsoft's early investments and collaborations, but GPU supply shortages and price fluctuations exposed the weakness of depending too heavily on one source. By adding Google Cloud to its infrastructure mix, OpenAI gains flexibility, avoids vendor lock-in, and can scale more smoothly during usage peaks. For Google, winning OpenAI as a TPU customer is strong validation of its in-house chip development. TPUs were once reserved almost exclusively for internal projects such as powering the Gemini model; now they are attracting leading organizations like Apple and Anthropic. Note that beyond the inference-oriented v6e, Google also designs TPUs for training (the yet-to-be-announced v6p), which means companies can run their entire training workloads on Google's infrastructure on demand.