Wednesday, March 20th 2024
Tiny Corp. Pauses Development of AMD Radeon GPU-based Tinybox AI Cluster
George Hotz and his Tiny Corporation colleagues were pinning their hopes on AMD delivering some good news earlier this month. The development of a "TinyBox" AI compute cluster project hit some major roadblocks a couple of weeks ago—at the time, Radeon RX 7900 XTX GPU firmware was not gelling with Tiny Corp.'s setup. Hotz expressed "70% confidence" in AMD approving open-sourcing certain bits of firmware. At the time of writing this has not transpired—this week the Tiny Corp. social media account has, once again, switched to an "all guns blazing" mode. Hotz and Co. have publicly disclosed that they were dabbling with Intel Arc graphics cards, as of a few weeks ago. NVIDIA hardware is another possible route, according to freshly posted open thoughts.
Yesterday, it was confirmed that the young startup organization had paused its use of XFX Speedster MERC310 RX 7900 XTX graphics cards: "the driver is still very unstable, and when it crashes or hangs we have no way of debugging it. We have no way of dumping the state of a GPU. Apparently it isn't just the MES causing these issues, it's also the Command Processor (CP). After seeing how open Tenstorrent is, it's hard to deal with this. With Tenstorrent, I feel confident that if there's an issue, I can debug and fix it. With AMD, I don't." The $15,000 TinyBox system relies on "cheaper" gaming-oriented GPUs, rather than traditional enterprise solutions. This oddball approach has attracted a number of customers, but the latest announcements likely signal another delay. Yesterday's tweet continued: "we are exploring Intel, working on adding Level Zero support to tinygrad. We also added a $400 bounty for XMX support. We are also (sadly) exploring a 6x GeForce RTX 4090 GPU box. At least we know the software is good there. We will revisit AMD once we have an open and reproducible build process for the driver and firmware. We are willing to dive really deep into hardware to make it amazing. But without access, we can't."
Another post provided a behind-the-scenes look at Hotz's diplomatic approach: "I have spoken with AMD on multiple occasions, we have gotten through to top people, and they have been quite nice to us. I believe they want to be more open, and obviously they don't want their driver to have bugs. Unfortunately, this access and responses prolonged this decision, part of me wishes they just said it's a consumer card, you get what you pay for and we could have switched earlier. We probably tried too hard to make it work. We have an amazing team at tinygrad. Someday, we are going to make our own chips, and I figure if we can make our own chips, we better be able to make the 7900XTX software great. But we can't if we don't have access.
The firmware is complex, undocumented, closed source, and signed, all struggles we wouldn't have with our own hardware. If and when the firmware is open and installable, if we aren't too far along with a different chip, we are down to put resources into writing fuzzers and rewriting whatever needs to be rewritten. The 7900XTX hardware seems great, but we aren't going to put resources into fixing a black box."
Sources:
tinygrad Tweet, Tom's Hardware, Wccftech
36 Comments on Tiny Corp. Pauses Development of AMD Radeon GPU-based Tinybox AI Cluster
All this is is poor investment/research on their end in the pursuit of making easy money on the “next big thing”.
If he wants enterprise-class support and development resource, then pay the enterprise price.
If he wants to DIY the whole gap between consumer and enterprise, then hitting a brick wall is as expected.
I would say he simply underestimated the cost of 'bridging the gap between consumer and enterprise' in the first place.
There's no shame in the idea or that he tried to make it work, but making this public fuss about the consequences of his own silly decision... I remember this guy from many years ago and that I liked him so that's all I'll say. The main selling point was "nobody else cut this corner to lower the price" and there was a real reason why.
Also, not a chance in hell Intel or Nvidia do any more to help. The markup on enterprise cards is the cornerstone of Nvidia's business strategy for crying out loud.
He knows full well that NVIDIA WILL tell him to pound sand AND to cease and desist before forcing a driver update that will nerf AI usage for the workloads he's trying to run on their GPUs.
And Intel probably would only help him insofar as it benefits them, which really isn't much as Intel's current priority is AI on everything (but mostly CPUs and dedicated accelerators) and gaming on GPU. Esp. given the massive charm offensive they're doing trying to show that they're not going to abandon gaming GPUs; what with all the regular interviews with GN and others discussing the difficulties and what they're doing to catch up, on top of trying to be the fastest of the 3 GPU makers to release new driver updates for new, big-title games. Sure, there's some money in AI, but Intel would also be better served offering up dedicated enterprise solutions to it than risk unsavory headlines about selling their gaming GPUs to AI farms, much less gambling on a return from a startup.
I mean, what did he expect, to use public media to force a vendor to open their proprietary microcode for his own benefit and at the expense of said vendor?
Gee, who would have thought how this would end...
Let me help you help me, is what this is. I smell narcissism, bigtime.