Here’s a sentence that would have sounded absurd a decade ago: the future of artificial intelligence may depend on which country controls the most advanced sand-melting equipment.
Welcome to the AI hardware race. It’s stranger, and more consequential, than it looks.
The Accidental Monarch
NVIDIA did not set out to rule AI. Its GPUs were designed to make video games look better. The fact that the same parallel processing architecture turned out to be almost perfectly suited for deep learning matrix mathematics is one of the great happy accidents in tech history: the equivalent of inventing a really good pasta strainer and discovering it’s also an excellent telescope.
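For the non-engineers in the room, here is the gist in a few lines of code. It’s a deliberately minimal sketch using NumPy, with arbitrary layer sizes; real frameworks add plenty on top, but the heart of a neural network layer really is one big matrix multiply, exactly the kind of work a GPU’s thousands of cores can split up and run in parallel.

```python
# Minimal sketch: a dense layer's forward pass is a matrix multiply plus a bias.
# Illustrative only; the sizes and the NumPy backend are arbitrary choices.
import numpy as np

batch, d_in, d_out = 32, 4096, 4096
x = np.random.randn(batch, d_in).astype(np.float32)   # input activations
W = np.random.randn(d_in, d_out).astype(np.float32)   # layer weights
b = np.zeros(d_out, dtype=np.float32)                  # bias

# Every output element is an independent dot product, which is why this
# maps so naturally onto massively parallel hardware.
y = x @ W + b
print(y.shape)  # (32, 4096)
```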
The market has rewarded that accident generously. NVIDIA commands roughly 80% of the AI GPU segment. Its CUDA software ecosystem, built up over a decade, creates switching costs that pure hardware benchmarks don’t capture. AMD’s Instinct MI300X is technically competitive. But switching from CUDA to ROCm is a bit like convincing a city to drive on the other side of the road: technically possible, practically painful.
The Rise of the Specialist
The more interesting development isn’t in the GPU duel. It’s one layer down.
ASICs, chips built for one job and one job only, are gaining ground fast. Google’s TPUs have run its AI workloads for years. Amazon’s Trainium2 (for training) and Inferentia (for inference) give AWS a way to offer competitive economics without depending entirely on NVIDIA. Microsoft’s Maia 100 powers its Azure AI services internally.
The logic is compelling: if you know exactly what you’re running, and you’re running a lot of it, a bespoke chip will always beat a general one. The constraint is volume; custom silicon only makes economic sense at hyperscaler scale.
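To see why volume is the hinge, a back-of-envelope comparison helps. Every figure in the sketch below is a made-up placeholder rather than a real price; the point is the shape of the maths, not the numbers.

```python
# Back-of-envelope sketch of the volume constraint on custom silicon.
# Every number here is a placeholder assumption, not a real price;
# the point is the shape of the comparison, not the figures.

ASIC_FIXED_COST = 500_000_000   # assumed design, tape-out, and software cost
ASIC_UNIT_COST = 2_000          # assumed per-chip manufacturing cost
GPU_UNIT_COST = 30_000          # assumed per-accelerator purchase price

def total_spend(n_chips: int) -> tuple[float, float]:
    """Total spend for n accelerators under each (hypothetical) option."""
    bespoke = ASIC_FIXED_COST + n_chips * ASIC_UNIT_COST
    off_the_shelf = n_chips * GPU_UNIT_COST
    return bespoke, off_the_shelf

# The custom chip only wins once the fixed cost is amortised over enough
# units, roughly hyperscaler territory under these assumptions.
break_even = ASIC_FIXED_COST / (GPU_UNIT_COST - ASIC_UNIT_COST)
print(f"break-even at roughly {break_even:,.0f} chips")

for n in (1_000, 10_000, 100_000):
    bespoke, gpu = total_spend(n)
    print(f"{n:>7,} chips: bespoke ${bespoke/1e6:,.0f}M vs off-the-shelf ${gpu/1e6:,.0f}M")
```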
Startups are testing whether that constraint has a workaround. Groq’s Language Processing Unit is explicitly inference-only, a deliberate limitation that lets it achieve remarkable speeds. That kind of focused bet is the pattern worth watching. The companies finding early traction in AI hardware all seem to have answered the same question first: what, specifically, are we better at?
Training vs. Inference: Not the Same Animal
One detail gets lost in the GPU headlines: training and running a model are fundamentally different problems, increasingly solved by different hardware.
Training is a long, memory-intensive construction project. Inference is the factory running 24 hours a day. The chips best suited to each are starting to diverge, and that divergence matters more as organisations shift from building models to deploying them at scale. As discussed in the earlier post on moving past AI pilot programmes, the real cost questions arrive at deployment, and that’s exactly where the inference chip conversation lives.
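A rough memory sketch makes the divergence concrete. The bytes-per-parameter figures below are common rules of thumb (mixed-precision Adam training versus int8-quantised serving), not vendor specifications, and they ignore activations and caches entirely; treat the whole thing as illustrative.

```python
# Rough sketch of why training and inference stress hardware differently.
# Bytes-per-parameter figures are common rules of thumb, not vendor specs.

def training_memory_gb(params_billions: float) -> float:
    # fp16 weights + fp16 gradients + fp32 master weights + two fp32 Adam
    # moments is roughly 16 bytes per parameter (activations excluded).
    return params_billions * 16

def inference_memory_gb(params_billions: float, bytes_per_param: float = 1.0) -> float:
    # int8-quantised weights are roughly 1 byte per parameter;
    # runtime overhead such as KV caches is ignored here.
    return params_billions * bytes_per_param

model_size = 70  # hypothetical 70-billion-parameter model
print(f"training:  ~{training_memory_gb(model_size):.0f} GB of weights and optimiser state")
print(f"inference: ~{inference_memory_gb(model_size):.0f} GB of quantised weights")
```

Orders of magnitude apart on memory alone, before you even get to throughput and latency, which is why the two workloads are drifting towards different silicon.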
Edge AI adds yet another wrinkle. The chip running a model inside a hospital device or a factory sensor needs to be tiny, cool, and miserly with power, none of which describes a data centre GPU. The hardware requirements don’t just differ by degree. They differ by kind.
When Chips Became Geopolitics
And then there’s the subplot that quietly became the main act.
US export controls on advanced AI chips to China (introduced in late 2022, tightened in 2023, expanded again in December 2024) have turned semiconductor policy into something resembling a chess match played across continents. Japan and the Netherlands were brought in to close the same doors. China responded with a $47.5 billion domestic semiconductor fund, launched in May 2024. Huawei’s Ascend programme keeps pushing forward despite constraints.
The practical implication: where a chip is made, and by whom, is no longer a supply chain footnote. It’s showing up in procurement conversations, compliance reviews, and infrastructure decisions, particularly for organisations operating across regulated sectors or international markets. Hardware choices are no longer purely technical.
The Cautionary Tale
Intel’s experience is the one worth sitting with. Multiple AI chip strategies (Nervana, Habana, Gaudi) competed for internal resources without ever quite cohering. A $500 million Gaudi revenue forecast was abandoned in October 2024. A CEO departure followed in December.
The pattern is familiar in hardware history. Trying to be the general-purpose alternative to a dominant general-purpose platform is a difficult position to sustain. Specificity, knowing exactly what problem you solve better than anyone else, turns out to be more defensible than breadth.
That connects neatly to the earlier Green AI post, too. The chips designed with narrow focus tend to be the ones delivering energy efficiency gains, precisely because they’re not carrying capability they don’t need.
The silicon layer rarely makes it onto the AI strategy slide deck. But the hardware decisions being made in chip labs and trade ministries right now will quietly set the ceiling on what’s possible, and affordable, for the next decade.
What’s your read: does hardware feel like an infrastructure detail in your organisation’s AI thinking, or has it started showing up in the strategy conversation?
Let’s keep learning, together.