Monday, July 15, 2024

Microsoft’s $100 Billion Stargate AI Supercomputer Could Feature Over a Million AMD GPUs


  • Microsoft and OpenAI’s rumored $100 billion Stargate project to launch by 2028.
  • AMD’s MI500 GPUs could power a million-plus GPU training cluster.
  • The project aims to reduce dependency on Nvidia.

Microsoft and OpenAI are rumored to be working on an AI supercomputer project called ‘Stargate.’ The initiative, reportedly backed by more than $100 billion from Microsoft, is said to target a launch by 2028.

A primary goal is reportedly to reduce reliance on Nvidia, an aim shared by many tech giants investing in artificial intelligence.

The Next Platform reported in April that this AI supercomputer might use future generations of Microsoft’s Cobalt Arm server processors and Maia XPUs.

The project envisions scaling Ethernet to support hundreds of thousands to potentially a million XPUs within a single machine.

Although many specifics remain unclear and the project’s realization is uncertain, recent statements from AMD hint at some fascinating possibilities.

Forrest Norrod, AMD’s Executive Vice President and General Manager of the Datacenter Solutions Group, shared some intriguing insights during a conversation with The Next Platform.

Timothy Prickett Morgan of The Next Platform asked Norrod about the largest AI training cluster that anyone is seriously considering, raising the possibility of a million-plus GPU system built on AMD’s future Instinct MI500 accelerators.

Norrod confirmed that scale, stating, “It’s in that range? Yes.” When pressed for further details, he emphasized, “I am dead serious, it is in that range” and clarified, “I’m talking about one machine… The scale of what’s being contemplated is mind-blowing. Now, will all of that come to pass? I don’t know. But there are public reports of very sober people contemplating spending tens of billions of dollars or even a hundred billion dollars on training clusters.”

Such a massive project, featuring over a million GPUs, would only be feasible for a few companies.
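
To give a rough sense of that scale, the short Python sketch below divides the rumored budget by the rumored GPU count. Both inputs are the unconfirmed figures quoted in this article, the function name is purely illustrative, and the result is an all-in budget per accelerator (chips, networking, power, buildings), not a GPU price.

```python
# Back-of-envelope only: both inputs are rumored figures, not confirmed numbers.
RUMORED_BUDGET_USD = 100_000_000_000  # the reported "$100 billion"
RUMORED_GPU_COUNT = 1_000_000         # lower bound of "over a million" GPUs


def budget_per_gpu(total_usd: int, gpu_count: int) -> float:
    """Return the total budget divided evenly across accelerators."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return total_usd / gpu_count


if __name__ == "__main__":
    per_gpu = budget_per_gpu(RUMORED_BUDGET_USD, RUMORED_GPU_COUNT)
    print(f"${per_gpu:,.0f} of total budget per GPU")
```

Under these assumptions the figure works out to about $100,000 of total spending per accelerator, which helps explain why so few companies could fund such a machine.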

Given Microsoft’s significant investment in AI and data centers, it’s reasonable to speculate that AMD’s discussions about such a large-scale GPU deployment could be linked to the Stargate project.

By turning to AMD, Microsoft and other tech companies can potentially sidestep Nvidia, aligning with their broader strategic goals.

Such a collaboration would mark a significant shift in the tech industry, where Nvidia has long dominated the AI hardware market.

A Microsoft and AMD partnership on this scale could increase competition in AI accelerators, benefiting the broader AI ecosystem.

The Stargate project, if realized, could set new benchmarks for AI supercomputers, pushing the boundaries of what’s possible with machine learning and data processing.

While the project’s success is not guaranteed, the reported involvement of Microsoft and OpenAI, along with AMD’s possible role, underscores the ambition behind it.

As the tech world watches closely, the next few years will reveal whether Stargate can live up to its promise and transform the landscape of AI and supercomputing.

For now, a supercomputer with over a million AMD GPUs remains an unconfirmed but tantalizing prospect.

