Each week we find a new topic for our readers to learn about in our AI Education column.
Yes, it’s another weekly edition of AI Education, and this time we’re going to tackle a fundamental artificial intelligence topic that all of us should already know pretty well. The trouble is that journalists and the technology industry have applied a confusing term to the topic and never bother to define it for anyone. I guess we could say that’s what we do almost every week right here in this column, but that’s what we get when we’re covering new technology in a pretty lazy journalistic environment.
This time around, we’re going to talk about AI hyperscalers: entities with the resources to build massive amounts of AI infrastructure, enabling themselves (or others) to rapidly scale up, or “hyperscale,” technology. By technology, we mean the whole range from a small office software stack to sophisticated large language models, with a particular focus on the largest and most complex projects. Hyperscalers fill an important role in the greater AI ecosystem, because demand for AI far outstrips the ability of almost every user and business to actually build AI for themselves.
I’ve already pretty much described why I picked this topic this week: in reading the daily poop on what’s going on in AI, it’s a term I come across again and again, and, thankfully, I saw it used in enough different contexts to understand it pretty well before I ever tried to research it.
What Are AI Hyperscalers?
Hyperscalers offer distributed computing environments, or clouds, that are built to enable the rapid transfer and processing of massive amounts of data. In IT speak, scalability describes a technology’s ability to expand or grow its capacity in accordance with the growing demands of its users. Hyperscalability is the ability to absorb exponential increases in workload.
In traditional computing and traditional data centers, scaling up has meant upgrading existing hardware, swapping in faster processors or denser storage to boost capacity. In the hyperscaler model, hardware is added instead: rather than replace the hardware in an existing data center, a hyperscaler is more likely to build an entirely new data center and split the computing or data storage workload between the two facilities. The difference is significant, because hyperscalers are able to grow much faster than traditional providers.
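The upgrade-in-place versus add-a-facility distinction is easier to see in code. Here’s a toy Python sketch, entirely hypothetical and not any provider’s actual scheduler, of the two strategies and how work gets split once a second facility exists:

```python
# Toy illustration (invented names, not real infrastructure code) of the two
# scaling strategies: upgrading one facility versus adding another.

class DataCenter:
    def __init__(self, capacity):
        self.capacity = capacity  # units of work this facility can handle

def scale_up(centers, extra):
    """Traditional scaling: upgrade the existing hardware in place."""
    centers[0].capacity += extra
    return centers

def scale_out(centers, extra):
    """The hyperscaler approach: build a new facility instead."""
    centers.append(DataCenter(extra))
    return centers

def distribute(workload, centers):
    """Naive proportional split of a workload across facilities."""
    total = sum(c.capacity for c in centers)
    return [workload * c.capacity / total for c in centers]

fleet = [DataCenter(100)]
fleet = scale_out(fleet, 100)   # demand doubled: add a site, don't swap parts
print(distribute(300, fleet))   # prints [150.0, 150.0]; work is shared evenly
```

The end result (more capacity) is the same either way; the hyperscaler version just never has to stop and rebuild an existing site, which is why it can grow so much faster.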
Hyperscalers are what save us from needing physical access to a supercomputer in a specific geographic location in order to enjoy the fruits of artificial intelligence. They also spare businesses from spending inordinate sums to buy hardware and build their own AI infrastructure, offering a cost-effective service that allows even small- and medium-sized businesses to deploy their own generative AI.
Most hyperscalers offer custom and pre-trained models to their customers. Core to hyperscalers’ infrastructure are large data centers, often sited near electrical power generation facilities, that employ artificial intelligence hardware like GPUs and purpose-built AI chips. Hyperscalers usually also own the links between these data centers and the technology that distributes work among them. And, at least to some extent, they protect the data that their customers send and receive, often offering solutions compliant with various regulatory regimes.
Who Are the Hyperscalers?
There are actually many companies that claim—or aim—to provide hyperscaling services, but most of the big incumbent technology companies are also the first true hyperscalers in AI, along with some of the leading AI developers. These companies include the big three:
- Amazon (Amazon Web Services)
- Microsoft (Azure)
- Google (Cloud Platform)
The big three account for around two-thirds of the global cloud market. Other hyperscalers include familiar names: Meta, IBM, Oracle, Apple, Alibaba and Nvidia.
These biggest of the big hyperscalers aren’t just in the AI business. In many cases, they were already offering cloud data and computing services to businesses and individuals before the advent of generative AI, using traditional processors, data storage and connectivity. When generative AI came around, hyperscalers were able to adapt much of this infrastructure to support and accelerate its early development, upgrading as they went to account for increased computing demands.
Hyperscalers are also key innovators in AI as they permeate different industries and compete with each other. Firms like Google, Meta, Amazon and Microsoft have accelerated the development of specialized AI tools built not just for customers, but also for developers, creating ecosystems of innovation.
We’ve talked a bit about the huge ongoing proliferation of data centers across the globe. In reality, the nine companies we’ve named here, along with a handful of others (Huawei and Tencent come to mind), are responsible for most of that work.
What Services Do Hyperscalers Provide?
- Infrastructure: access to data centers and the pipes that connect them, which means storage and processing power.
- Platform: an environment to build, store and manage software without the need for additional hardware.
- Software: applications available via browser.
- Content Delivery: extremely fast, stable and reliable hosting of web content.
- Analytics: ultra-sophisticated, highly automated tools to process and analyze data.
What About the Edge?
Edge computing is a movement to put more of the computing work as close to the user as possible—and edge AI, which we’ve discussed in the past, describes AI that relies on local processors, like a personal computer’s GPU, for some or all of its workload. Moving forward, the cutting edge of AI will be built both on the edge and in the cloud, where AI-focused hardware will be able to intelligently distribute computing work in the most efficient manner between local and distributed processors.
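One way to picture that intelligent distribution of work is as a simple routing decision. This Python sketch is hypothetical, with invented names and thresholds, and real systems would weigh latency, privacy and cost far more carefully, but it captures the idea of keeping small jobs local and sending heavy ones to a hyperscaler:

```python
# Hypothetical dispatcher (names and thresholds invented) for the hybrid
# edge/cloud model: small jobs run locally, heavy jobs go to the cloud.

def route(task_flops, edge_budget_flops=1e9):
    """Pick where a job should run based on its estimated compute cost."""
    if task_flops <= edge_budget_flops:
        return "edge"   # cheap enough for a local GPU; faster, data stays put
    return "cloud"      # beyond local capacity; a hyperscaler has the hardware

print(route(1e8))   # prints "edge": a quick inference call stays local
print(route(1e12))  # prints "cloud": a heavy workload ships out
```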
Furthermore, there is a movement afoot among some businesses, particularly in highly regulated industries, to keep at least some of their AI workload in-house. It’s possible that AI’s future is an even more distributed one, where the influence of and demand for huge hyperscaler providers is diluted by a wave of smaller, more specialized providers offering point solutions to businesses and individuals, but I find that scenario unlikely.
Was that a “word salad”? Then think of the little girl in the classic Old El Paso commercial who asked “Why not both?” to cheers and applause. The future of AI probably isn’t either cloud or edge computing, but an optimized combination of both.