What Are Neoclouds and Why Does AI Need Them?


Neoclouds are a new generation of AI-focused cloud infrastructure providers. They differentiate themselves from traditional hyperscalers (such as AWS, Azure, and GCP) by focusing on optimized GPU availability, flexible pricing, and specialized performance for AI/ML workloads, among other key features.

AI workloads are becoming mainstream across virtually every industry. But as demand for AI grows, so does the strain on infrastructure. Unfortunately, traditional options for accessing high-performance compute are falling short, leading businesses to seek alternatives. Some are turning to neoclouds, a new type of compute service provider centered on GPU-centric offerings.

These new providers are gaining interest due to a shift in computing, one characterized by a growing reliance on GPUs for AI. While GPUs are well-suited to handle the parallel processing needs of AI training and inference, they are notoriously expensive and in short supply. A high-end GPU can cost tens of thousands of dollars, and that’s only if a company can find one. Procurement cycles are lengthy, lead times are unpredictable, and enterprise IT teams frequently struggle to acquire sufficient capacity to meet demand.

Hyperscale cloud providers, such as AWS, Google Cloud, and Azure, have sought to fill the gap by offering GPU instances and GPUs as a service. For many organizations, this model works, but only to a point.

Cloud-based GPU instances offer flexibility, scalability, and rapid deployment. But as usage grows, so does the bill. The pay-as-you-go model quickly becomes cost-prohibitive at scale, particularly for sustained workloads like large language model (LLM) training, fine-tuning, or real-time inference across enterprise applications.
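To see why sustained workloads change the economics, a rough monthly-cost comparison can be sketched in a few lines. All rates and cluster sizes below are hypothetical assumptions for illustration, not actual pricing from any provider:

```python
# Hypothetical cost comparison for a sustained GPU workload.
# All rates are illustrative assumptions, not real provider pricing.

ON_DEMAND_RATE = 4.00   # $/GPU-hour, hypothetical hyperscaler on-demand rate
NEOCLOUD_RATE = 1.50    # $/GPU-hour, hypothetical neocloud rate

def monthly_cost(rate_per_hour, gpus, hours_per_day, days=30):
    """Total monthly spend for a cluster running a sustained workload."""
    return rate_per_hour * gpus * hours_per_day * days

# A sustained fine-tuning job: 8 GPUs running around the clock for a month.
hyperscaler = monthly_cost(ON_DEMAND_RATE, gpus=8, hours_per_day=24)
neocloud = monthly_cost(NEOCLOUD_RATE, gpus=8, hours_per_day=24)

print(f"Hyperscaler on-demand: ${hyperscaler:,.0f}/month")
print(f"Neocloud:              ${neocloud:,.0f}/month")
print(f"Relative savings:      {1 - neocloud / hyperscaler:.0%}")
```

The point is not the specific numbers but the shape of the curve: at 24/7 utilization, a per-hour price gap compounds into a large monthly difference, which is exactly where pay-as-you-go pricing starts to hurt.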

As a result, many organizations find themselves caught between a rock (scarce and expensive on-premises infrastructure) and a hard place (high public cloud GPU costs).

Enter Neoclouds

In this environment, a new category of compute provider is emerging: the neocloud provider. Neocloud infrastructure providers offer high-performance compute, heavily centered on GPUs, at more affordable rates (at least according to them). They accomplish this by leveraging unconventional origins, new economic models, and different infrastructure strategies.

For example, some neoclouds trace their roots back to cryptocurrency mining. During the cryptocurrency boom of the last decade, thousands of miners worldwide built data centers equipped with GPU servers optimized for mining operations. As the cryptocurrency market cooled and mining became less profitable, these operators found themselves with large amounts of idle GPU capacity. Rather than let this hardware depreciate in a warehouse, some repurposed their equipment for AI workloads, and thus, the neocloud was born.

Others in the space are purpose-built startups that saw an opportunity to offer a more cost-effective alternative to the hyperscalers by optimizing for price-performance and bypassing the overhead of legacy cloud operations. Some build on open-source software stacks and colocate in low-cost data centers; others aggregate excess GPU capacity across decentralized networks, forming a kind of “GPU spot market” that enterprises can tap into at lower costs.

Benefits of the Neocloud Model

For businesses seeking AI compute power without the exorbitant costs, neocloud providers claim they can deliver several benefits, including:

Lower Cost per GPU Hour: Neoclouds often charge a fraction of what hyperscalers do for comparable GPU instances. They claim that their lean operating models and hardware reuse strategies translate to real savings.

Dedicated Access: In many cases, neocloud providers offer dedicated bare metal access to GPUs, reducing contention and ensuring predictable performance.

Rapid Availability: With flexible procurement and provisioning processes, neocloud providers claim they can often deliver capacity much faster than traditional vendors, helping teams iterate and deploy AI models without delay.

Decentralization and Resilience: Some neoclouds operate on distributed models, sourcing compute from geographically diverse data centers or networks of independent operators.

Sustainability: Repurposing existing hardware, particularly from the cryptocurrency industry, reduces e-waste and promotes more sustainable IT practices. Some neoclouds also colocate in facilities powered by renewable energy, further lowering the carbon footprint.

See also: GPU Market Shift: Leveraging the Fall of Crypto Mining

GPUs-as-a-Service, Reimagined

At their core, neoclouds represent a fresh take on the GPU-as-a-Service model. Companies considered to be neocloud providers include CoreWeave, Crusoe, Lambda Labs, Nebius, Vast.ai, and others.

They’re less focused on bundling GPUs with proprietary services and more interested in delivering raw, high-performance compute at a price point that makes large-scale AI viable for more businesses.

By democratizing access to affordable GPUs, neocloud providers believe they can lower the barrier to entry for companies looking to build or scale AI applications. Small startups can train custom models without burning through their seed funding. Enterprises can fine-tune LLMs on their proprietary data without ceding control or overspending. And research institutions can run simulations and experiments without being limited by budget constraints.

Despite these advantages, neocloud providers will likely compete with hyperscalers in the GPU-as-a-Service marketplace for the foreseeable future. That market was valued at $3.23 billion in 2023 and is projected to grow to $49.84 billion by 2032, a compound annual growth rate of roughly 36%, according to Fortune Business Insights. (That estimate includes both hyperscalers and neoclouds.)
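The implied growth rate can be checked against the cited figures with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) − 1:

```python
# Verify the implied compound annual growth rate (CAGR) of the
# GPU-as-a-Service market figures cited above.

start_value = 3.23    # market size in 2023, $ billions
end_value = 49.84     # projected market size in 2032, $ billions
years = 2032 - 2023   # 9-year horizon

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # works out to roughly 35-36% per year
```

The arithmetic lands at about 35.5% per year, consistent with the roughly 36% figure in the forecast.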

Final Thoughts

The compute requirements for AI continue to grow. Neocloud providers are a new generation of AI-focused cloud infrastructure providers that aim to meet these needs. They differentiate themselves from traditional hyperscalers (such as AWS, Azure, and GCP) by focusing on optimized GPU availability, flexible pricing, and specialized performance for AI/ML workloads, among other key features.

About Salvatore Salamone

Salvatore Salamone is a physicist by training who has been writing about science and information technology for more than 30 years. During that time, he has been a senior or executive editor at many industry-leading publications including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He also is the author of three business technology books.
