Cloudflare recently announced that developers can now deploy AI applications on its global network in one simple click directly from Hugging Face, the leading open and collaborative platform for AI builders. With Workers AI now generally available, this integration offers the fastest way to access a variety of models and run inference requests on Cloudflare’s global network of GPUs.
Developers can choose one of the popular open-source models and then simply click “Deploy to Cloudflare Workers AI” to deploy a model instantly.
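Once a model is deployed, inference runs against Cloudflare’s REST endpoint for Workers AI. The sketch below builds such a request in Python; the account ID, API token, and model slug are placeholders (check the Cloudflare dashboard and the Workers AI model catalog for real values), so this is an illustration of the request shape rather than a definitive client.

```python
# Sketch of calling a Hugging Face model on Workers AI via Cloudflare's
# REST endpoint. ACCOUNT_ID, API_TOKEN, and the model slug are
# placeholders -- substitute real values from your Cloudflare account.
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4/accounts"

def build_inference_request(account_id: str, api_token: str,
                            model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request against the Workers AI run endpoint."""
    url = f"{API_BASE}/{account_id}/ai/run/{model}"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder credentials; actually sending this requires a real account.
req = build_inference_request(
    account_id="YOUR_ACCOUNT_ID",
    api_token="YOUR_API_TOKEN",
    model="@hf/mistral/mistral-7b-instruct-v0.1",  # assumed catalog slug
    prompt="What is serverless inference?",
)
# response = urllib.request.urlopen(req)  # uncomment with real credentials
```

Because the platform is serverless, there is no endpoint to provision or scale: the same request works whether it is the first call of the day or one of millions.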
There are 14 curated Hugging Face models now optimized for Cloudflare’s global serverless inference platform, spanning three task categories: text generation, embeddings, and sentence similarity.
Cloudflare is the first serverless inference partner integrated on the Hugging Face Hub for deploying models, enabling developers to quickly, easily, and affordably deploy AI globally, without managing infrastructure or paying for unused compute capacity.
Cloudflare now has GPUs deployed across more than 150 cities globally, most recently launching in Cape Town, Durban, Johannesburg, and Lagos, its first locations in Africa, as well as Amman, Buenos Aires, Mexico City, Mumbai, New Delhi, and Seoul, to provide low-latency inference around the world.
Workers AI is also expanding to support fine-tuned model weights, enabling organizations to build and deploy more specialized, domain-specific applications.
“We can solve this by abstracting away the cost and complexity of building AI-powered apps. Workers AI is one of the most affordable and accessible solutions to run inference. And with Hugging Face and Cloudflare both deeply aligned in our efforts to democratise AI in a simple, affordable way, we’re giving developers the freedom and agility to choose a model and scale their AI apps from zero to global in an instant,” said Matthew Prince, co-founder and CEO of Cloudflare.