We’re thrilled to announce that Public AI is now an officially supported Inference Provider on the Hugging Face Hub! This exciting integration adds to our ever-growing ecosystem, enhancing serverless inference capabilities directly from the Hub’s model pages. With Public AI, users can now easily access a wider range of models through their preferred inference providers, all seamlessly connected through our client SDKs for both JavaScript and Python.
This launch simplifies the process of accessing public and sovereign models from esteemed institutions like the Swiss AI Initiative and AI Singapore. You can explore Public AI’s organization on the Hub at huggingface.co/publicai and discover trending supported models at huggingface.co/models.
The Public AI Inference Utility is a nonprofit, open-source initiative dedicated to building products and advocating for public AI model developers, including partners like the Swiss AI Initiative and AI Singapore. By combining advanced infrastructure with community support, Public AI aims to democratize access to AI models.
The utility operates on a distributed architecture, leveraging a vLLM-powered backend alongside a resilient deployment layer, ensuring smooth operations across multiple partnerships. Inference requests are processed by servers that expose OpenAI-compatible APIs on vLLM, supported by clusters donated by industry and national partners. An efficient global load-balancing system ensures optimal handling of queries, no matter where they originate.
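To make "OpenAI-compatible" more concrete, here is a minimal sketch of the kind of request body such an endpoint typically accepts, following the widely used Chat Completions convention. The helper name `build_chat_request` and the `max_tokens` default are illustrative assumptions, not part of Public AI's documented API. The model ID is the one used in the SDK examples in this post.

```python
import json


def build_chat_request(model: str, user_prompt: str, max_tokens: int = 256) -> dict:
    """Build a Chat Completions-style request body (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,  # assumed default, chosen for illustration
    }


payload = build_chat_request(
    "swiss-ai/Apertus-70B-Instruct-2509",
    "What is the capital of France?",
)

# Serialize to the JSON string that would be sent as the POST body.
body = json.dumps(payload)
print(body)
```

Because the format is shared across OpenAI-compatible servers, the same payload shape should work no matter which donated cluster ends up serving the request.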
Thanks to generous GPU donations and advertising subsidies, Public AI provides free public access to its services. The initiative also seeks long-term stability through contributions from state entities and institutions. To learn more about Public AI’s platform and its infrastructure, check out platform.publicai.co.
As of now, users can leverage the Public AI Inference Utility directly on Hugging Face. We’re eager to see the innovative projects you’ll develop using this new provider!
For in-depth instructions on how to utilize Public AI as an Inference Provider, visit its dedicated documentation page. You can also view the comprehensive list of supported models.
How it works
In the website UI
- In your user account settings, you can:
  - Set your own API keys for the providers you've signed up with. If no custom key is set, your requests are routed through Hugging Face.
  - Order providers by preference. This order applies to the widget and code snippets on the model pages.
- Calls to Inference Providers work in one of two modes:
  - Custom key: calls go directly to the inference provider, using your own API key for that provider.
  - Routed by Hugging Face: you don't need a token from the provider, and charges are applied to your Hugging Face account rather than to the provider's account.
- Model pages show the third-party inference providers compatible with the current model, sorted by user preference.
From the client SDKs
From Python, using huggingface_hub
To demonstrate usage, here’s an example for using Swiss AI’s Apertus-70B via Public AI as the inference provider. You can choose to authenticate with a Hugging Face token, which will automatically route your requests, or utilize your own Public AI API key if available.
Note: Ensure that you are using a recent version of huggingface_hub (>= 0.34.6).
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="publicai",
    api_key=os.environ["HF_TOKEN"],
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="swiss-ai/Apertus-70B-Instruct-2509",
    messages=messages,
)

print(completion.choices[0].message)
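The choice between the two authentication modes can be sketched as a small helper. This is an illustration under stated assumptions: the environment variable name `PUBLICAI_API_KEY` and the function `resolve_api_key` are hypothetical, and preferring the provider key when both are set is a design choice for this sketch rather than documented behavior.

```python
def resolve_api_key(env: dict) -> tuple[str, str]:
    """Return (mode, key): prefer a provider key, else fall back to the HF token."""
    provider_key = env.get("PUBLICAI_API_KEY")  # hypothetical variable name
    if provider_key:
        # Custom key mode: the provider bills you directly.
        return "custom-key", provider_key
    hf_token = env.get("HF_TOKEN")
    if hf_token:
        # Routed mode: the request goes through Hugging Face.
        return "routed-by-hf", hf_token
    raise KeyError("Set HF_TOKEN or PUBLICAI_API_KEY")


# Example with a stand-in value (not a real credential).
mode, key = resolve_api_key({"HF_TOKEN": "hf_example"})
print(mode)
```

The returned key can then be passed as `api_key` to `InferenceClient(provider="publicai", api_key=key)`, as in the example above. In real use you would pass `os.environ` instead of a literal dictionary.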
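From JS, using @huggingface/inference
import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
  model: "swiss-ai/Apertus-70B-Instruct-2509",
  messages: [
    {
      role: "user",
      content: "What is the capital of France?",
    },
  ],
  provider: "publicai",
});

console.log(chatCompletion.choices[0].message);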
Billing
Currently, the use of the Public AI Inference Utility through Hugging Face Inference Providers is entirely free! However, be aware that pricing and availability may evolve over time.
Here’s how billing is structured for other providers on the platform:
For direct requests—where you use the key from an inference provider—you are billed by that provider. Hence, if you use a Public AI API key, the charges will apply to your Public AI account.
In contrast, for routed requests, when authenticating via the Hugging Face Hub, you only incur the standard provider API rates. There is no extra markup from Hugging Face; the charges are forwarded without alteration. Future plans may include revenue-sharing agreements with our Provider partners.
Important Note ‼️ PRO users receive $2 worth of inference credits every month, which can be used across providers. 🔥
Consider subscribing to the Hugging Face PRO plan for access to inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and additional perks.
Additionally, free inference is extended to signed-in free users, but we highly recommend upgrading to PRO for optimal benefits!
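As a rough illustration of the arithmetic for routed requests, the sketch below applies monthly credits to a provider's charges with no markup. The function name, the assumption that credits are consumed before anything is charged, and the usage figures are all hypothetical. Public AI itself is currently free, so the numbers stand in for some other paid provider.

```python
from decimal import Decimal


def amount_charged(
    provider_cost: Decimal,
    monthly_credits: Decimal = Decimal("2.00"),
) -> Decimal:
    """Provider cost passed through without markup, after credits (assumed order)."""
    return max(Decimal("0"), provider_cost - monthly_credits)


# Hypothetical monthly usage totals at a paid provider's list prices.
print(amount_charged(Decimal("1.25")))  # usage below the credit allowance
print(amount_charged(Decimal("5.40")))  # usage above the credit allowance
```

`Decimal` is used instead of floats so that currency amounts are not affected by binary rounding.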
Feedback and next steps
Your insights mean a lot! We invite you to share your thoughts or any comments regarding this new feature at this discussion thread.