Exciting New Serverless Inference Providers on Hugging Face Hub
We’re thrilled to announce the latest enhancements to the Hugging Face Hub with the addition of three outstanding serverless inference providers: Hyperbolic, Nebius AI Studio, and Novita. These providers are joining our growing ecosystem, significantly broadening the capabilities of serverless inference directly on the Hub’s model pages. Additionally, they are seamlessly integrated into our client SDKs for both JavaScript and Python, making it incredibly easy to use a diverse array of models with your preferred providers.
Expanding the Ecosystem
Joining the ranks of our existing providers, such as Together AI, SambaNova, Replicate, fal, and Fireworks AI, these new partners introduce a variety of innovative models, including DeepSeek-R1 and FLUX.1. With these additions, developers and data scientists can explore a wider selection of models and leverage advanced functionality in their applications.
Supported Models
Each new provider enables access to unique models. For example:
- Hyperbolic: High-performance serving of large language models such as DeepSeek-R1.
- Nebius AI Studio: State-of-the-art models, including FLUX.1 for text-to-image generation.
- Novita: Chat completions and creative applications such as image generation.
To find all the models supported by these new providers, users can easily navigate through the Hugging Face Hub.
How It Works
In the Website UI
Navigating through the user account settings on the Hugging Face Hub is straightforward and user-friendly:
- Setting API Keys: Users can set their API keys for the providers they’ve signed up with. If no custom key is set, all requests will be routed through Hugging Face.
- Ordering Providers by Preference: Users can prioritize their preferred providers, which will be reflected in the widget and code snippets on the model pages.
- Model Pages: The model pages showcase third-party inference providers compatible with the current model, sorted according to user preferences.
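The ordering behavior described above amounts to a simple preference-aware sort: the model page shows the providers available for that model, with the ones the user ranked in their settings listed first. A minimal sketch of that idea (the function and the provider lists below are illustrative, not the Hub’s actual implementation):

```python
def order_providers(user_preference, model_providers):
    """Sort a model's available providers so that providers the user
    ranked earlier in their settings come first; providers the user
    has not ranked keep their original relative order at the end."""
    rank = {name: i for i, name in enumerate(user_preference)}
    return sorted(model_providers, key=lambda p: rank.get(p, len(rank)))

# Example: the user prefers Nebius first, then Hyperbolic.
preferences = ["nebius", "hyperbolic", "novita"]
available = ["hyperbolic", "fal-ai", "nebius"]
print(order_providers(preferences, available))
# ['nebius', 'hyperbolic', 'fal-ai']
```

Because Python’s sort is stable, providers the user never ranked simply stay in their original order after the ranked ones.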
From the Client SDKs
Using Python
For Python users, the integration is seamless with the huggingface_hub library. Here’s an example of how to use DeepSeek-R1 with Hyperbolic as the inference provider.
Before starting, ensure you have huggingface_hub installed (version v0.29.0 or later, where these providers are officially supported).
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hyperbolic",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
)

print(completion.choices[0].message)
```
If you want to generate an image from a text prompt using FLUX.1 running on Nebius AI Studio, the code would look like this:
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="nebius",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",
)

image = client.text_to_image(
    "Bob Marley in the style of a painting by Johannes Vermeer",
    model="black-forest-labs/FLUX.1-schnell",
)
```
Switching providers is as simple as changing the provider name in your code.
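Concretely, the two Python snippets above differ only in the `provider` string passed to `InferenceClient`; the model ID, messages, and the rest of the call are untouched. A small illustration of that point, using plain dictionaries to stand in for the client’s constructor arguments (the helper below is purely illustrative):

```python
def client_kwargs(provider: str) -> dict:
    """Build the constructor arguments for an inference client.
    Swapping providers means changing exactly one value here."""
    return {"provider": provider, "api_key": "xxxxxxxxxxxxxxxxxxxxxxxx"}

hyperbolic = client_kwargs("hyperbolic")
nebius = client_kwargs("nebius")

# Only the provider differs between the two configurations:
diff = {k for k in hyperbolic if hyperbolic[k] != nebius[k]}
print(diff)  # {'provider'}
```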
Using JavaScript
For JavaScript developers, using the @huggingface/inference library makes it easy to access inference capabilities. Here’s a quick example of how to perform a chat completion using DeepSeek-R1 with the Novita provider:
```javascript
import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1",
    messages: [
        {
            role: "user",
            content: "What is the capital of France?"
        }
    ],
    provider: "novita",
    max_tokens: 500
});

console.log(chatCompletion.choices[0].message);
```
Billing Information
When making direct requests using an inference provider’s key, users will be billed according to the respective provider’s rates. For example, if you are using a Nebius AI Studio key, the charges will apply to your Nebius account.
On the other hand, for routed requests where authentication is done via the Hugging Face Hub, users will only incur standard API rates without any additional markup from Hugging Face. In the future, potential revenue-sharing agreements may be established with provider partners.
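One way to picture the two billing paths: Hugging Face user access tokens start with the `hf_` prefix, while provider keys do not, so the key you pass determines who bills you. The helper below is illustrative only, not the Hub’s actual routing logic:

```python
def billing_route(api_key: str) -> str:
    """Illustrative only: a custom provider key means you are billed
    directly by that provider; a Hugging Face token (prefix "hf_")
    means the request is routed through Hugging Face at the provider's
    standard rate, with no additional markup."""
    if api_key.startswith("hf_"):
        return "routed-via-hugging-face"
    return "billed-directly-by-provider"

print(billing_route("hf_abc123"))   # routed-via-hugging-face
print(billing_route("nebius-key"))  # billed-directly-by-provider
```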
Inference Credits for PRO Users
A noteworthy benefit for PRO users is the monthly allocation of $2 worth of inference credits, which can be utilized across various providers. Subscribing to the Hugging Face PRO plan also grants access to several advantages, including ZeroGPU, Spaces Dev Mode, and significantly higher usage limits.
For those who sign up as free users, there’s also a small quota for free inference, although upgrading to PRO is highly recommended for enhanced capabilities.
Feedback and Next Steps
We genuinely value your feedback on these new additions. Join the conversation and share your thoughts and experiences in our dedicated Hub discussion forum: Hugging Face Discussions.
The integration of these new serverless inference providers marks an exciting milestone in enhancing the Hugging Face Hub, empowering developers to create innovative solutions with ease. Whether you are building chatbots, generating images, or experimenting with cutting-edge models, the expanded capabilities offered by these providers will undoubtedly elevate your projects to new heights.


