Exciting New Serverless Inference Providers on Hugging Face Hub
We’re thrilled to announce the latest enhancements to the Hugging Face Hub with the addition of three outstanding serverless inference providers: Hyperbolic, Nebius AI Studio, and Novita. These providers are joining our growing ecosystem, significantly broadening the capabilities of serverless inference directly on the Hub’s model pages. Additionally, they are seamlessly integrated into our client SDKs for both JavaScript and Python, making it incredibly easy to use a diverse array of models with your preferred providers.
Expanding the Ecosystem
Joining the ranks of our existing providers, such as Together AI, SambaNova, Replicate, fal, and Fireworks AI, these new partners introduce a variety of innovative models, including DeepSeek-R1 and FLUX.1. With these additions, developers and data scientists can explore a wider selection of models and leverage advanced functionality in their applications.
Supported Models
Each new provider enables access to unique models. For example:
- Hyperbolic: High-performance serving of large language models such as DeepSeek-R1.
- Nebius AI Studio: State-of-the-art models, including FLUX.1 for text-to-image generation.
- Novita: Chat completions and creative applications such as image generation.
To find all the models supported by these new providers, users can easily navigate through the Hugging Face Hub.
How It Works
In the Website UI
Navigating through the user account settings on the Hugging Face Hub is straightforward and user-friendly:
- Setting API Keys: Users can set their API keys for the providers they’ve signed up with. If no custom key is set, all requests will be routed through Hugging Face.
- Ordering Providers by Preference: Users can prioritize their preferred providers, which will be reflected in the widget and code snippets on the model pages.
- Model Pages: The model pages showcase third-party inference providers compatible with the current model, sorted according to user preferences.
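The ordering behavior described above amounts to a simple preference-aware sort: the model page shows the providers available for that model, with the ones the user ranked in their settings listed first. A minimal sketch of that idea (the function and the provider lists below are illustrative, not the Hub’s actual implementation):

```python
def order_providers(user_preference, model_providers):
    """Sort a model's available providers so that providers the user
    ranked earlier in their settings come first; providers the user
    has not ranked keep their original relative order at the end."""
    rank = {name: i for i, name in enumerate(user_preference)}
    return sorted(model_providers, key=lambda p: rank.get(p, len(rank)))

# Example: the user prefers Nebius first, then Hyperbolic.
preferences = ["nebius", "hyperbolic", "novita"]
available = ["hyperbolic", "fal-ai", "nebius"]
print(order_providers(preferences, available))
# ['nebius', 'hyperbolic', 'fal-ai']
```

Because Python’s sort is stable, providers the user never ranked simply stay in their original order after the ranked ones.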
From the Client SDKs
Using Python
For Python users, the integration is seamless with the huggingface_hub library. Here’s an example of how to use DeepSeek-R1 with Hyperbolic as the inference provider.
Before starting, ensure you have huggingface_hub installed (version v0.29.0 or later, where these providers are officially supported).
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hyperbolic",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
)

print(completion.choices[0].message)
```
If you want to generate an image from a text prompt using FLUX.1 running on Nebius AI Studio, the code would look like this:
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="nebius",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",
)

image = client.text_to_image(
    "Bob Marley in the style of a painting by Johannes Vermeer",
    model="black-forest-labs/FLUX.1-schnell",
)
```
Switching providers is as simple as changing the provider name in your code.
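Concretely, the two Python snippets above differ only in the `provider` string passed to `InferenceClient`; the model ID, messages, and the rest of the call are untouched. A small illustration of that point, using plain dictionaries to stand in for the client’s constructor arguments (the helper below is purely illustrative):

```python
def client_kwargs(provider: str) -> dict:
    """Build the constructor arguments for an inference client.
    Swapping providers means changing exactly one value here."""
    return {"provider": provider, "api_key": "xxxxxxxxxxxxxxxxxxxxxxxx"}

hyperbolic = client_kwargs("hyperbolic")
nebius = client_kwargs("nebius")

# Only the provider differs between the two configurations:
diff = {k for k in hyperbolic if hyperbolic[k] != nebius[k]}
print(diff)  # {'provider'}
```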
Using JavaScript
For JavaScript developers, using the @huggingface/inference library makes it easy to access inference capabilities. Here’s a quick example of how to perform a chat completion using DeepSeek-R1 with the Novita provider:
```javascript
import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1",
    messages: [
        {
            role: "user",
            content: "What is the capital of France?"
        }
    ],
    provider: "novita",
    max_tokens: 500
});

console.log(chatCompletion.choices[0].message);
```
Billing Information
When making direct requests using an inference provider’s key, users will be billed according to the respective provider’s rates. For example, if you are using a Nebius AI Studio key, the charges will apply to your Nebius account.
On the other hand, for routed requests where authentication is done via the Hugging Face Hub, users will only incur standard API rates without any additional markup from Hugging Face. In the future, potential revenue-sharing agreements may be established with provider partners.
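One way to picture the two billing paths: Hugging Face user access tokens start with the `hf_` prefix, while provider keys do not, so the key you pass determines who bills you. The helper below is illustrative only, not the Hub’s actual routing logic:

```python
def billing_route(api_key: str) -> str:
    """Illustrative only: a custom provider key means you are billed
    directly by that provider; a Hugging Face token (prefix "hf_")
    means the request is routed through Hugging Face at the provider's
    standard rate, with no additional markup."""
    if api_key.startswith("hf_"):
        return "routed-via-hugging-face"
    return "billed-directly-by-provider"

print(billing_route("hf_abc123"))   # routed-via-hugging-face
print(billing_route("nebius-key"))  # billed-directly-by-provider
```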
Inference Credits for PRO Users
A noteworthy benefit for PRO users is the monthly allocation of $2 worth of inference credits, which can be utilized across various providers. Subscribing to the Hugging Face PRO plan also grants access to several advantages, including ZeroGPU, Spaces Dev Mode, and significantly higher usage limits.
For those who sign up as free users, there’s also a small quota for free inference, although upgrading to PRO is highly recommended for enhanced capabilities.
Feedback and Next Steps
We genuinely value your feedback on these new additions. Join the conversation and share your thoughts and experiences in our dedicated Hub discussion forum: Hugging Face Discussions.
The integration of these new serverless inference providers marks an exciting milestone in enhancing the Hugging Face Hub, empowering developers to create innovative solutions with ease. Whether you are building chatbots, generating images, or experimenting with cutting-edge models, the expanded capabilities offered by these providers will undoubtedly elevate your projects to new heights.


