HuggingDiscussions
Discuss and provide feedback on Hugging Face Hub features
We're thrilled to share that Featherless AI is now a supported Inference Provider on the Hugging Face Hub! Featherless AI joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.
Featherless AI supports a wide variety of text and conversational models, including the latest open-source models from DeepSeek, Meta, Google, Qwen, and many more.
Featherless AI is a serverless AI inference provider whose model loading and GPU orchestration make an exceptionally large catalog of models available to users. Providers usually offer either low-cost access to a limited set of models, or an unlimited range of models where users manage the servers and bear the associated operating costs. Featherless aims for the best of both worlds: a very broad range of models with serverless pricing. Find the full list of supported models on the models page.
We're super excited to see what you'll build with this new provider!
Read more about how to use Featherless as an Inference Provider in its dedicated documentation page.
The following example shows how to call DeepSeek-R1 with Featherless AI as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face, or your own Featherless AI API key if you have one.
Install or upgrade huggingface_hub to make sure you have version v0.33.0 or later: pip install --upgrade huggingface-hub
import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="featherless-ai",
    api_key=os.environ["HF_TOKEN"]
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=messages,
)

print(completion.choices[0].message)

import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
  model: "deepseek-ai/DeepSeek-R1-0528",
  messages: [
    {
      role: "user",
      content: "What is the capital of France?"
    }
  ],
  provider: "featherless-ai",
});

console.log(chatCompletion.choices[0].message);
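Reasoning models such as DeepSeek-R1 can produce long answers, so you may prefer to stream the response. The sketch below builds on the Python example above and assumes that streaming is available for the chosen model through this provider. The helper names collect_stream and stream_answer are hypothetical and only serve the illustration; the fake chunks at the end mimic the shape of streamed chunks so the helper can be tried offline.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def stream_answer(question: str) -> str:
    """Hypothetical helper: stream a reply via Featherless AI (needs network and HF_TOKEN)."""
    import os
    from huggingface_hub import InferenceClient  # imported lazily so the offline demo runs without it

    client = InferenceClient(provider="featherless-ai", api_key=os.environ["HF_TOKEN"])
    stream = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1-0528",
        messages=[{"role": "user", "content": question}],
        stream=True,  # assumption: streaming is supported for this model
    )
    return collect_stream(stream)


# Offline demo: fake chunks shaped like streamed chat-completion chunks.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Par"))]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=None))]),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="is"))]),
]
print(collect_stream(fake_chunks))
```

With a valid token in HF_TOKEN, calling stream_answer("What is the capital of France?") should return the full reply once the stream ends.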
For direct requests, i.e. when you use the key from an inference provider, you are billed by the corresponding provider. For instance, if you use a Featherless AI API key you're billed on your Featherless AI account.
For routed requests, i.e. when you authenticate via the Hugging Face Hub, you'll only pay the standard provider API rates. There's no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)
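To make the two billing paths concrete, here is a minimal sketch of choosing between a direct and a routed request from Python. It assumes your Featherless AI key lives in an environment variable named FEATHERLESS_API_KEY (a hypothetical name chosen for this example) and that passing a provider key as api_key sends the request directly to the provider, while a Hugging Face token routes it through the Hub. The helpers pick_credentials and build_client are illustrative, not part of huggingface_hub.

```python
import os
from typing import Mapping, Tuple


def pick_credentials(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (api_key, billing_mode), preferring a provider key when one is set."""
    provider_key = env.get("FEATHERLESS_API_KEY")  # hypothetical variable name
    if provider_key:
        # Direct request: billed on your Featherless AI account.
        return provider_key, "direct"
    hf_token = env.get("HF_TOKEN")
    if hf_token:
        # Routed request: billed through Hugging Face at the provider's rates.
        return hf_token, "routed"
    raise KeyError("Set FEATHERLESS_API_KEY or HF_TOKEN")


def build_client(env: Mapping[str, str] = os.environ):
    """Create an InferenceClient for Featherless AI with whichever key is available."""
    from huggingface_hub import InferenceClient  # imported lazily; requires huggingface_hub

    api_key, mode = pick_credentials(env)
    return InferenceClient(provider="featherless-ai", api_key=api_key), mode


# Offline check of the selection logic with placeholder values.
print(pick_credentials({"HF_TOKEN": "hf_example"}))
```

Preferring the provider key when both are present is a design choice for this sketch; swap the order if you would rather have usage count against your Hugging Face credits.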
Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥
Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.
We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!
We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49