[FEEDBACK] Inference Providers

#49
by julien-c - opened
Hugging Face org

Any inference provider you love, and that you'd like to be able to access directly from the Hub?

Love that I can call DeepSeek R1 directly from the Hub 🔥

from huggingface_hub import InferenceClient

# Route the request through the "together" provider; replace the
# placeholder with your own API key (or a Hugging Face token)
client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

# Standard OpenAI-style chat completion call against DeepSeek R1
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

print(completion.choices[0].message)
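
For longer generations you can also stream tokens as they arrive; here's a small sketch along the same lines (same provider and placeholder key as above):

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

# stream=True yields incremental chunks instead of one final object
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=500,
    stream=True
)

for chunk in stream:
    # each chunk carries a delta with the next piece of the message
    print(chunk.choices[0].delta.content or "", end="")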

Is it possible to set a monthly payment budget or rate limits for all the external providers? I don't see such options in the Billing tab. If an API key or session token is stolen, it could be quite dangerous for my thin wallet :(

Hugging Face org

@benhaotang you already get spending notifications when crossing important thresholds ($10, $100, $1,000), but we'll add spending limits in the future.

Thanks for your quick reply, good to know!

Would be great if you could add Nebius AI Studio to the list :) It's a new inference provider on the market, with the absolute cheapest prices and the highest rate limits...

Could be good to add featherless.ai

TitanML!!

Hi Hugging Face team, 👋

I’m with GPT Proto (https://www.gptproto.com/), an AI API platform focused on providing safe, stable, fast, and affordable inference for developers and enterprises. With a single API key, our users can access most mainstream models, including Hugging Face models, while enjoying reliable infrastructure and cost efficiency.

We’ve seen strong adoption from teams who value predictable performance, security, and competitive pricing for both experimentation and production workloads. Many of them already integrate Hugging Face models through GPT Proto to streamline deployment and reduce costs.

We’d love to explore becoming an official Inference Provider on Hugging Face, so that more builders in your ecosystem can benefit from a secure, high-performance, and budget-friendly option for model inference.
Looking forward to collaborating!

Contact us: [email protected]

Best regards,
Team GPT Proto

Would be great to add Simplismart to the list!

Hi!

Would be great if Snowcell could be added to the list. We build complete inference solutions from the ground up.

I couldn't find a specific contact point to reach about this, but for any questions we are available at [email protected].

Best Regards

Hello,

At FAIM, we are building an inference platform for time-series foundation models: https://faim.it.com/.
All models we currently support are available on Hugging Face.

I would like to ask whether it’s possible for us to become an inference provider on Hugging Face for time-series models.

Thank you and best regards,
Andrei
[email protected]

Hello, Gatewayz is ready for integration. Please email me at [email protected]

Hi Hugging Face team!

We're gcube (https://gcube.ai), a GPU sharing platform from South Korea. We make AI inference super affordable by connecting idle GPUs from cloud providers and even PC cafes across Korea, basically turning unused computing power into a distributed GPU network.

Our customers are seeing 55-70% cost savings, and we work with major Korean cloud partners like Naver Cloud, NHN Cloud, and KT Cloud.

We'd love to become an official Inference Provider on Hugging Face. Would really appreciate any guidance on the next steps!

Our HF org: https://huggingface.co/gcube-ai (Team plan subscribed)

Thanks!

Best,
Koo
Data Alliance (gcube)
[email protected]

Hello Hugging Face team 👋

We’re from Simplismart.ai, a Series A startup backed by Accel, building a modular MLOps platform focused on high-performance inference. We’re currently exploring the process of listing our inference APIs as a provider on Hugging Face.

We’ve gone through the inference provider documentation and are preparing for the next steps, but before raising a PR, we’d appreciate some clarity around the billing flow, specifically:

Questions:

  1. What is the expected delay between a successful inference request and Hugging Face calling the billing endpoint?
  2. If we’re unable to return cost details within one minute when Hugging Face hits the billing endpoint, does Hugging Face retry the request? If so, what’s the retry behavior?

We want to ensure our implementation aligns closely with Hugging Face’s billing expectations, so any guidance on the above would be very helpful. To make question 2 concrete, a rough sketch of how we’d keep the billing endpoint idempotent under retries is below.
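
This is a minimal, hypothetical sketch only, since the billing spec isn't public: the POST /billing path, the request_id field, and the response shape are all our assumptions, not the documented API. The point is just the design: record the cost when the inference request completes, key it by request ID, and return the same answer on every retry.

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# request_id -> cost recorded when the inference request completed
# (the unit and the whole payload shape are assumptions)
recorded_costs: dict[str, int] = {}

class BillingQuery(BaseModel):
    request_id: str  # assumed field name, not from the documented spec

@app.post("/billing")  # assumed path, not from the documented spec
def report_cost(query: BillingQuery):
    cost = recorded_costs.get(query.request_id)
    if cost is None:
        # Cost not computed yet: answer fast with a retryable status so a
        # later attempt can succeed instead of hitting the one-minute timeout.
        raise HTTPException(status_code=503, detail="cost not ready, retry later")
    # Idempotent: every retry with the same request_id gets the same answer.
    return {"request_id": query.request_id, "cost": cost}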

Thanks in advance for the support! 🤗

-- Pratik Parmar
Developer Advocate @ Simplismart.ai
[email protected]
