
Add Hugging Face Hub SaaS Account #7698

Open
danehans opened this issue Jan 20, 2025 · 10 comments
Labels
sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra.

Comments

@danehans
Contributor

The Gateway API Inference Extension project requires a Hugging Face Hub account to download LLMs such as meta-llama/Llama-2-7b-hf for running e2e tests in CI. An access token must be generated for the account and stored in the CI cluster as an environment variable, e.g. HUGGING_FACE_TOKEN.

cc: @robscott @ahg-g
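(Illustration only, not part of the issue: a minimal sketch of how a CI job might consume such a token, assuming it is injected as the HUGGING_FACE_TOKEN environment variable mentioned above. The `get_hf_token` helper name is hypothetical.)

```python
import os


def get_hf_token(env_var: str = "HUGGING_FACE_TOKEN") -> str:
    """Read the Hugging Face access token injected into the CI cluster.

    Fails fast with a clear message so a gated-model download does not
    surface as a confusing 401 deep inside the e2e test run.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; gated model downloads will fail")
    return token


if __name__ == "__main__":
    # Stand-in value for local runs; in CI the secret would already be set.
    os.environ.setdefault("HUGGING_FACE_TOKEN", "hf_example")
    print(get_hf_token())
```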

@danehans danehans added the sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. label Jan 20, 2025
@ameukam
Member

ameukam commented Jan 23, 2025

cc @kubernetes/sig-k8s-infra-leads

@BenTheElder
Member

Is this free?

If this is a typical user account we should probably manage the account in a SIG one-password vault and then populate the API key into one or more of the CI clusters.

We'll also have to figure out an email for the user, maybe one of the private [email protected] lists (assuming it may be sent password reset emails etc)

Currently we only have a few GitHub robot accounts like this, co-managed with SIG Testing + ContribEx; almost everything else is donated cloud SaaS where the CNCF is primary on the account, then SIG K8s Infra, and projects / sub-accounts are provisioned for Kubernetes projects.

@BenTheElder
Member

> account to download LLMs such as meta-llama/Llama-2-7b-hf for running e2e tests in CI.

This particular model appears to require signing a licensing agreement with Meta? Is this really the only way we can test our code?

@BenTheElder
Member

It seems like we should be able to e2e test the routing of serving requests without actually running any particular model? Just some trivial fake?

@danehans
Contributor Author

I’ve tried multiple open-source models that don’t require signing a license agreement (e.g., GPT-J, MPT, etc.), but each one either isn’t supported by vLLM or lacks LoRA compatibility. Additionally, the existing LoRA adapters are specifically trained for GPT-J or Llama 2, so they can’t be reused for other models. Since EPP (the reference inference extension) scrapes real metrics from vLLM to perform load balancing, substituting a fake model server won’t suffice for proper testing.

@ahg-g
Member

ahg-g commented Jan 24, 2025

@danehans can we use GPT-J then? I assume it doesn't require signing an agreement

@BenTheElder
Member

BenTheElder commented Jan 24, 2025

> Since EPP (the reference inference extension) scrapes real metrics from vLLM to perform load balancing, substituting a fake model server won’t suffice for proper testing.

We can't fake metrics for deterministic testing?

Independent of managing this account, ideally we don't have to spend $$$ running models where not necessary.
(This is why we build things like kind, kwok, etc. to make testing more sustainable)
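(Illustration only, not the project's actual test harness: a sketch of the "fake metrics" idea raised above. The gauge names follow vLLM's Prometheus naming convention but are assumptions here, and `render_fake_vllm_metrics` is a hypothetical helper; a stub server returning this payload could let the extension be tested deterministically without running a real model.)

```python
def render_fake_vllm_metrics(running: int, waiting: int, cache_usage: float) -> str:
    """Render a Prometheus text-format payload mimicking the vLLM gauges
    a load-balancing extension might scrape from a model server."""
    lines = [
        "# TYPE vllm:num_requests_running gauge",
        f"vllm:num_requests_running {running}",
        "# TYPE vllm:num_requests_waiting gauge",
        f"vllm:num_requests_waiting {waiting}",
        "# TYPE vllm:gpu_cache_usage_perc gauge",
        f"vllm:gpu_cache_usage_perc {cache_usage}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A stub HTTP server would return this body from its /metrics endpoint.
    print(render_fake_vllm_metrics(running=2, waiting=5, cache_usage=0.4))
```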

@BenTheElder
Member

I don't think any one of us can unilaterally sign up the Kubernetes organization to agree to some legal terms (as opposed to using software libraries under a CNCF approved license), nor should we personally agree and provide our personal account. cc @kubernetes/steering-committee

Annoyingly you can't even see the agreement terms without signing into an account.

For the other infra we do not have any agreements signed ourselves, the vendors have provided resources to the CNCF and the CNCF delegates resources to us.

Let's see what the others think though, cc @kubernetes/sig-k8s-infra-leads

@danehans
Contributor Author

@ahg-g GPT-J (EleutherAI/gpt-j-6B) does not support LoRA. From the vLLM logs:

ERROR 01-24 11:23:00 engine.py:366] AssertionError: GPTJForCausalLM does not support LoRA yet.

Do you agree with faking the model server for e2e testing?

@ahg-g
Member

ahg-g commented Jan 24, 2025

> @ahg-g GPT-J (EleutherAI/gpt-j-6B) does not support LoRA. From the vLLM logs:

@liu-cong mentioned that we can use Mistral.
