Dynamic Model Loading with Docker Deployment #8077
Unanswered · ArijitSinghEDA asked this question in Q&A
Replies: 0 comments
If I use the Dockerfile to deploy a vLLM server, does it only support single-model deployment, or can I load models dynamically as well?
The Dockerfile's only execution step is this line:
ENTRYPOINT ["python", "-m", "vllm.entrypoints.openai.api_server"]
If it does support loading multiple models, how do I use that?
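For reference, this is roughly how I start the container at the moment (a minimal sketch; the image name and model are placeholders). As far as I understand, anything placed after the image name in `docker run` is appended to the ENTRYPOINT, so it ends up as arguments to `vllm.entrypoints.openai.api_server`:

```bash
# Placeholder image and model names; flags after the image name are passed
# through to the ENTRYPOINT, i.e. to vllm.entrypoints.openai.api_server.
docker run --gpus all -p 8000:8000 \
    my-vllm-image \
    --model facebook/opt-125m \
    --port 8000
```

With this setup, each container seems tied to the single model given via `--model` at start time, which is why I am asking whether there is a way to load or switch models on a running server instead.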