HTTP API for LLM with OpenAI compatibility
> pip install llm
> pip install llm-http-api
- Follow the setup directions in the LLM documentation
- Run the plugin:
> llm http-api
- Visit the OpenAPI documentation at http://localhost:8080/docs
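To see which routes a running server exposes without opening the browser, the OpenAPI schema can be read from a script. This is a minimal sketch, not part of the plugin: it assumes the schema is served at FastAPI's default `/openapi.json` path, and `list_operations` and `fetch_schema` are hypothetical helper names.

```python
import json
import urllib.request


def list_operations(schema):
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method in http_methods:
                operations.append((method.upper(), path))
    return sorted(operations)


def fetch_schema(base_url="http://localhost:8080"):
    """Download the schema from a running server.

    Assumption: the schema lives at FastAPI's default /openapi.json path.
    """
    with urllib.request.urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)
```

With the server running, `list_operations(fetch_schema())` returns the implemented routes as `(METHOD, path)` pairs.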
> llm http-api --help
Usage: llm http-api [OPTIONS]
Run a FastAPI HTTP server with OpenAI compatibility
Options:
-h, --host TEXT [default: 0.0.0.0]
-p, --port INTEGER [default: 8080]
-l, --log-level TEXT [default: info]
-r, --reload
-d, --reload-dirs LIST [default: src]
--help Show this message and exit.
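If the server needs to be started from a script, for example in an integration test, the options above can be assembled into an argument list. The sketch below uses only the long flags shown in the help output; `build_command` and `start_server` are hypothetical helpers, and it assumes the `llm` executable is on `PATH`.

```python
import subprocess


def build_command(host="0.0.0.0", port=8080, log_level="info", reload=False):
    """Assemble an `llm http-api` argument list from the documented options."""
    command = [
        "llm", "http-api",
        "--host", host,
        "--port", str(port),
        "--log-level", log_level,
    ]
    if reload:
        command.append("--reload")
    return command


def start_server(**options):
    """Start the server as a child process; the caller should terminate it."""
    return subprocess.Popen(build_command(**options))
```

For instance, `start_server(host="127.0.0.1", port=9000)` would bind to the loopback interface only instead of the default `0.0.0.0`.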
> curl http://localhost:8080/v1/embeddings -X POST -H "Content-Type: application/json" -d '{
"input": "Hello world",
"model": "jina-embeddings-v2-small-en"
}'
{"object":"embedding","embedding":[-0.47561466693878174,-0.4471365511417389,...],"index":0}
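The same request can be sent from Python using only the standard library. This is a rough sketch that mirrors the curl call and the response shape shown above; the helper names are hypothetical, and it assumes the server is listening on `localhost:8080` with the embedding model already installed.

```python
import json
import urllib.request


def build_embeddings_request(text, model, base_url="http://localhost:8080"):
    """Build the same POST request as the curl call above."""
    payload = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body):
    """Extract the vector from a response shaped like the sample output above."""
    return json.loads(body)["embedding"]


def embed(text, model, base_url="http://localhost:8080"):
    """Send the request and return the embedding as a list of floats."""
    request = build_embeddings_request(text, model, base_url)
    with urllib.request.urlopen(request) as response:
        return parse_embedding(response.read())
```

Against a running server, `embed("Hello world", "jina-embeddings-v2-small-en")` should return the list of floats found under the `embedding` key.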
A detailed list of unimplemented OpenAI endpoints can be found here
This repository manages its development environment as a Nix flake, so Nix must be installed.
> nix develop -c $SHELL
> make deps.install
> make deps.install/test
> make run/dev
> make test
> make coverage
> make lint
> make format
> make publish/pypi
> make llm.install/mlc
> make llm.setup/mlc
> make llm.mlc.download
> make llm.mlc.download/code_llama-34b-python-q4f16
> make llm.mlc.download/code_llama-34b-instruct-q0f16
> make llm.mlc.download/code_llama-13b-q4f16
> make llm.mlc.download/code_llama-7b-q4f16
> make llm.mlc.download/wizard-coder-15b-q4f32
> make llm.mlc.download/open_hermes-2.5-mistral-7b-q4f16
> make llm.mlc.download/mistral-7b-instruct-q4f16