adapt warning; update docs
michaelfeil committed Apr 8, 2024
1 parent 0a5f197 commit 55b2f56
Showing 3 changed files with 7 additions and 5 deletions.
6 changes: 4 additions & 2 deletions docs/docs/deploy.md
@@ -12,9 +12,11 @@ docker run \
--model-name-or-path $model --port $port
```

### Docker with offline mode
### Docker with offline mode and models with custom pip packages

If you want to run infinity in a location without internet access, you can pre-download the model into the Docker image.
This is also the advised route if you want to use infinity with models that require additional pip packages, such as
`nomic-ai/nomic-embed-text-v1.5`.

```bash
# clone the repo
@@ -26,7 +28,7 @@ docker buildx build --target=production-with-download \
--build-arg MODEL_NAME=michaelfeil/bge-small-en-v1.5 --build-arg ENGINE=torch \
-f Dockerfile -t infinity-model-small .
```
You can also set an argument `EXTRA_PACKAGES` if you require to `--build-arg EXTRA_PACKAGES="einsum torch_geometric"`
You can also set the build argument `EXTRA_PACKAGES` if you need to install extra packages, e.g. `--build-arg EXTRA_PACKAGES="einsum torch_geometric"`.

Rename and push it to your internal docker registry.
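The rename-and-push step above can be sketched with standard `docker tag` and `docker push` commands; the registry host `registry.example.internal` and the `ml` namespace below are placeholders, not values from the repository.

```shell
# Tag the locally built image for your internal registry (hypothetical host).
docker tag infinity-model-small registry.example.internal/ml/infinity-model-small:latest

# Push it so air-gapped hosts can pull the pre-baked model image.
docker push registry.example.internal/ml/infinity-model-small:latest
```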

4 changes: 2 additions & 2 deletions libs/infinity_emb/infinity_emb/inference/batch_handler.py
@@ -81,9 +81,9 @@ def __init__(
self._last_inference = time.perf_counter()

if batch_delay > 0.1:
logger.warn(f"high batch delay of {self._batch_delay}")
logger.warning(f"high batch delay of {self._batch_delay}")
if max_batch_size > max_queue_wait * 10:
logger.warn(
logger.warning(
f"queue_size={self.max_queue_wait} too small "
f"for batch_size={self.max_batch_size}."
" Consider increasing queue size"
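The switch from `logger.warn` to `logger.warning` in this commit is not cosmetic: `Logger.warn` is a deprecated alias in Python's standard `logging` module and emits a `DeprecationWarning` on every call. A minimal, self-contained sketch:

```python
import logging
import warnings

logger = logging.getLogger("demo")
logger.addHandler(logging.NullHandler())  # swallow the log output itself

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    logger.warn("old spelling")  # deprecated alias of logger.warning

# The alias triggers a DeprecationWarning; logger.warning does not.
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```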
@@ -681,7 +681,7 @@ def quantize(
device: str = default_device,
) -> tuple[QuantHandler, dict]:
CHECK_TORCH.mark_required()
logger.warn(
logger.warning(
f"quantization to {mode} mode currently yields incorrect results. Do not use for production."
)
precision = torch.bfloat16
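For context on the `precision = torch.bfloat16` line in the hunk above: bfloat16 keeps float32's exponent range but only 8 significand bits, so low mantissa bits are simply dropped. A torch-free sketch of that truncation (the helper name is ours, not from the repository):

```python
import struct

def truncate_to_bfloat16(x: float) -> float:
    """Simulate bfloat16 by keeping only the top 16 bits of a float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (out,) = struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))
    return out

assert truncate_to_bfloat16(1.0) == 1.0           # powers of two survive exactly
assert truncate_to_bfloat16(3.14159) == 3.140625  # low mantissa bits are lost
```

This loss of precision is part of why the warning above tells users not to rely on the quantized path in production.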
