add top_k parameter to rerank endpoint #396

aloababa · 2024-10-04T10:43:27Z

This PR adds top_k parameter (number of top documents to return after reranking) to the rerank endpoint.

greptile-apps

PR Summary

This PR adds a 'top_k' parameter to the rerank endpoint, allowing users to specify the number of top documents to return after reranking, as requested in Issue #342.

Added 'top_k' parameter with default value 0 to RerankInput model in libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py
Implemented 'top_k' functionality in rerank endpoint in libs/infinity_emb/infinity_emb/infinity_server.py
The new parameter aligns with industry standards for reranking APIs, improving flexibility and usability
Consider adding input validation for 'top_k' to ensure it's a non-negative integer
Recommend updating documentation and adding tests to cover the new functionality

2 file(s) reviewed, 1 comment(s)

greptile-apps · 2024-10-04T10:44:36Z

libs/infinity_emb/infinity_emb/infinity_server.py

+            if data.top_k > 0:
+                data.documents = data.documents[: data.top_k]
+                scores = scores[: data.top_k]


logic: Add error handling for case where top_k > len(data.documents)
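
For context on the case flagged above: Python slices clamp to the sequence length, so `documents[:top_k]` does not raise when `top_k` exceeds `len(documents)`. The sketch below illustrates this under stated assumptions; `truncate` is a hypothetical helper that mirrors the three diff lines and is not code from the PR.

```python
def truncate(documents, scores, top_k=0):
    """Hypothetical helper mirroring the diff: keep the first top_k entries when top_k > 0."""
    if top_k > 0:
        documents = documents[:top_k]
        scores = scores[:top_k]
    return documents, scores


docs = ["italy", "us", "uk", "france"]
scores = [0.1, 0.2, 0.3, 0.4]  # made-up scores, for illustration only

# top_k larger than the list: the slice clamps and nothing is raised.
assert truncate(docs, scores, top_k=10) == (docs, scores)
# top_k = 0 leaves the input untouched under this default.
assert truncate(docs, scores, top_k=0) == (docs, scores)
# top_k = 1 keeps only the first entry, regardless of its score.
assert truncate(docs, scores, top_k=1) == (["italy"], [0.1])
```

The last assertion shows the more important limitation: truncation alone keeps whichever document came first in the request, not the most relevant one.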

michaelfeil · 2024-10-04T16:51:08Z

Thanks for opening the PR. Unfortunately, it needs quite a few improvements.

Please align with one of the existing protocols (jina, cohere, voyage, tei). Happy to discuss whether this is a useful feature.

Expectation would be:

implementation in "AsyncEngine", and not infinity_server.py
full unit test.
- Covering int=0, int=1, int=2, int=very large
- int=0 is a bad default.
- Covering from engine side and covering from API side.

codecov-commenter · 2024-10-04T16:52:32Z

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.56%. Comparing base (2ff39c6) to head (47524b7).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

❗ There is a different number of reports uploaded between BASE (2ff39c6) and HEAD (47524b7).

HEAD has 1 upload less than BASE

Uploads: BASE (2ff39c6) 2, HEAD (47524b7) 1

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
- Coverage   78.62%   72.56%   -6.06%     
==========================================
  Files          39       39              
  Lines        3008     3011       +3     
==========================================
- Hits         2365     2185     -180     
- Misses        643      826     +183


aloababa · 2024-10-05T19:19:34Z

I've updated the PR with the following changes:

top_k is now optional and must be greater than 0 on API side.
top_k is now implemented in AsyncEngine.
Added tests that cover both the Engine and the API.
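
The "optional and greater than 0" rule described above could look roughly like the following. This is only a sketch using a standard-library dataclass; `RerankRequest` is a hypothetical stand-in, and the PR's actual `RerankInput` model in pymodels.py is not shown in this thread and may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RerankRequest:
    """Hypothetical stand-in for the API input model; not the PR's actual class."""

    query: str
    documents: List[str]
    top_k: Optional[int] = None  # None means "return all documents"

    def __post_init__(self):
        if self.top_k is None:
            return
        # Reject bool explicitly: bool is a subclass of int in Python.
        if isinstance(self.top_k, bool) or not isinstance(self.top_k, int):
            raise TypeError("top_k must be an integer or None")
        if self.top_k <= 0:
            raise ValueError("top_k must be greater than 0")


ok = RerankRequest(query="where is paris", documents=["italy", "france"], top_k=1)
assert ok.top_k == 1
```

Making the field optional avoids the `int=0` default criticized earlier in the review: omitting `top_k` means "no limit", and `0` becomes an explicit error instead of a magic value.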

michaelfeil · 2024-10-05T21:16:48Z

libs/infinity_emb/tests/end_to_end/test_torch_reranker.py

+@pytest.mark.anyio
+async def test_reranker_top_k(client):
+    query = "Where is the Eiffel Tower located?"
+    documents = [


Potential flaw: if the backend does not sort, top_k=1 will always just return the first result.

Better unit test: wrap this in a unit test that also covers return_text. Make sure the top_k=1 result is always Paris in this test, e.g. using https://docs.python.org/3/library/itertools.html

# something like:
for return_text in [True, False]:
    for raw_score in [True, False]:
        for permutation in itertools.permutations([..paris, ..us, uk]):
            # your test above
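
The permutation idea from this comment can be sketched end to end as follows. Everything here is an assumption made for illustration: `fake_rerank`, the `FAKE_SCORES` table, and the document strings are hypothetical and stand in for the real model and the `client` fixture used in the PR's test.

```python
import itertools

# Hypothetical relevance scores standing in for a real reranker model.
FAKE_SCORES = {
    "Paris is in France.": 0.9,
    "The US is large.": 0.1,
    "The UK is an island.": 0.2,
}


def fake_rerank(query, documents, top_k=None):
    """Toy reranker: score, sort descending, then truncate. Returns (index, score) pairs."""
    scored = [(i, FAKE_SCORES[doc]) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


def check_top_1_is_paris():
    docs = list(FAKE_SCORES)
    for permutation in itertools.permutations(docs):
        results = fake_rerank(
            "Where is the Eiffel Tower located?", list(permutation), top_k=1
        )
        assert len(results) == 1
        index, _score = results[0]
        # The winner must be the Paris document wherever it sits in the input.
        assert permutation[index] == "Paris is in France."


check_top_1_is_paris()
```

If the sort inside `fake_rerank` is removed, the check fails for every permutation that does not start with the Paris document, which is the failure mode the comment describes.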

michaelfeil · 2024-10-05T21:17:38Z

@aloababa Thanks for adding all the changes! One concern: I am not sure top_k is working as intended. I think it currently just truncates the list rather than actually selecting the top k, which would require sorting by score.

Some extra wishes:

implement sorting
verify sorting works.
stay as close as possible to cohere: https://docs.cohere.com/reference/rerank

Example why this currently does not work:

curl -X 'POST' \
  'https://infinity.modal.michaelfeil.eu/rerank' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "where is paris",
  "documents": [
    "italy", "us", "uk", "france"
  ],
  "return_documents": true,
  "raw_scores": false,
  "model": "mixedbread-ai/mxbai-rerank-xsmall-v1"
}'

{
  "object": "rerank",
  "results": [
    {
      "relevance_score": 0.018829345703125,
      "index": 0,
      "document": "italy"
    },
    {
      "relevance_score": 0.06927490234375,
      "index": 1,
      "document": "us"
    },
    {
      "relevance_score": 0.0240936279296875,
      "index": 2,
      "document": "uk"
    },
    {
      "relevance_score": 0.177001953125,
      "index": 3,
      "document": "france"
    }
  ],
  "model": "mixedbread-ai/mxbai-rerank-xsmall-v1",
  "usage": {
    "prompt_tokens": 71,
    "total_tokens": 71
  },
  "id": "infinity-14b91ebc-f8e2-4bfc-9593-bb11ec7ac95a",
  "created": 1728163265
}
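
Using the scores from the response above, a rough sketch of why sorting has to happen before truncation. `top_n_sorted` is a hypothetical helper written for this illustration, not the implementation that was merged.

```python
# Scores copied from the response above, in input order.
results = [
    {"relevance_score": 0.018829345703125, "index": 0, "document": "italy"},
    {"relevance_score": 0.06927490234375, "index": 1, "document": "us"},
    {"relevance_score": 0.0240936279296875, "index": 2, "document": "uk"},
    {"relevance_score": 0.177001953125, "index": 3, "document": "france"},
]


def top_n_sorted(results, top_n=None):
    """Sort by relevance_score (highest first), then keep the first top_n entries."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Truncating without sorting keeps "italy", the lowest-scoring document.
assert results[:1][0]["document"] == "italy"
# Sorting first keeps "france", the highest-scoring document.
assert top_n_sorted(results, top_n=1)[0]["document"] == "france"
assert [r["document"] for r in top_n_sorted(results)] == ["france", "us", "uk", "italy"]
```

Each entry keeps its original `index`, so a caller can still map a ranked result back to its position in the request.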

* add top_n parameter to rerank endpoint
* RerankInput: top_k as optional + add field validation
* handle top_k parameter in AsyncEngine
* add tests for top_k
* MichaelFeil: add sorting, dtype, to rerank api
* add parametrization for engine
* fix: typing

---------

Co-authored-by: Benjamin Gustin <[email protected]>

greptile-apps bot reviewed Oct 4, 2024

aloababa added 4 commits October 5, 2024 21:20

add top_k parameter to rerank endpoint

6b13f87

RerankInput: top_k as optional + add field validation

ca9b108

handle top_k parameter in AsyncEngine

cf4f97a

add tests for top_k

47524b7

aloababa force-pushed the top_k branch from 5526aa1 to 47524b7 on October 5, 2024 19:20

michaelfeil requested changes Oct 5, 2024

michaelfeil changed the base branch from main to tmp-top-n October 7, 2024 07:20

michaelfeil merged commit 7e0b5e5 into michaelfeil:tmp-top-n Oct 7, 2024
27 of 36 checks passed
