add top_k parameter to rerank endpoint #396
PR Summary
This PR adds a 'top_k' parameter to the rerank endpoint, allowing users to specify the number of top documents to return after reranking, as requested in Issue #342.
- Added 'top_k' parameter with default value 0 to RerankInput model in libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py
- Implemented 'top_k' functionality in rerank endpoint in libs/infinity_emb/infinity_emb/infinity_server.py
- The new parameter aligns with industry standards for reranking APIs, improving flexibility and usability
- Consider adding input validation for 'top_k' to ensure it's a non-negative integer
- Recommend updating documentation and adding tests to cover the new functionality
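The input validation suggested in the summary could look roughly like the sketch below. It uses a standard-library dataclass rather than the project's actual Pydantic `RerankInput` model; the class name `RerankRequestSketch` and every field other than `top_k` are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class RerankRequestSketch:
    """Hypothetical stand-in for the request model; not the project's RerankInput."""

    query: str
    documents: list[str] = field(default_factory=list)
    top_k: int = 0  # 0 means "no truncation", matching the PR's `if data.top_k > 0` check

    def __post_init__(self) -> None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.top_k, bool) or not isinstance(self.top_k, int):
            raise TypeError("top_k must be an integer")
        if self.top_k < 0:
            raise ValueError("top_k must be non-negative")
```

With this shape, a negative or non-integer `top_k` fails when the request object is built, before any reranking work is done.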
2 file(s) reviewed, 1 comment(s)
if data.top_k > 0:
    data.documents = data.documents[: data.top_k]
    scores = scores[: data.top_k]
logic: Add error handling for case where top_k > len(data.documents)
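For context on the case flagged here: Python slicing clamps an out-of-range stop index, so a `top_k` larger than the list length returns the whole list instead of raising. A small sketch, using a hypothetical helper `truncate_top_k` that is not part of the PR:

```python
def truncate_top_k(items: list, top_k: int) -> list:
    """Return the first top_k items; top_k <= 0 keeps everything (hypothetical helper)."""
    if top_k > 0:
        # Slicing past the end is safe: the stop index is clamped to len(items).
        return items[:top_k]
    return items


scores = [0.9, 0.5, 0.1]
print(truncate_top_k(scores, 2))   # [0.9, 0.5]
print(truncate_top_k(scores, 10))  # [0.9, 0.5, 0.1]
print(truncate_top_k(scores, 0))   # [0.9, 0.5, 0.1]
```

Whether an oversized `top_k` should be silently clamped or rejected with an error is a design choice for the endpoint; the sketch only shows the default slicing behavior.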
Thanks for opening the PR. Unfortunately, it needs quite a few improvements. Please align with one of the existing protocols (jina, cohere, voyage, tei); happy to discuss whether this is a useful feature. Expectation would be:
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
- Coverage   78.62%   72.56%   -6.06%
==========================================
  Files          39       39
  Lines        3008     3011       +3
==========================================
- Hits         2365     2185     -180
- Misses        643      826     +183
I've updated the PR with the following changes:
@pytest.mark.anyio
async def test_reranker_top_k(client):
    query = "Where is the Eiffel Tower located?"
    documents = [
Potential flaw: if the backend does not sort the results, top_k=1 will always just take the first document.
Better unit test: wrap this in a unit test that also varies return_text. Make sure that the top_k=1 result is always Paris in this unit test, e.g. use https://docs.python.org/3/library/itertools.html
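```python
import itertools

# Something like this; `paris`, `us`, `uk` stand for the three test documents.
for return_text in [True, False]:
    for raw_score in [True, False]:
        for permutation in itertools.permutations([paris, us, uk]):
            ...  # your test above, asserting the top_k=1 result is always Paris
```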
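A more complete, self-contained sketch of the permutation idea follows. It does not call the real endpoint: `toy_rerank` is a hypothetical stand-in that scores documents by word overlap with the query, sorts them, and only then applies `top_k`. The three documents are illustrative, not the ones from the PR's test.

```python
import itertools
import re


def _words(text: str) -> set[str]:
    """Lowercased alphabetic tokens of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def toy_rerank(query: str, documents: list[str], top_k: int = 0) -> list[tuple[int, float]]:
    """Hypothetical reranker: returns (original_index, score) pairs, best first."""
    q = _words(query)
    scored = [(i, float(len(q & _words(doc)))) for i, doc in enumerate(documents)]
    # Sort by score before truncating; truncating first would just keep the first inputs.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k] if top_k > 0 else scored


query = "Where is the Eiffel Tower located?"
paris = "The Eiffel Tower is located in Paris, France."
us = "The Statue of Liberty stands in New York."
uk = "Big Ben can be found in London."

for permutation in itertools.permutations([paris, us, uk]):
    docs = list(permutation)
    results = toy_rerank(query, docs, top_k=1)
    assert len(results) == 1
    # Whatever the input order, the single result must point at the Paris document.
    assert docs[results[0][0]] == paris
```

If the sort line is removed, the assertion fails for every permutation that does not start with the Paris document, which is the flaw described in the comment above.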
@aloababa Thanks for adding all the changes! One concern: I am not sure if the […]
Some extra wish:
Example why this currently does not work:
* add top_n parameter to rerank endpoint
* RerankInput: top_k as optional + add field validation
* handle top_k parameter in AsyncEngine
* add tests for top_k
* MichaelFeil: add sorting, dtype, to rerank api
* add parametrization for engine
* fix: typing

---------

Co-authored-by: Benjamin Gustin <[email protected]>
This PR adds a top_k parameter (the number of top documents to return after reranking) to the rerank endpoint.
#342