
Jina rerankers (turbo/tiny) being classified as embedding models #325

Open
2 of 4 tasks
John42506176Linux opened this issue Jul 25, 2024 · 11 comments

Comments

@John42506176Linux

John42506176Linux commented Jul 25, 2024

System Info

  • AWS EC2 G4dn instance
  • Amazon Linux
  • Model: jinaai/jina-reranker-v1-tiny-en or jinaai/jina-reranker-v1-turbo-en
  • Hardware: NVIDIA GPU (nvidia-smi)
  • Latest Docker image

Command:

port=7997
rerank_model=jinaai/jina-reranker-v1-tiny-en
volume=$PWD/data

sudo docker run -it --gpus all \
  -v $volume:/app/.cache \
  -p $port:$port \
  michaelf34/infinity:latest \
  v2 \
  --batch-size 256 \
  --model-id $rerank_model \
  --port $port

Information

  • Docker
  • The CLI directly via pip

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Run the following command:

port=7997
rerank_model=jinaai/jina-reranker-v1-tiny-en
volume=$PWD/data

sudo docker run -it --gpus all \
  -v $volume:/app/.cache \
  -p $port:$port \
  michaelf34/infinity:latest \
  v2 \
  --batch-size 256 \
  --model-id $rerank_model \
  --port $port

Then attempt to use the /rerank endpoint with a simple request body:

{
  "query": "test",
  "documents": ["test"],
  "return_documents": false,
  "model": "jinaai/jina-reranker-v1-tiny-en"
}

and you will get the following error:

{
  "error": {
    "message": "ModelNotDeployedError: model=jinaai/jina-reranker-v1-tiny-en does not support rerank. Reason: the loaded moded cannot fullyfill rerank.options are {'embed'}.",
    "type": null,
    "param": null,
    "code": 400
  }
}

I've tested this with other inference servers such as Text Embeddings Inference and the same error occurs; however, it does not occur with the plain transformers library.
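For intuition, the gate producing this error is a capabilities check: the server derives a capability set from the model's config.json and refuses tasks outside that set. A minimal sketch of the idea (the mapping and function names below are illustrative assumptions, not Infinity's actual code):

```python
# Illustrative sketch: how a server might derive task capabilities from a
# HF-style config.json. Names and the mapping rule are hypothetical.

RERANK_ARCHITECTURE_SUFFIX = "ForSequenceClassification"

def capabilities_from_config(config: dict) -> set:
    """Return the task capabilities implied by a config dict."""
    archs = config.get("architectures", [])
    if any(a.endswith(RERANK_ARCHITECTURE_SUFFIX) for a in archs):
        return {"rerank"}
    # No classifier head declared -> treated as a plain encoder (embeddings)
    return {"embed"}

# jina-reranker-v1-tiny-en originally declared a plain encoder architecture:
print(capabilities_from_config({"architectures": ["JinaBertModel"]}))  # {'embed'}
print(capabilities_from_config({"architectures": ["JinaBertForSequenceClassification"]}))  # {'rerank'}
```

Under this reading, the model is not misloaded by the server so much as mis-declared by its own config.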

Expected behavior

Should be able to rerank with these models.

Contributor

greptile-apps bot commented Jul 25, 2024

Resolution Steps

  1. Update create_server function in /libs/infinity_emb/infinity_emb/infinity_server.py:

    • Modify _resolve_engine to include reranker models in its capabilities check.
    def _resolve_engine(model: str) -> "AsyncEmbeddingEngine":
        try:
            engine: "AsyncEmbeddingEngine" = app.engine_array[model]  # type: ignore
        except IndexError as ex:
            raise errors.OpenAIException(
                f"Invalid model: {ex}",
                code=status.HTTP_400_BAD_REQUEST,
            )
        if engine.is_overloaded():
            raise errors.OpenAIException(
                f"model {model} is currently overloaded",
                code=status.HTTP_429_TOO_MANY_REQUESTS,
            )
        if "rerank" not in engine.capabilities:
            raise errors.OpenAIException(
                f"ModelNotDeployedError: model=`{model}` does not support `rerank`.",
                code=status.HTTP_400_BAD_REQUEST,
            )
        return engine
  2. Update tests in /libs/infinity_emb/tests/end_to_end/test_torch_reranker.py:

    • Ensure reranker models are correctly identified and tested.
    @pytest.mark.anyio
    async def test_reranker(client, model_base, helpers):
        query = "Where is the Eiffel Tower located?"
        documents = [
            "The Eiffel Tower is located in Paris, France",
            "The Eiffel Tower is located in the United States.",
            "The Eiffel Tower is located in the United Kingdom.",
        ]
        response = await client.post(
            f"{PREFIX}/rerank",
            json={"model": MODEL, "query": query, "documents": documents},
        )
        assert response.status_code == 200
        rdata = response.json()
        assert "model" in rdata
        assert "usage" in rdata
        rdata_results = rdata["results"]

        predictions = [
            model_base.predict({"text": query, "text_pair": doc}) for doc in documents
        ]

        assert len(rdata_results) == len(predictions)
        for i, pred in enumerate(predictions):
            assert abs(rdata_results[i]["relevance_score"] - pred["score"]) < 0.01

References

/docs/docs/index.md
/README.md
/libs/infinity_emb/tests/end_to_end/test_optimum_embedding.py
/libs/infinity_emb/infinity_emb/fastapi_schemas/convert.py
/.github/ISSUE_TEMPLATE/new-model-addition.yml
/docs/docs/deploy.md
/docs/assets/openapi.json
/docs/docs
/libs/infinity_emb/tests/data/benchmark

/docs/benchmarks/simple_app.py
/.github/ISSUE_TEMPLATE/bug-report.yml
/libs/infinity_emb/tests/end_to_end/test_torch_reranker.py
/libs/infinity_emb/tests/conftest.py
/docs/docs/python_engine.md
/docs/benchmarks
/docs/docs/benchmarking.md
/docs
/libs/infinity_emb/tests/end_to_end/test_authentication.py
/libs/infinity_emb/tests/data
/libs/infinity_emb
/.github/workflows/release.yaml
/docs/assets
/.github/ISSUE_TEMPLATE
/libs/infinity_emb/infinity_emb/infinity_server.py

About Greptile

This response provides a starting point for your research, not a precise solution.


@michaelfeil
Owner

Damn, Greptile is pretty useless.

@michaelfeil
Owner

Does something like this work: https://huggingface.co/jinaai/jina-reranker-v1-turbo-en/discussions/10, i.e. --revision refs/pr/10? Seems like Jina messed up their config. mixedbread does it better here: https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1/blob/main/config.json

@michaelfeil
Owner

@John42506176Linux The detail is that you need to set "architectures": ["JinaBertForSequenceClassification"] in the config.json for the model to be recognized as a reranker. Would you be so kind as to open PRs on the Jina models for this? I appreciate your time.
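The change Michael describes is an edit to the model's config.json on the Hugging Face Hub; as a sketch, the relevant fragment would read (all other keys left unchanged):

```json
{
  "architectures": ["JinaBertForSequenceClassification"]
}
```

With this architecture name, loaders that dispatch on the architectures field build a sequence-classification (cross-encoder) head instead of a bare encoder.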

@John42506176Linux
Author

Assumed it was something simple; thanks for the quick response. I'm testing your first comment right now.

@John42506176Linux
Author

OK, looks good. Thanks for the quick fix and the quick response. I'll open a PR for the tiny model soon (need to finish testing the turbo model first). Thanks, this saved me some time :).

@John42506176Linux
Author

@michaelfeil Tiny gives the following error when making the config.json change:

RuntimeError: Error(s) in loading state_dict for JinaBertForSequenceClassification:
    size mismatch for classifier.weight: copying a param with shape torch.Size([1, 384]) from checkpoint, the shape in current model is torch.Size([2, 384]).
    size mismatch for classifier.bias: copying a param with shape torch.Size([1]) from checkpoint, the shape in current model is torch.Size([2]).
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
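The mismatch above can be reasoned about without loading anything: in transformers, a sequence-classification head's weight has shape (num_labels, hidden_size), and num_labels defaults to 2 when config.json does not specify it, while this checkpoint was trained with a single relevance-score output. A plain-Python sketch of that arithmetic (HIDDEN_SIZE and the helper are illustrative, taken from the shapes in the error):

```python
# Illustrative sketch of the classifier-head shape mismatch. No torch needed;
# the shapes come straight from the RuntimeError above.

HIDDEN_SIZE = 384  # hidden size of jina-reranker-v1-tiny-en, per the error message

def classifier_weight_shape(num_labels: int) -> tuple:
    """Weight shape a freshly built sequence-classification head expects."""
    return (num_labels, HIDDEN_SIZE)

checkpoint_shape = (1, HIDDEN_SIZE)                # what the saved weights contain
default_built_shape = classifier_weight_shape(2)   # transformers' default num_labels is 2

assert checkpoint_shape != default_built_shape       # -> the size-mismatch RuntimeError
assert classifier_weight_shape(1) == checkpoint_shape  # declaring num_labels=1 resolves it
```

This is exactly why the num_labels fix proposed later in the thread works.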

@michaelfeil
Owner

@John42506176Linux Seems like the reranker model has 2 outputs; that is not how rerankers are usually trained. Rerankers typically have one and only one output class.

With all due respect, I don't have time to fix Jina's questionable training choice here. The config file is ambiguous and leaves a lot of room for interpretation in how to load the model.

@John42506176Linux
Author

No worries, you already saved me time, by helping with turbo. Thanks for the assistance.

@wirthual
Collaborator

Hi @michaelfeil and @John42506176Linux ,

adding num_labels: 1 to the config as well seems to do the trick here.

I tested it with:

infinity_emb v2 --model-id jinaai/jina-reranker-v1-turbo-en --revision refs/pr/11

And it correctly shows up as a rerank model:

"capabilities":["rerank"]
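Putting the two config edits from this thread together, a sketch of the fixed config.json fragment for these models would be (all other keys omitted here for brevity):

```json
{
  "architectures": ["JinaBertForSequenceClassification"],
  "num_labels": 1
}
```

The architectures entry makes the model load as a cross-encoder, and num_labels: 1 sizes the classifier head to the single relevance-score output the checkpoint was trained with.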

@michaelfeil
Owner

@wirthual Thanks so much! This is exactly the solution for this model!
