Fix inference quality drop caused by missing temperature parameter in BLS #523
When the prompt and parameters are the same, the ensemble and tensorrt_llm_bls APIs return different results, and the ensemble result is the expected one. I analyzed the BLS code and found that inference quality dropped significantly in some scenarios because the temperature parameter was not being forwarded to the tensorrt_llm model. This problem has already caused many bad cases in our production services. A reproduction is sketched below.
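As a hypothetical reproduction sketch (not part of this PR), the snippet below sends an identical prompt and temperature to both endpoints and compares the outputs; the tensor names (`text_input`, `max_tokens`, `temperature`, `text_output`) follow the ensemble/BLS configs shipped with tensorrtllm_backend, and the URL and model names are assumptions to adjust for your deployment:

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

def infer(model_name: str, prompt: str, temperature: float) -> str:
    # Build batched [1, 1] inputs matching the ensemble/BLS config.
    tensors = [
        ("text_input", np.array([[prompt]], dtype=object), "BYTES"),
        ("max_tokens", np.array([[64]], dtype=np.int32), "INT32"),
        ("temperature", np.array([[temperature]], dtype=np.float32), "FP32"),
    ]
    inputs = []
    for name, tensor, dtype in tensors:
        inp = httpclient.InferInput(name, list(tensor.shape), dtype)
        inp.set_data_from_numpy(tensor)
        inputs.append(inp)

    result = client.infer(model_name, inputs)
    out = result.as_numpy("text_output").flatten()[0]
    return out.decode() if isinstance(out, bytes) else str(out)

prompt = "Explain the TensorRT-LLM BLS model in one sentence."
# Before this fix, the two calls below could differ because the BLS path
# silently dropped the temperature parameter.
print(infer("ensemble", prompt, temperature=0.7))
print(infer("tensorrt_llm_bls", prompt, temperature=0.7))
```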
After fixing the temperature problem, the BLEU and EM scores are close to those of vLLM on FP16; here is the comparative data:

[comparative data attached in the PR]

Here is the code: the fix is in the name_map.
I filed an issue about this earlier: #520