When I train Medusa, the medusa0, medusa1, and medusa2 heads each reach about 0.95 accuracy, so the training result looks fine.
Deploying the Medusa model with vLLM also succeeds,
but the test samples show no speedup: the draft acceptance rate is 0.0.
🐛 Describe the bug
Speculative metrics: Draft acceptance rate: 0.000, System efficiency: 0.250, Number of speculative tokens: 3, Number of accepted tokens: 0, Number of draft tokens: 483, Number of emitted tokens: 161.
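For anyone reading the log line above, here is a minimal sketch of how those two ratios relate to the raw counts. It assumes the acceptance rate is accepted tokens divided by draft tokens, and that system efficiency is emitted tokens divided by the maximum that could be emitted if every draft were accepted (k + 1 tokens per step). The function names are my own for illustration, not vLLM's API.

```python
# Sketch: recompute the logged speculative decoding ratios from the raw counts.
# Function names are hypothetical; the formulas are assumptions stated above.

def draft_acceptance_rate(accepted_tokens: int, draft_tokens: int) -> float:
    """Fraction of proposed draft tokens that the target model accepted."""
    if draft_tokens == 0:
        return float("nan")
    return accepted_tokens / draft_tokens


def system_efficiency(emitted_tokens: int, draft_tokens: int, k: int) -> float:
    """Emitted tokens relative to the best case of k + 1 tokens per step."""
    steps = draft_tokens // k          # each step proposes k draft tokens
    max_emitted = steps * (k + 1)      # k accepted drafts + 1 bonus token
    if max_emitted == 0:
        return float("nan")
    return emitted_tokens / max_emitted


# Counts taken from the log line in this issue.
k = 3
accepted, drafted, emitted = 0, 483, 161

rate = draft_acceptance_rate(accepted, drafted)
efficiency = system_efficiency(emitted, drafted, k)
print(rate)        # 0.0
print(efficiency)  # 0.25
```

With 483 draft tokens and k = 3 there were 161 steps, and each step emitted exactly one token. So a system efficiency of 0.250 is the floor for k = 3: every draft was rejected and only the target model's own token was kept.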
Could you double check that your Medusa model config is compatible with vLLM's requirements? As shown here, the model config is different from the original model config.
Thanks. I tried again following your command and it works. I also found that removing typical_acceptance_sampler, or using the rejection sampler instead, makes it work fine.
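In case it helps others, here is a rough sketch of assembling a launch command that selects the rejection sampler explicitly. It assumes the `vllm serve` flags `--speculative-model`, `--num-speculative-tokens`, and `--spec-decoding-acceptance-method` as I understand them for vLLM 0.6.x, so please confirm against `vllm serve --help` for your version. The model paths and the helper name are placeholders, not values from this issue.

```python
import shlex

# Acceptance methods as I understand them for vLLM 0.6.x (assumption).
ALLOWED_METHODS = {"rejection_sampler", "typical_acceptance_sampler"}


def build_serve_command(base_model: str,
                        medusa_model: str,
                        num_speculative_tokens: int = 3,
                        acceptance_method: str = "rejection_sampler") -> list[str]:
    """Build (but do not run) a vLLM launch command for Medusa decoding."""
    if acceptance_method not in ALLOWED_METHODS:
        raise ValueError(f"unknown acceptance method: {acceptance_method}")
    return [
        "vllm", "serve", base_model,
        "--speculative-model", medusa_model,
        "--num-speculative-tokens", str(num_speculative_tokens),
        "--spec-decoding-acceptance-method", acceptance_method,
    ]


# Placeholder paths; substitute your own base model and Medusa heads.
cmd = build_serve_command("path/to/base-model", "path/to/medusa-heads")
print(shlex.join(cmd))
```

The helper only builds the argument list, so you can inspect it before launching anything.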
Your current environment
vllm==0.6.1