Getting an error while using a PEFT model as a reward model in PPO training. #2911
Labels: 🐛 bug, 🚀 deepspeed, ⚡ PEFT, 🏋 PPO
AttributeError in `DeepSpeedEngine` when accessing `base_model_prefix` for PPO training using trl

Description
When running PPO training with a PEFT-based reward model wrapped in DeepSpeed, an `AttributeError` occurs when trying to access `base_model_prefix` from the model. It seems that `DeepSpeedEngine` does not expose this attribute, leading to a crash.

Error Traceback
Code to Reproduce
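To illustrate the failure described in the Description, here is a minimal sketch that uses stand-in classes instead of the real libraries. It is not the reporter's reproduction script and not a confirmed fix. `TinyRewardModel`, `EngineLikeWrapper`, and `find_base_model_prefix` are hypothetical names; the only assumptions about the real stack are that the engine keeps the wrapped model on a `.module` attribute and that the underlying model defines `base_model_prefix`.

```python
class TinyRewardModel:
    """Hypothetical stand-in for a model that defines base_model_prefix."""
    base_model_prefix = "model"

    def __init__(self):
        self.model = object()  # placeholder for the backbone


class EngineLikeWrapper:
    """Hypothetical stand-in for an engine that stores the model on `.module`
    and does not forward unknown attributes to it."""

    def __init__(self, module):
        self.module = module


def find_base_model_prefix(model, max_depth=8):
    """Follow `.module` links until an object exposing base_model_prefix is found."""
    current = model
    for _ in range(max_depth):
        if hasattr(current, "base_model_prefix"):
            return current, current.base_model_prefix
        if not hasattr(current, "module"):
            break
        current = current.module
    raise AttributeError("no wrapped object exposes base_model_prefix")


wrapped = EngineLikeWrapper(TinyRewardModel())

# Direct access on the wrapper raises, mirroring the reported crash.
try:
    wrapped.base_model_prefix
except AttributeError as exc:
    print("direct access fails:", exc)

# Unwrapping first reaches the attribute on the inner model.
inner, prefix = find_base_model_prefix(wrapped)
backbone = getattr(inner, prefix)
```

Whether unwrapping like this is appropriate in a real PPO run depends on the DeepSpeed, PEFT, and trl versions in use, so treat it as a way to reason about the error rather than a drop-in patch.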