Getting an error while using a PEFT model as a reward model in PPO training. #2911

Tarak200 · 2025-02-20T06:58:14Z

AttributeError in `DeepSpeedEngine` when accessing `base_model_prefix` for PPO training using trl

Description

When running PPO training with a PEFT-based reward model wrapped in DeepSpeed, `get_reward` raises an `AttributeError` while reading `base_model_prefix` from the model. `DeepSpeedEngine` does not appear to expose this attribute, so training crashes.

Error Traceback

Traceback (most recent call last):
[rank6]:   File "/raid/ganesh/nithin/trl/examples/scripts/ppo/ppo.py", line 211, in <module>
[rank6]:     trainer.train()
[rank6]:   File "/raid/ganesh/nithin/trl/trl/trainer/ppo_trainer.py", line 462, in train
[rank6]:     _, score, _ = get_reward(
[rank6]:                   ^^^^^^^^^^^
[rank6]:   File "/raid/ganesh/nithin/trl/trl/trainer/utils.py", line 1179, in get_reward
[rank6]:     lm_backbone = getattr(model, model.base_model_prefix)
[rank6]:                                  ^^^^^^^^^^^^^^^^^^^^^^^
[rank6]:   File "/raid/ganesh/nithin/trl/myenv/lib/python3.12/site-packages/deepspeed/runtime/engine.py", line 519, in __getattr__
[rank6]:     raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
[rank6]: AttributeError: 'DeepSpeedEngine' object has no attribute 'base_model_prefix'

Code to Reproduce

import torch  # needed for torch.bfloat16 below
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

reward_model_path = "Qwen/Qwen2.5-Coder-3B-Instruct"

reward_model = AutoModelForSequenceClassification.from_pretrained(
    reward_model_path,
    trust_remote_code=True,
    num_labels=1,
    quantization_config=quantization_config,
)

adapter_path = "./reward_model_SC-qwen-3B-4096/final_checkpoint"
reward_model_ft = PeftModel.from_pretrained(reward_model, adapter_path)


#### Expected Behavior
- The model should correctly load as a PEFT-based reward model.
- The training should proceed without errors related to `base_model_prefix`.

#### Actual Behavior
- The script crashes when trying to access `model.base_model_prefix` since `DeepSpeedEngine` does not have this attribute.
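
One possible reading of the traceback, not confirmed against the DeepSpeed 0.16.3 source, is that the engine forwards only attributes listed in `dir()` of the wrapped module, while a PEFT-style wrapper resolves `base_model_prefix` dynamically through `__getattr__`, so the name never shows up in `dir()`. The sketch below reproduces that failure mode with plain-Python stand-ins. Every class and the `unwrap_engine` helper are hypothetical names for illustration, not DeepSpeed, PEFT, or TRL APIs.

```python
class Backbone:
    """Stand-in for the transformer backbone (hypothetical)."""


class SequenceClassifier:
    """Stand-in for a model that defines base_model_prefix as a class attribute."""
    base_model_prefix = "model"

    def __init__(self):
        self.model = Backbone()


class DynamicForwarder:
    """Stand-in for an adapter wrapper that forwards unknown attributes dynamically."""

    def __init__(self, wrapped):
        self.base_model = wrapped

    def __getattr__(self, name):
        # Called only when normal lookup fails; forwarded names are NOT in dir(self).
        if name == "base_model":
            raise AttributeError(name)
        return getattr(self.base_model, name)


class DirCheckingEngine:
    """Stand-in for an engine that forwards only names found in dir(module).

    Assumption: this mirrors the engine's behaviour; it is not its actual code.
    """

    def __init__(self, module):
        self.module = module

    def __getattr__(self, name):
        module = self.__dict__.get("module")
        if module is not None and name in dir(module):
            return getattr(module, name)
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )


def unwrap_engine(model):
    """Peel off wrappers that keep the wrapped model in a `.module` attribute."""
    inner = getattr(model, "module", None)
    while inner is not None:
        model = inner
        inner = getattr(model, "module", None)
    return model


engine = DirCheckingEngine(DynamicForwarder(SequenceClassifier()))

try:
    engine.base_model_prefix
    lookup_failed = False
except AttributeError:
    lookup_failed = True  # same kind of error as in the traceback

inner = unwrap_engine(engine)
backbone = getattr(inner, inner.base_model_prefix)
```

If this reading is right, a workaround could be to unwrap the engine before reading `base_model_prefix`. Whether the real PEFT wrapper then returns the intended backbone for that prefix would still need to be checked.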

#### Environment
- **DeepSpeed Version:** `0.16.3`
- **Transformers Version:** `4.48.3`
- **TRL Version:** `0.15.0`
- **Python Version:** `3.12`
- **Model Used:** `Qwen/Qwen2.5-Coder-3B-Instruct`
- **Hardware:** `8 A100s`

Would appreciate any guidance or solutions. Thanks!

github-actions bot added the labels ⚡ PEFT, 🏋 PPO, 🚀 deepspeed, and 🐛 bug on Feb 20, 2025.
