Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
[Tracking issue] Wrong loss scaling when accumulating gradient
#2617
opened Jan 23, 2025 by
qgallouedec
Is there any problem with GRPOTrainer’s memory usage?
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2927
opened Feb 21, 2025 by
Tuziking
NCCL timeout when GRPO training with vllm
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2923
opened Feb 21, 2025 by
edwardzjl
How to support multi-device VLLM inference in the GRPO Trainer
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2922
opened Feb 21, 2025 by
0x404
simple question: SFTTrainer ValueError
🐛 bug
Something isn't working
🏋 SFT
Related to SFT
#2920
opened Feb 21, 2025 by
jbw3016
Fine tuning "thinking"/"reasoning" models
✨ enhancement
New feature or request
🏋 SFT
Related to SFT
#2919
opened Feb 21, 2025 by
GhostDog98
GRPO for VLM models?
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2917
opened Feb 20, 2025 by
dipta007
Clarification on KL Divergence Computation in GRPOTrainer
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2914
opened Feb 20, 2025 by
zhaopku
Getting an error while using a PEFT model as a reward model in PPO training.
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2911
opened Feb 20, 2025 by
Tarak200
L447 of GRPO trainer 'num_return_sequences=self.num_generations'
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2910
opened Feb 20, 2025 by
zhengqigao
Cannot import name 'shard_checkpoint' (possibly deprecated in transformers)
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
⚡ PEFT
Related to PEFT
#2909
opened Feb 20, 2025 by
anshuln2
DPOTrainer loss goes down to 0.0 during training, but at the end it reports a train_loss of 0.15 - the loss during training and at the end differ substantially
⚡ accelerate
Related to accelerate
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
⚡ PEFT
Related to PEFT
#2907
opened Feb 19, 2025 by
KemalDrop
How to use GRPOTrainer to train an LLM for code generation? What is the format of the dataset?
#2905
opened Feb 19, 2025 by
xiangxinhello
Save memory when layers are shared with ref model?
🐛 bug
Something isn't working
✨ enhancement
New feature or request
#2904
opened Feb 19, 2025 by
raphael-sch
GRPO completions skip special tokens?
🏋 GRPO
Related to GRPO
#2897
opened Feb 18, 2025 by
MohamedAliRashad
[Qwen2.5] LoRA with SFT seems to be stuck forever with DeepSpeed
🚀 deepspeed
Related to deepspeed
⚡ PEFT
Related to PEFT
🏋 SFT
Related to SFT
#2891
opened Feb 18, 2025 by
sayakpaul
Reuse the logits in _prepare_inputs.
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2888
opened Feb 18, 2025 by
linkedlist771
Bottleneck in GRPO training
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2887
opened Feb 18, 2025 by
ZYM66
transformers error when using the official GRPO examples with the exact trainer
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2886
opened Feb 18, 2025 by
vagitablebirdcode
PPOTrainer save_model function gets an error when saving; no attribute 'zero_gather_16bit_weights_on_model_save'
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 PPO
Related to PPO
#2885
opened Feb 18, 2025 by
Havefun404
Reward oscillates while reproducing R1-zero with GRPO
🏋 GRPO
Related to GRPO
🏋 Reward
Related to Reward modelling
#2884
opened Feb 18, 2025 by
Dong237
ORPO Shape Mismatches when using Accelerate/Deepspeed
⚡ accelerate
Related to accelerate
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 ORPO
Related to ORPO
#2882
opened Feb 17, 2025 by
dannnnthemannnn
tensor shape error occurs when training with GRPO and use_vllm = False
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2878
opened Feb 17, 2025 by
Saturnoul
AssertionError in GRPO
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2877
opened Feb 17, 2025 by
GuodongFan
I have this strange error with GRPO Trainer
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2876
opened Feb 16, 2025 by
MohamedAliRashad