
Issues: huggingface/trl



Issues list

Is there any problem with GRPOtrainer’s memory usage? 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2927 opened Feb 21, 2025 by Tuziking
NCCL timeout when GRPO training with vllm 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2923 opened Feb 21, 2025 by edwardzjl
How to support multi-device VLLM inference in the GRPO Trainer ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2922 opened Feb 21, 2025 by 0x404
simple question: SFTTrainer ValueError 🐛 bug Something isn't working 🏋 SFT Related to SFT
#2920 opened Feb 21, 2025 by jbw3016
Fine tuning "thinking"/"reasoning" models ✨ enhancement New feature or request 🏋 SFT Related to SFT
#2919 opened Feb 21, 2025 by GhostDog98
GRPO from VLM models? ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2917 opened Feb 20, 2025 by dipta007
Clarification on KL Divergence Computation in GRPOTrainer 🏋 GRPO Related to GRPO ❓ question Seeking clarification or more information
#2914 opened Feb 20, 2025 by zhaopku
5 tasks done
Getting an error while using a PEFT model as a reward model in PPO training. 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed ⚡ PEFT Related to PEFT 🏋 PPO Related to PPO
#2911 opened Feb 20, 2025 by Tarak200
L447 of GRPO trainer 'num_return_sequences=self.num_generations' 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2910 opened Feb 20, 2025 by zhengqigao
5 tasks done
Cannot import name 'shard_checkpoint' (possibly deprecated in transformers) 🐛 bug Something isn't working 🏋 GRPO Related to GRPO ⚡ PEFT Related to PEFT
#2909 opened Feb 20, 2025 by anshuln2
5 tasks done
Save memory when layers are shared with ref model? 🐛 bug Something isn't working ✨ enhancement New feature or request
#2904 opened Feb 19, 2025 by raphael-sch
GRPO completions skip special tokens ? 🏋 GRPO Related to GRPO ⏳ needs more info Additional information or clarification is required to proceed 🏋 Reward Related to Reward modelling
#2897 opened Feb 18, 2025 by MohamedAliRashad
[Qwen2.5] LoRA with SFT seems to be stuck forever with DeepSpeed 🚀 deepspeed Related to deepspeed ⚡ PEFT Related to PEFT 🏋 SFT Related to SFT
#2891 opened Feb 18, 2025 by sayakpaul
Resue the logits in the _prepare_inputs. ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2888 opened Feb 18, 2025 by linkedlist771
Bottleneck in GRPO training ✨ enhancement New feature or request 🏋 GRPO Related to GRPO
#2887 opened Feb 18, 2025 by ZYM66
Cause transformers error when use official GRPO examples to exact trainer 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2886 opened Feb 18, 2025 by vagitablebirdcode
5 tasks done
Rewad oscillates while reproduing R1-zero with GRPO 🏋 GRPO Related to GRPO 🏋 Reward Related to Reward modelling
#2884 opened Feb 18, 2025 by Dong237
5 tasks done
Does the vllm 0.6 is ok? 🐛 bug Something isn't working
#2883 opened Feb 18, 2025 by catsled
ORPO Shape Mismatches when using Accelerate/Deepspeed ⚡accelerate Related to accelerate 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed 🏋 ORPO Related to ORPO
#2882 opened Feb 17, 2025 by dannnnthemannnn
5 tasks done
tensor shape error occurs when training with GRPO and use_vllm = False 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2878 opened Feb 17, 2025 by Saturnoul
5 tasks done
AssertionError grpo 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2877 opened Feb 17, 2025 by GuodongFan
5 tasks done
I have this strange error with GRPO Trainer 🐛 bug Something isn't working 🏋 GRPO Related to GRPO
#2876 opened Feb 16, 2025 by MohamedAliRashad