πŸ‘¨β€πŸ‘¨β€πŸ‘§β€πŸ‘§ GRPO #2565

Merged
merged 66 commits into from
Jan 20, 2025
Commits
6f489b4
init grpo [ci skip]
qgallouedec Jan 12, 2025
5030c56
initial version
qgallouedec Jan 13, 2025
bec5491
refine args defs
qgallouedec Jan 13, 2025
47e5dcc
model card
qgallouedec Jan 13, 2025
0b484b2
initial doc
qgallouedec Jan 13, 2025
3a5ecb2
Merge branch 'main' into grpo
qgallouedec Jan 13, 2025
5893e29
fix badges
qgallouedec Jan 13, 2025
9ad30cf
Merge branch 'grpo' of https://github.com/huggingface/trl into grpo
qgallouedec Jan 13, 2025
205d817
fix spaces
qgallouedec Jan 13, 2025
5270694
try link to super in doc
qgallouedec Jan 13, 2025
f37e348
temperature, fix indexing, and std=0.0
qgallouedec Jan 13, 2025
ca5b388
grpo script for cli
qgallouedec Jan 13, 2025
bea601c
peft support
qgallouedec Jan 13, 2025
5848d35
move data preparation in `compute_loss`
qgallouedec Jan 13, 2025
5f1f8c1
weird doc trial
qgallouedec Jan 13, 2025
7ee4107
fix device and some logging
qgallouedec Jan 13, 2025
af704ce
unwrap_model_for_generation for distributed setting
qgallouedec Jan 13, 2025
14ac49f
Compat with distrib training
qgallouedec Jan 14, 2025
3ccf20a
revert grpo config doc trial (didn't work)
qgallouedec Jan 14, 2025
80a166d
test
qgallouedec Jan 14, 2025
ab29a79
allow model to be str and processing_class to be none; fix loss compu…
qgallouedec Jan 14, 2025
469ff33
advantage is always 0.0: don't log
qgallouedec Jan 14, 2025
14bf93f
fix peft not installed
qgallouedec Jan 14, 2025
7a5cb32
proper reward model for testing
qgallouedec Jan 14, 2025
7606ca5
fix script for cli
qgallouedec Jan 14, 2025
a7760d6
add trl grpo to cli doc
qgallouedec Jan 14, 2025
7223a21
test peft
qgallouedec Jan 14, 2025
c70402c
flush left
qgallouedec Jan 14, 2025
106d271
fix reward calculation
qgallouedec Jan 14, 2025
1a98c6e
new reward model
qgallouedec Jan 15, 2025
defa22d
support any reward model
qgallouedec Jan 15, 2025
b9035fd
fix reward processing class def
qgallouedec Jan 15, 2025
c597c62
log reward std
qgallouedec Jan 15, 2025
be1c21e
fix reward logging
qgallouedec Jan 16, 2025
071c19a
fix grad computation
qgallouedec Jan 16, 2025
16c9110
skip embed layer in test
qgallouedec Jan 17, 2025
7c5b2fb
remove optimizer_cls_and_kwargs
qgallouedec Jan 17, 2025
c2f0254
improve GRPO default args
qgallouedec Jan 17, 2025
aa739f1
Merge branch 'main' into grpo
qgallouedec Jan 17, 2025
4c8c3f2
reduce mem usage for grpo test
qgallouedec Jan 17, 2025
189a4ab
Merge branch 'grpo' of https://github.com/huggingface/trl into grpo
qgallouedec Jan 17, 2025
066ce33
reduce mem usage in test grpo
qgallouedec Jan 17, 2025
e894a6c
reduce memory usage for test
qgallouedec Jan 17, 2025
c6690da
Fix the test
qgallouedec Jan 17, 2025
c4422b9
Merge branch 'main' into grpo
qgallouedec Jan 17, 2025
431c396
remove redondant
qgallouedec Jan 17, 2025
5aa0c4f
fix min version
qgallouedec Jan 18, 2025
00e1e1a
Update test_grpo_trainer.py
qgallouedec Jan 18, 2025
3766535
Update test_grpo_trainer.py
qgallouedec Jan 18, 2025
73542d4
Fix test, finally found the solution!
qgallouedec Jan 18, 2025
2b060dd
some doc
qgallouedec Jan 18, 2025
3ee8816
Merge branch 'main' into grpo
qgallouedec Jan 20, 2025
de972cd
Update doc-builder workflow to use specific commit sha
qgallouedec Jan 20, 2025
9dcb231
Merge branch 'grpo' of https://github.com/huggingface/trl into grpo
qgallouedec Jan 20, 2025
6684bbd
more doc
qgallouedec Jan 20, 2025
49d8c7e
advantages
qgallouedec Jan 20, 2025
a974250
drop cancel fo no grad
qgallouedec Jan 20, 2025
dc65083
logged metrics [ci skip]
qgallouedec Jan 20, 2025
35599a4
completion col is ignored [ci skip]
qgallouedec Jan 20, 2025
73423e4
fix latex
qgallouedec Jan 20, 2025
3e944ce
double space? ~?
qgallouedec Jan 20, 2025
d48e0b3
try a latex fix
qgallouedec Jan 20, 2025
a8466ba
with branch
qgallouedec Jan 20, 2025
1154530
Empty commit
qgallouedec Jan 20, 2025
14b798b
Empty commit
qgallouedec Jan 20, 2025
cd45878
double space seems to be the solution
qgallouedec Jan 20, 2025
Empty commit
qgallouedec committed Jan 20, 2025
commit 14b798b3f3fb4d8c4fb26db422ad2fbb965c34ba

No changes to show.

This commit has no content.