ORPO (Or DPO?) #1350
Closed
exdownloader started this conversation in General
Replies: 0 comments
I've seen a few discussions about DPO for sd-scripts, specifically this and this.
However, from what I can tell, there hasn't been any further movement on either.
ORPO is related to DPO, and some even consider it superior; notably, it folds preference optimization into the ordinary fine-tuning objective and does not require a separate frozen reference model.
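For reference, ORPO works by adding an odds-ratio penalty term to the usual SFT loss. A minimal sketch of just that penalty, assuming length-averaged log-likelihoods as inputs (the function and parameter names here are illustrative, not taken from any sd-scripts fork):

```python
import math

def orpo_penalty(logp_w, logp_l, lam=0.1):
    """Sketch of the ORPO odds-ratio penalty for one preference pair.

    logp_w, logp_l: length-averaged log-likelihoods of the preferred
    (w) and rejected (l) samples under the model being trained.
    Unlike DPO, no reference model is involved.
    """
    def log_odds(logp):
        p = math.exp(logp)               # average per-token likelihood
        return logp - math.log(1.0 - p)  # log(p / (1 - p))

    ratio = log_odds(logp_w) - log_odds(logp_l)
    # Negative log-sigmoid pushes the odds of the preferred sample up
    # relative to the rejected one; lam weights the penalty.
    return -lam * math.log(1.0 / (1.0 + math.exp(-ratio)))
```

The full ORPO loss is then just `NLL(preferred) + orpo_penalty(...)`, which is why it trains with a single model pass per sample pair.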
I was recently browsing the various forks of sd-scripts and found the following repo, which appears to be under active development.
The branch doesn't function for me; it errors out with the following:
I think any kind of preference training would be interesting to explore, and I'd be happy to see this kind of feature in sd-scripts. However, I haven't been able to reach the developer of that fork, so I'm raising awareness here in case there's a chance to gain traction.
After speaking with other AI/ML researchers and developers, I have been informed that regular DPO training is "easy" to implement.
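For what it's worth, the core of the DPO objective really is compact: a log-sigmoid over the policy-vs-reference log-probability margin on a preference pair. A minimal sketch of that formulation (names are illustrative; a diffusion variant such as Diffusion-DPO would substitute noise-prediction errors for sequence log-probs):

```python
import math

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Sketch of the DPO loss for one preference pair.

    pi_logp_*  : log-probs of the chosen (w) / rejected (l) samples
                 under the policy being trained.
    ref_logp_* : the same log-probs under the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen sample over the rejected one, relative to the reference.
    margin = (pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l)
    # Negative log-sigmoid of the scaled margin.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference the margin is zero and the loss is ln 2; it shrinks as the policy widens the margin in favor of the chosen sample. The hard part in practice is the data/infra plumbing (paired preference sets, a frozen reference copy of the model), not the loss itself.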