https://arxiv.org/abs/2502.14538v1
LoRA-GGPO (Gradient-Guided Perturbation Optimization) suggests that double descent in LoRA fine-tuning can be mitigated. To do this, the weight norm and gradient norm are computed and used to perturb the activations with a random matrix.
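As a rough sketch of that idea (not the actual code in this PR: the class and attribute names are illustrative, and the way `sigma` and `beta` combine the two norms here is my reading of the description above, not the paper's exact formula):

```python
import torch
import torch.nn as nn

class GGPOPerturbation(nn.Module):
    """Adds a random perturbation to activations, scaled by cached
    weight-norm and gradient-norm statistics (sketch only)."""

    def __init__(self, sigma: float = 0.03, beta: float = 0.01):
        super().__init__()
        self.sigma = sigma  # weight-norm coefficient (ggpo_sigma)
        self.beta = beta    # grad-norm coefficient (ggpo_beta)
        # These would be refreshed periodically during training.
        self.weight_norm = 1.0
        self.grad_norm = 0.0

    def forward(self, activations: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return activations  # no perturbation at inference time
        scale = self.sigma * self.weight_norm + self.beta * self.grad_norm
        # "Random matrix" perturbation applied to the activations.
        return activations + torch.randn_like(activations) * scale
```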
The chart in the paper shows LoRA+ and rsLoRA as separate baselines, but they are not mutually exclusive with GGPO and can be used together with it.
The downside is that it can currently increase training time by a semi-significant margin. I was able to get the slowdown to around 20%, but better code could probably push it lower; the paper suggests about 5% overhead, which I was not able to hit, so there is still room to improve speed.
Usage:

```
--network_args ggpo_sigma=0.03 ggpo_beta=0.01
```
Generally, keeping the values above is a good choice; they are the values presented in the paper.
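As a fuller example of where the flag goes (a sketch only: the script and network module shown are the generic sd-scripts ones, so substitute the Lumina equivalents as appropriate, and the `...` stands for whatever other flags your run already uses):

```
accelerate launch train_network.py \
  --network_module networks.lora \
  ... \
  --network_args "ggpo_sigma=0.03" "ggpo_beta=0.01"
```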
It is working at this stage and I have been using it. It was originally implemented for the new Lumina LoRA but could be ported to SD1.5 and SDXL LoRAs as well.
I also want to consider merging the norm calculations of scale_weight_norm with the norms computed here, and making the schedule for how often the weight norms are recalculated configurable (see the sketch below). Displaying the weight and gradient norms, and recording them in the logs, would also be nice.
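For instance, a configurable refresh schedule could look something like this (a minimal sketch, assuming the norms are cached as attributes on the module; the function name, attribute names, and default interval are all hypothetical):

```python
import torch

def maybe_refresh_norms(module: torch.nn.Module, step: int, interval: int = 4) -> None:
    # Only pay the norm-computation cost every `interval` steps.
    if step % interval != 0:
        return
    weight = module.weight
    # Cache the weight norm; detach so this never enters the graph.
    module.ggpo_weight_norm = weight.detach().norm().item()
    # Gradients may not exist yet on the very first step(s).
    if weight.grad is not None:
        module.ggpo_grad_norm = weight.grad.detach().norm().item()
```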