native_guidance_scale parameter for LCMs in StableDiffusionXLPipeline #6993
base: main
Conversation
…for LCM models

This parameter can be used with distilled LCM models to also perform regular classifier-free guidance as usual. This was not possible before, but it can be useful for the quality of the generations in some cases.

Cc: @patil-suraj
I'm not a fan of introducing this parameter into SDXL - it is only applicable to LCM. Is this something we can implement with `callback_on_step_end`?
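(For reference, this is roughly how the existing end-of-step callback hook is used with the SDXL pipeline; the checkpoint id is a common one and the callback body is a deliberately trivial placeholder.)

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

def callback_on_step_end(pipe, step_index, timestep, callback_kwargs):
    # Tensors listed in `callback_on_step_end_tensor_inputs` are available here
    # and can be modified before the next denoising step.
    latents = callback_kwargs["latents"]
    callback_kwargs["latents"] = latents  # no-op placeholder
    return callback_kwargs

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=30,
    callback_on_step_end=callback_on_step_end,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```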
@yiyixuxu Unfortunately, it is not possible to implement it using `callback_on_step_end`. I understand why you don't like adding a new parameter that works just for LCMs, but I don't see good alternative solutions, and the proposed one here is harmless. What do you think?
@ivanprado
cc @vladmandic @asomoza @DN6 here as well. Should we introduce a …
Agree it is harmless, but if we follow such a principle to add parameters for every small use case, we will quickly get overwhelmed - we introduced callback loops for this exact reason. Note that we have some parameters such as … It will be much easier for users to tweak our pipelines too, without having to submit PRs.
No issues with that on my side - having more callbacks just gives more flexibility without too much complexity for normal users, as they don't have to be used.
Maybe I'm missing something, but in this case wouldn't it be better to just remove the check? I think this can be resolved with just documentation; in my code I don't have that check, for the same reason.
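(For context, the check under discussion is roughly the pipeline's `do_classifier_free_guidance` property; a paraphrased sketch, not a verbatim quote of the repository code.)

```python
# Sketch of the guard: CFG is skipped when the guidance scale is <= 1 or when
# the UNet embeds the guidance scale directly (time_cond_proj_dim is set, as
# in distilled LCM UNets).
@property
def do_classifier_free_guidance(self):
    return self._guidance_scale > 1 and self.unet.config.time_cond_proj_dim is None
```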
It will change the expected behavior and be backward-breaking then.
In that case I don't see any alternative to adding the callback. I agree with @vladmandic that more callbacks add more flexibility, but personally I don't see a real use case for … @a-r-r-o-w did an experiment with this in #7038 (comment) though.
Something to keep in mind: even if the …
I'm working on changing the code. In particular, I propose to add the following parameters: … Early feedback is welcome.
…at pipeline_stable_diffusion_xl.py and pipeline_stable_diffusion_xl_img2img.py

callback_on_step_end_also_at_init (`bool`, *optional*, defaults to `False`): If `True`, the `callback_on_step_end` function will also be called before the start of inference. The callback will receive -1 as the step to identify this particular case, in which some tensors might not be available.

force_classifier_free_guidance (`bool`, *optional*, defaults to `False`): Forces the execution of classifier-free guidance, even if the guidance scale is below 1 or the model is an LCM model.
@yiyixuxu this is ready for a re-review. I've removed the old parameters and introduced the following ones, which are also backward compatible: … The test cases have been modified accordingly.
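(Assuming the parameters land as documented in the commit above, usage would look roughly like this; the checkpoint, LCM-LoRA repo, prompt, and callback body are illustrative, and both flags are specific to this PR rather than the released pipelines.)

```python
import torch
from diffusers import StableDiffusionXLPipeline, LCMScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

def callback_on_step_end(pipe, step_index, timestep, callback_kwargs):
    if step_index == -1:
        # Reached once before inference starts when
        # callback_on_step_end_also_at_init=True; some tensors may be missing.
        return callback_kwargs
    return callback_kwargs

image = pipe(
    "a photo of a cat",
    num_inference_steps=8,
    guidance_scale=1.5,
    callback_on_step_end=callback_on_step_end,
    callback_on_step_end_also_at_init=True,    # proposed in this PR
    force_classifier_free_guidance=True,       # proposed in this PR
).images[0]
```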
@ivanprado Great work with this! Just curious: can't the functionality of …
@a-r-r-o-w note that you could obtain almost the same effect that you get with a `callback_on_step_begin` by doing something like this:

```python
steps = 10

# `pipe` and `prompt` are assumed to be defined earlier.

def callback_on_step_begin(pipe, step_index, timestep, callback_kwargs):
    # Your implementation here
    ...

def callback_on_step_end(pipe, step_index, timestep, callback_kwargs):
    # `steps` is read from the enclosing scope.
    timesteps = pipe.scheduler.timesteps  # assumed completion of a truncated line
    step_index += 1
    if step_index != steps:
        callback_kwargs = callback_on_step_begin(pipe, step_index, timestep, callback_kwargs)
    return callback_kwargs

result = pipe(
    prompt=prompt, num_inference_steps=steps,
    callback_on_step_end=callback_on_step_end,
    callback_on_step_end_also_at_init=True,
).images[0]
```

The only problem I see is with the timestep, which will only be right for the first step; the rest will have the timestep of the previous step. But if we had access to the …
Hi @yiyixuxu, I've already implemented the suggested changes. It would be nice if you could have a look.
good candidate for #7761
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Distilled LCMs don't perform regular classifier-free guidance. Instead, they pass the `guidance_scale` as conditioning to the U-Net. This is cool because it reduces the compute required by 2x, given that the negative prediction is not needed. But in practice, we have seen that being able to also perform regular classifier-free guidance, in addition to the conditional `guidance_scale`, can be useful.

This PR introduces a new parameter, `native_guidance_scale`, that can be used with distilled LCM models to perform regular classifier-free guidance.

An example
Code to test the change in Text2Image pipeline:
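(A sketch of how the change can be exercised in the Text2Image pipeline; the checkpoint, LCM-LoRA repo, prompt, and guidance values are illustrative, and `native_guidance_scale` is the parameter proposed by this PR.)

```python
import torch
from diffusers import StableDiffusionXLPipeline, LCMScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# guidance_scale is embedded as conditioning (the usual LCM behaviour);
# native_guidance_scale additionally applies regular classifier-free guidance.
image = pipe(
    "a cinematic photo of a fox in a snowy forest",
    num_inference_steps=8,
    guidance_scale=1.0,
    native_guidance_scale=2.0,  # parameter proposed by this PR
).images[0]
image.save("lcm_native_cfg_text2image.png")
```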
Resultant images:
Code to test the change in img2img pipeline:
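(Similarly for img2img; the input image path and settings are illustrative.)

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline, LCMScheduler
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

init_image = load_image("path/to/input.png")  # placeholder input image

image = pipe(
    "turn this into a watercolor painting",
    image=init_image,
    strength=0.6,
    num_inference_steps=8,
    guidance_scale=1.0,
    native_guidance_scale=2.0,  # parameter proposed by this PR
).images[0]
```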
Resultant images:
@patrickvonplaten and @sayakpaul