Verbose graphic and layered block indices #110
Some features that could provide visual indications of training progress. I am not a programmer, so I apologize for the roughness of the approach. This is meant as an idea or suggestions that could be implemented.
@@ -64,18 +64,20 @@ def from_spec(flux: "Flux1", training_spec: TrainingSpec) -> "LoRALayers":

return LoRALayers(weights=weights)

@staticmethod
Accidentally added an extra @staticmethod here that is not needed.
@staticmethod
def _construct_layers(
    block_spec: TransformerBlocks | SingleTransformerBlocks,
    blocks: list[JointTransformerBlock] | list[SingleTransformerBlock],
    block_prefix: str,
) -> dict:
    start = block_spec.block_range.start
    end = block_spec.block_range.end
    block_indices = block_spec.block_range.get_blocks()  # Usa il metodo con ()
If the code is mostly self-explanatory, as I think it is in this case, then we don't need the comments.
If a comment is needed, we always try to write it in English for the benefit of everyone.
@@ -54,10 +54,19 @@ class StatisticsSpec:
state_path: str | None = None


# indices and Optional
This comment can also be removed
    return self.indices
if self.start is not None and self.end is not None:
    return list(range(self.start, self.end - 1))
raise ValueError("Devono essere forniti 'start' e 'end' o 'indices'.") |
The error message here should be in English
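As a rough illustration of the block-selection logic discussed in this thread, here is a minimal sketch with the error message in English. The class name `BlockRange` is an assumption (the diff only shows `block_range.get_blocks()`), and so is the choice to treat `end` as exclusive, as Python's `range` does; the thread does not say which convention is intended.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockRange:
    # Hypothetical stand-in for the spec class under review; field names follow the diff.
    start: Optional[int] = None
    end: Optional[int] = None
    indices: Optional[list[int]] = None

    def get_blocks(self) -> list[int]:
        # An explicit list of indices takes precedence over a start/end pair.
        if self.indices is not None:
            return self.indices
        if self.start is not None and self.end is not None:
            # Assumption: `end` is exclusive, matching Python's range().
            return list(range(self.start, self.end))
        raise ValueError("Either 'start' and 'end' or 'indices' must be provided.")
```

With this sketch, a spec could name either a contiguous range (`BlockRange(start=0, end=3)`) or scattered blocks (`BlockRange(indices=[1, 5])`), and callers only ever see the resulting list.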
removed staticmethod on line 67 and comment on 74
removed comment on line 57, translated error to English
Hi, I’m not sure what I need to do! :P It’s my first time doing something like this on GitHub, apart from fixing reported issues (and I haven’t figured out how to do that without messing things up) :P I’ve modified what you pointed out! Thank you! I don’t want to clutter your code too much. Coding isn’t really my thing, but with this opportunity I’m learning a bit more.
import matplotlib.pyplot as plt

from mflux.dreambooth.state.training_spec import TrainingSpec
from mflux.dreambooth.state.training_state import TrainingState


class Plotter:
    start_time = time.time()  # Class variable to track time
I would not start a timer in this way. The plotter class should only be concerned with the actual plotting of graphs, and the various methods that need to know the start_time would probably need to have it passed in as an argument.
Also, doing it in this way would start a timer when the class is loaded, and it could lead to timing that is not what we want (actual training time which is what I would guess you intend here...). Also, with the elapsed time, it is a bit more complicated since we can stop and resume training at any time (as we talked about here #108), and this is something that should be considered when plotting the training time.
For clarity: Since handling the elapsed time part properly might be a slightly bigger thing, I think it can be done in a separate PR.
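One possible shape for that follow-up, sketched under assumptions: a small timer object that accumulates time across stop/resume cycles and is handed to the plotting code, instead of a class-level `time.time()` call. `TrainingTimer`, `previous_elapsed`, and `format_elapsed` are hypothetical names introduced for illustration and are not part of mflux.

```python
import time
from typing import Callable, Optional


class TrainingTimer:
    """Hypothetical helper: accumulates training time across stop/resume cycles."""

    def __init__(self, previous_elapsed: float = 0.0, clock: Callable[[], float] = time.monotonic):
        self._accumulated = previous_elapsed  # seconds restored from a saved training state
        self._clock = clock
        self._started_at: Optional[float] = None

    def start(self) -> None:
        # Starting an already running timer is a no-op.
        if self._started_at is None:
            self._started_at = self._clock()

    def stop(self) -> None:
        # Fold the current run into the accumulated total.
        if self._started_at is not None:
            self._accumulated += self._clock() - self._started_at
            self._started_at = None

    def elapsed(self) -> float:
        running = 0.0 if self._started_at is None else self._clock() - self._started_at
        return self._accumulated + running


def format_elapsed(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS for a plot title or label."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

The training loop would own the timer, start it when training actually begins, and pass `timer.elapsed()` to the plotting method as an argument. On resume, the value saved with the checkpoint would be fed back in as `previous_elapsed`, so time spent paused is not counted.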
You’re absolutely right, and I fully admit that my approach to handling time might seem a bit primitive. I didn’t want to make too many changes as I was constantly worried about breaking something. :P I opted for a method that kept everything neatly contained within the task execution.
Great first try at contributing, and a great initiative to try your own solution to your original feature request. Some general notes:
Great job, and feel free to address any of the changes that you feel comfortable with (I can help with the git squash thing later) and I'll have another look :)
@azrahello I took the liberty of rebasing your branch on top of the latest mflux main commit. Since this requires a force push on your version of main, I don't think I have the access to do so, so I opened up this PR #121 with your changes (you are still the author of the main commit there). Once we are happy with the updates, we can merge that PR and close this one. I have made a few small tweaks on top of your changes. One thing I just noticed with the updated loss graph, however, is that the y-axis shows a lot of empty space. I can try to fix this, since I don't think there is any reason to show so much empty space on the y-axis (e.g. the axis going up to 1.5 when the top loss value is around 0.6).
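A minimal sketch of one way to tighten the y-axis, assuming the loss values are available as a plain list of non-negative floats. The helper name `loss_axis_limits` and the 10% padding are illustrative assumptions, not what the follow-up PR actually does.

```python
def loss_axis_limits(losses, padding=0.1):
    """Return (bottom, top) y-limits that hug the observed loss values."""
    if not losses:
        # Nothing to plot yet: fall back to a neutral default range.
        return (0.0, 1.0)
    lowest, highest = min(losses), max(losses)
    span = highest - lowest
    # Pad by a fraction of the data span; for a flat series, pad relative to the value.
    margin = span * padding if span > 0 else max(abs(highest) * padding, 1e-6)
    # Assumption: losses are non-negative, so the lower limit is clamped at 0.
    return (max(0.0, lowest - margin), highest + margin)
```

With matplotlib, the result could be applied to an existing axes object via `ax.set_ylim(*loss_axis_limits(losses))`, so a run whose highest loss is around 0.6 gets an axis that ends just above that value rather than at 1.5.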
@azrahello I think this looks good, and I have tried it with both the old ranges and the new
It's the first time I'm doing this operation and I'm afraid of breaking something! I've tried to be as non-intrusive as possible, especially since I didn't really have a clear understanding of what I was doing. I find the graphic particularly comfortable, but of course, it's just an idea