Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verbose graphic and layered block indices #110

Closed
wants to merge 12 commits into from

Conversation

azrahello
Copy link
Contributor

It's the first time I'm doing this operation and I'm afraid of breaking something! I've tried to be as non-intrusive as possible, especially since I didn't really have a clear understanding of what I was doing. I find the graphic particularly comfortable, but of course, it's just an idea

azrahello and others added 9 commits December 26, 2024 08:47
@@ -64,18 +64,20 @@ def from_spec(flux: "Flux1", training_spec: TrainingSpec) -> "LoRALayers":

return LoRALayers(weights=weights)

@staticmethod
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally added an extra @staticmethod here that is not needed

@staticmethod
def _construct_layers(
block_spec: TransformerBlocks | SingleTransformerBlocks,
blocks: list[JointTransformerBlock] | list[SingleTransformerBlock],
block_prefix: str,
) -> dict:
start = block_spec.block_range.start
end = block_spec.block_range.end
block_indices = block_spec.block_range.get_blocks() # Usa il metodo con ()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the code is mostly self explanatory, as I think in this case, then we don't need the comments.

If there is a need to have a comment, we always try to have them in English for the benefit of everyone.

@@ -54,10 +54,19 @@ class StatisticsSpec:
state_path: str | None = None


# indices and Optional
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can also be removed

return self.indices
if self.start is not None and self.end is not None:
return list(range(self.start, self.end - 1))
raise ValueError("Devono essere forniti 'start' e 'end' o 'indices'.")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message here should be in English

removed staticmetod  on line 67 and comment on 74
removed comment on line 57 
translated error in English
@azrahello
Copy link
Contributor Author

Hi, I’m not sure what I need to do! :P It’s my first time doing something like this on GitHub, apart from fixing reported issues (and I haven’t figured out how to do that without messing things up) :P I’ve modified what you pointed out! Thank you; I don’t want to clutter your code too much—coding isn’t really my thing, but with this opportunity, I’m learning a bit more.

import matplotlib.pyplot as plt

from mflux.dreambooth.state.training_spec import TrainingSpec
from mflux.dreambooth.state.training_state import TrainingState


class Plotter:
start_time = time.time() # Class variable to track time
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not start a timer in this way. The plotter class should only be concerned with actual plotting of graphs and the various methods that would need to know the start_time would probably need to have this passed in as an argument.

Also, doing it in this way would start a timer when the class is loaded, and it could lead to timing that is not what we want (actual training time which is what I would guess you intend here...). Also, with the elapsed time, it is a bit more complicated since we can stop and resume training at any time (as we talked about here #108), and this is something that should be considered when plotting the training time.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity: Since handling the elapsed time part properly might be slightly bigger thing, I think it can be done in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re absolutely right, and I fully admit that my approach to handling time might seem a bit primitive. I didn’t want to make too many changes as I was constantly worried about breaking something. :P I opted for a method that kept everything neatly contained within the task execution.

@filipstrand
Copy link
Owner

filipstrand commented Jan 21, 2025

Great first try contributing and a great initiative to try your own solution your original feature request.
Appreciate your transparency with this being your first time etc., it makes it easier to know how much effort to put into code-review etc :)

Some general notes:

  • As a good practice, it is nice to outline a bit in the PR what the solution is, what assumptions you have made etc. In this case, since you already have opened the issue Training on Non-Sequential Block Layers #109 some can be picked up from there, but still nice to have it in this PR. For example, a screenshot of how the updated plot looks like would be really great to have.

  • Some comments and error messages were in non-english which should be updated.

  • Maybe a bit more advanced, especially if you have not used git much before, but there is a nice ability in git to "squash commits" - this means that even though you might have made several back-and-forth changes during development, when you are happy with your efforts and want to present it for review, you can essentially present your final solution as one single change - you can think of this as only presenting your polished/final work without revealing the history. When taking this approach it would be fine to have your own commits in non-english and cryptic (I do so myself during development) because they would be hidden at the time of presenting the PR.

Great job, and feel free to address any of changes that you feel comfortable with (I can help with git squash thing later) and I'll have another look :)

@azrahello
Copy link
Contributor Author

First of all, thank you! I truly value everything that comes out of this space. Exploring MFlux has helped me deepen my understanding of my field of research (I work in photography, and my background in computer science is stuck 20 years in the past). I’ve only revisited this environment about a year ago as part of my work on a final project.

What I did by modifying the code was to create visual reference points for a more immediate evaluative analysis compared to opening a file. I tried to include details that would help me identify key differences between one training session and another, and the graph makes this much easier. Additionally, when quickly skimming through the plot files, I liked the idea of a kind of time-lapse effect, so I removed the empty space on the right to make the visualization more central, ending it right at the edge of the graph (aesthetic-wise, I find it works better with the legend).

I wanted to implement an easing-like approach, but my mathematical skills aren’t strong enough to find a proper method, so I improvised and found a solution to adjust the flow of the code. I added an extra 0.01% margin to prevent the graph from hitting the exact right edge. This tweak also gives me a better sense of training progress without relying on other metrics. Lastly, I included the tracking of maximum and minimum loss peaks, which I find particularly helpful
Screenshot 2025-01-21 alle 9 26 51 PM

@filipstrand
Copy link
Owner

filipstrand commented Jan 27, 2025

@azrahello I took the liberty of rebasing your branch ontop of the latest mflux main commit. Since this requires a force push on your version of main, I don't think I have the access to do so, so I opened up this PR #121 with your changes (you are still author of the main commit there). Once we are happy with the updates, we can merge that PR and close this one.

I have made a few small tweaks on top of your changes. One thing I just noticed with the updated loss graph, however, is that the y-axis shows a lot of empty space...I can try to fix this, since I don't think there is any reason to show a lot of empty space on the y-axis (e.g being 1.5 if the top loss value is around 0.6)

Main
Screenshot 2025-01-27 at 23 35 12

This PR
Screenshot 2025-01-27 at 23 35 20

@filipstrand
Copy link
Owner

Now I changed so that we don't show that much empty space on the y-axis

Screenshot 2025-01-27 at 23 49 39

@filipstrand
Copy link
Owner

@azrahello I think this looks good, and I have tried it with both the old rages and the new indices input. I'll go ahead and merge #121 to main. Thanks for the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants