Saving the original model file metadata might be needed #41
The hash changes if you replace the VAE, fix CLIP positions, remove an embedded ControlNet, remove random PyTorch Lightning keys, remove EMA, convert to safetensors, etc. In all these cases the model remains the same to the user; it's the same model with less junk, yet the hash is completely different. So model hashes are really a ridiculous method of identifying models; the model name should be enough (posters just need to take 2 seconds to name their models). On a side note, metadata can be embedded into safetensors files. This is a feature of the safetensors format, though it's not used at all in the SD community. Instead we opt to do stupid things like make .yaml config files with the same name as the model, which users have to remember to download or the model is completely broken, etc.
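For anyone who wants to check whether a file already carries embedded metadata, here is a minimal sketch using only the Python standard library. It relies on the published safetensors layout (an 8-byte little-endian header length, then a JSON header with an optional `__metadata__` string map); the function name `read_safetensors_metadata` is made up for illustration and is not part of any toolkit.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the free-form string metadata stored in a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # Tensor entries sit next to an optional "__metadata__" string-to-string map.
    return header.get("__metadata__", {})
```

Only the header is read, so this stays fast even on multi-gigabyte checkpoints. A file without any embedded metadata simply yields an empty dict.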
The community will always do whatever is easiest and works unless they are forced to do things right. Could Model Toolkit then offer to embed metadata saying "Pruned by Model Toolkit version cf82458; original size: x bytes; original hash: xyz", and create the new file with a "_pruned" infix? File names are apparently not a good enough method of identifying Loras, as two authors can name their files the same way and put them in different repositories, so the filename conflict isn't detected automatically. Or they reuse the name when creating a new version of the same Lora. And Civitai doesn't allow filename search in their API (https://github.com/orgs/civitai/discussions/183), so I'm having a hard time finding the Loras I use on Civitai after pruning. Civitai isn't the only model repository either; the same models happen to coexist on other sites under different filenames. Since neither hash nor filename is perfect, the tools for creating models should just put a UUID inside the file when generating it. I've suggested it (bmaltais/kohya_ss#1601) and hope Model Toolkit will preserve it while pruning.
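The embedding idea could look roughly like the sketch below. This is not how Model Toolkit works; it only shows the metadata step (no actual pruning happens, the tensor bytes are copied unchanged), and the key names `pruned_by`, `original_size_bytes` and `original_sha256` as well as the choice of SHA-256 are assumptions made for illustration. Existing metadata, such as a UUID written by a trainer, is kept.

```python
import hashlib
import json
import os
import struct


def embed_provenance(src_path, tool_version):
    """Copy a .safetensors file, adding provenance notes to its header metadata."""
    with open(src_path, "rb") as f:
        raw = f.read()  # fine for a sketch; stream in chunks for multi-GB files

    # Layout: 8-byte little-endian header length, JSON header, then tensor bytes.
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8:8 + header_len].decode("utf-8"))
    tensor_bytes = raw[8 + header_len:]

    # Metadata values must be strings. These key names are hypothetical.
    metadata = dict(header.get("__metadata__", {}))  # keep what is already there
    metadata["pruned_by"] = f"Model Toolkit {tool_version}"
    metadata["original_size_bytes"] = str(len(raw))
    metadata["original_sha256"] = hashlib.sha256(raw).hexdigest()
    header["__metadata__"] = metadata

    new_header = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # Pad with spaces so the tensor data starts on an 8-byte boundary.
    new_header += b" " * (-len(new_header) % 8)

    stem, ext = os.path.splitext(src_path)
    dst_path = f"{stem}_pruned{ext}"  # the "_pruned" infix suggested above
    with open(dst_path, "wb") as f:
        f.write(struct.pack("<Q", len(new_header)) + new_header + tensor_bytes)
    return dst_path
```

Tensor offsets in the header are relative to the start of the data section, so growing the header does not invalidate them.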
A UUID is probably the best technical solution, but I wouldn't count on training software. If Civitai started to embed a UUID derived from the model page ID, it would very quickly solve this issue. Naming could also naturally fix itself as the struggle becomes more common (like how pruning became more common): posters can just include their name (and version, if applicable). I'll patch the toolkit to preserve safetensors metadata.
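To illustrate what "a UUID derived from the model page ID" could mean, here is a small sketch using name-based UUIDs (version 5) from the Python standard library. The namespace string and the function name `model_uuid` are invented for this example; nothing here reflects what Civitai actually does.

```python
import uuid

# Hypothetical namespace; any fixed, agreed-upon UUID would do.
MODEL_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://civitai.com/models/")


def model_uuid(model_page_id):
    """Derive a stable UUID from a model page ID (int or str)."""
    # uuid5 is deterministic: the same namespace and name always give the same UUID.
    return uuid.uuid5(MODEL_NAMESPACE, str(model_page_id))
```

Because the result is deterministic, anyone who knows the page ID can recompute the UUID, and a file carrying it can be traced back without relying on its hash or filename.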
I like the idea of pruning models but I've got into some trouble because of running Autoprune:
For these reasons I suppose Autoprune should at least store original size, hash etc. into a separate directory (or even have an option of saving the original files, if not the "delta" information that could be used to un-prune models) so that nobody runs into such issues again. Perhaps the dev teams could also collaborate on integrating the plugins better, e.g. if Autoprune somehow "marked" the pruned model file, then the Civitai Helper would know where to look for the original file hash and succeed with its search.
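A rough sketch of the "separate directory" idea, under my own assumptions about the layout: one small JSON record per pruned file, named after the pruned file's SHA-256, so another tool can hash the file it has and look up the original hash. The function names and record fields are hypothetical and are not part of Autoprune or Civitai Helper.

```python
import hashlib
import json
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_original(original_path, pruned_path, sidecar_dir):
    """Store the original file's name, size and hash, keyed by the pruned file's hash."""
    os.makedirs(sidecar_dir, exist_ok=True)
    record = {
        "original_name": os.path.basename(original_path),
        "original_size_bytes": os.path.getsize(original_path),
        "original_sha256": sha256_of(original_path),
    }
    record_path = os.path.join(sidecar_dir, sha256_of(pruned_path) + ".json")
    with open(record_path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    return record_path


def lookup_original(pruned_path, sidecar_dir):
    """Return the stored record for a pruned file, or None if there is none."""
    record_path = os.path.join(sidecar_dir, sha256_of(pruned_path) + ".json")
    if not os.path.exists(record_path):
        return None
    with open(record_path, encoding="utf-8") as f:
        return json.load(f)
```

Keying the record on the pruned file's hash means the lookup still works after the pruned file is renamed or moved.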