Saving the original model file metadata might be needed #41

paboum · 2023-10-11T19:41:53Z

I like the idea of pruning models, but I've run into some trouble after running Autoprune:

1. Since Civitai doesn't autoprune all their models, I can no longer easily use https://github.com/zixaphir/Stable-Diffusion-Webui-Civitai-Helper.git to update Loras.
2. Since Civitai only uses a hash of the whole file in their search API (see "[Bug]: LoRA hashes are not using correct hash function", civitai/civitai#742), I also can't simply retrieve the original model file to fix issue 1.
3. Since nobody enforces a standard in which the model would include a manifest with the original filename, original weights hash, or even the author's homepage, I can't rely on any automated method of recovering the original files.

For these reasons I suppose Autoprune should at least store the original size, hash, etc. in a separate directory (or even offer an option to save the original files, or at least the "delta" information that could be used to un-prune models) so that nobody runs into such issues again. Perhaps the dev teams could also collaborate on integrating the plugins better: e.g. if Autoprune somehow "marked" the pruned model file, Civitai Helper would know where to look for the original file hash and its search would succeed.
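
A minimal sketch of the "store original size, hash etc. into a separate directory" idea, not Autoprune's actual behaviour. The function names, the JSON keys, and the `.original.json` suffix are all hypothetical, and SHA-256 is assumed as the whole-file hash.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the whole file in chunks so large models need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_original_record(model_path: Path, record_dir: Path) -> Path:
    """Write a JSON sidecar describing the unpruned file; return the sidecar path."""
    # Key names are illustrative assumptions, not an existing convention.
    record = {
        "original_filename": model_path.name,
        "original_size_bytes": model_path.stat().st_size,
        "original_sha256": file_sha256(model_path),
    }
    record_dir.mkdir(parents=True, exist_ok=True)
    record_path = record_dir / (model_path.name + ".original.json")
    record_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record_path
```

Calling `save_original_record` before pruning would leave a record that another tool could read to recover the hash of the file as it was originally downloaded.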

arenasys · 2023-10-12T06:28:38Z

The hash changes if you replace the VAE, fix clip positions, remove an embedded controlnet, remove random pytorch lightning keys, remove EMA, convert to safetensors, etc. In all these cases the model remains the same to the user; it's the same model just with less junk, yet the hash is completely different. So model hashes are really a ridiculous method of identifying models; the model name should be enough (posters just need to take 2 seconds to name their models).

On a side note, metadata can be embedded into safetensors files; this is a feature of the safetensors format, though it's not used at all in the SD community. Instead we opt to do stupid things like make .yaml config files with the same name as the model, which users have to remember to download or the model is completely broken, etc.
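
For reference, a safetensors file begins with an 8-byte little-endian length, followed by a JSON header of that length; free-form string metadata lives under the optional `__metadata__` key of that header. A stdlib-only sketch of reading it (the function name is hypothetical):

```python
import json
import struct


def read_safetensors_metadata(path: str) -> dict:
    """Return the optional "__metadata__" mapping from a safetensors header."""
    with open(path, "rb") as handle:
        # The file starts with an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", handle.read(8))
        # The header itself is UTF-8 JSON describing tensors plus metadata.
        header = json.loads(handle.read(header_len).decode("utf-8"))
    # Files written without metadata simply lack the key.
    return header.get("__metadata__", {})
```

Only the header is read, so this stays cheap even for multi-gigabyte models.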

paboum · 2023-10-12T12:55:43Z

The community will always do whatever is easiest and works unless they are forced to do things right. Could Model Toolkit then offer to embed metadata saying "Pruned by Model Toolkit version cf82458; original size: x bytes; original hash: xyz"? And create the new file with a "_pruned" infix?
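
Roughly what that proposal could look like, as a sketch rather than anything Model Toolkit does today. Both helper names and all metadata keys are made up for illustration; the values are kept as strings because safetensors metadata is a string-to-string mapping.

```python
from pathlib import Path


def pruned_output_path(original: Path) -> Path:
    """Insert "_pruned" before the extension, e.g. lora.safetensors -> lora_pruned.safetensors."""
    return original.with_name(f"{original.stem}_pruned{original.suffix}")


def pruning_metadata(tool_version: str, original_size: int, original_hash: str) -> dict:
    """Build string-to-string metadata recording where the pruned file came from."""
    # Hypothetical key names; no existing tool is known to read them.
    return {
        "pruned_by": f"Model Toolkit {tool_version}",
        "original_size_bytes": str(original_size),
        "original_hash": original_hash,
    }
```

A helper tool could then look up `original_hash` instead of hashing the pruned file.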

File names are apparently not a good enough method of identifying Loras, as two authors can give their files the same name and put them in different repositories, so the filename conflict isn't detected automatically, right? Or they reuse the name when creating a new version of the same Lora. And Civitai doesn't allow filename search in their API (https://github.com/orgs/civitai/discussions/183), so I'm having a hard time finding the Loras I use on Civitai after pruning.

Civitai isn't the only model repository; the same models happen to coexist on other sites under different filenames. Since neither hash nor filename is perfect, the tools for creating models should just put a UUID inside at generation time. I've suggested it (bmaltais/kohya_ss#1601) and hope Model Toolkit will preserve it while pruning.

arenasys · 2023-10-12T15:51:26Z

A UUID is probably the best technical solution, but I wouldn't count on training software. If Civitai started to embed a UUID derived from the model page ID, it would very quickly solve this issue. Naming could also naturally fix itself as the struggle becomes more common (like how pruning became more common): posters can just include their name (and version if applicable). I'll patch the toolkit to preserve safetensors metadata.
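
One way a UUID "derived from the model page ID" could work is a name-based (version 5) UUID, which always yields the same value for the same input. This is only a sketch: the namespace string, the `id:version` name format, and the function name are assumptions, not anything Civitai provides.

```python
import uuid

# Hypothetical namespace for illustration; a real scheme would fix its own value.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://civitai.com/models/")


def model_uuid(model_page_id: int, version: str = "") -> uuid.UUID:
    """Derive a stable UUID from a model page ID and an optional version label."""
    return uuid.uuid5(EXAMPLE_NAMESPACE, f"{model_page_id}:{version}")
```

Because the value depends only on the page ID and version, it would survive pruning, format conversion, and renaming as long as the metadata carrying it is preserved.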

paboum mentioned this issue Oct 11, 2023

updating is broken zixaphir/Stable-Diffusion-Webui-Civitai-Helper#24

Closed
