Who can help?
@ArthurZucker @SunMarc

Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction
Hi, when loading a model on the meta device, many "copying from a non-meta parameter in the checkpoint to a meta parameter in the current model" warnings are printed.
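A minimal sketch of the reproduction, assuming an OPT checkpoint (the model.decoder.* parameter names in the warnings suggest one):

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical snippet: the checkpoint name is an assumption; loading any
# checkpoint while the meta device is the default triggers the same warnings.
with torch.device("meta"):
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
```

This gives a lot of warnings: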
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.embed_tokens.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.embed_positions.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.final_layer_norm.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.final_layer_norm.bias: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.layers.0.self_attn.k_proj.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.layers.0.self_attn.k_proj.bias: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
/home/fxmarty/miniconda3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:2397: UserWarning: for model.decoder.layers.0.self_attn.v_proj.weight: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass `assign=True` to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?)
warnings.warn(
...
I don't recall this being the case previously; maybe something changed in from_pretrained or in torch that results in these warnings?
Expected behavior
No warnings; parameters remain on the meta device.
cc @SunMarc @muellerzr since this crosses over with accelerate a bit. Also @fxmarty-amd, the warnings don't surprise me: from_pretrained loads weight data into the model, and meta tensors don't actually hold weight data! You could avoid the warnings by initializing the same model architecture without weight loading, like this:
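A sketch of that suggestion (the checkpoint name is carried over from the reproduction above and is an assumption):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Build the architecture from its config alone: no checkpoint is opened,
# so nothing is copied into the meta parameters and no warning is emitted.
config = AutoConfig.from_pretrained("facebook/opt-125m")  # hypothetical checkpoint
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

print(next(model.parameters()).device)  # meta
```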
I don't recall this being the case previously; maybe something changed in from_pretrained or in torch that results in these warnings?
This is due to torch, if I recall correctly. I can check what can be done on our side, but since you don't need the weights, I think @Rocketknight1's solution will be more appropriate.
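For context, the warning comes from torch's Module.load_state_dict: copying real tensor data into a meta parameter is a no-op, which is why torch suggests assign=True. A standalone illustration of that torch behavior (this is not the transformers code path):

```python
import torch
import torch.nn as nn

# A module built on the meta device has parameters without storage.
with torch.device("meta"):
    meta_linear = nn.Linear(4, 4)

# A real (non-meta) state dict, standing in for a checkpoint.
state_dict = nn.Linear(4, 4).state_dict()

# Default copying load: a no-op for meta parameters, emits the warning above.
meta_linear.load_state_dict(state_dict)

# assign=True (torch >= 2.1) replaces the meta parameters with the checkpoint
# tensors instead of copying into them, so the module ends up with real weights.
meta_linear.load_state_dict(state_dict, assign=True)
print(meta_linear.weight.device)  # cpu
```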
Thank you, this makes sense. I'll try to use from_config whenever possible (although external libraries relying on from_pretrained may not expose from_config).