Add SmolLM (smollm2) #9354
base: viable/strict
Conversation
@@ -0,0 +1,14 @@
{
    "dim": 576,
    "ffn_dim_multiplier": 1,
There's a size mismatch error during quantization:
size mismatch for layers.0.feed_forward.w1.weight: copying a param with shape torch.Size([1536, 576]) from checkpoint, the shape in current model is torch.Size([576, 576]).
size mismatch for layers.0.feed_forward.w2.weight: copying a param with shape torch.Size([576, 1536]) from checkpoint, the shape in current model is torch.Size([576, 576]).
size mismatch for layers.0.feed_forward.w3.weight: copying a param with shape torch.Size([1536, 576]) from checkpoint, the shape in current model is torch.Size([576, 576]).
I'm not sure about the definition of dim and ffn_dim_multiplier here; it looks like one of the values is wrong. Would you mind providing some pointers/context on this? Appreciate it! @jackzhxng
The model structure is below (a sketch of the config values these shapes imply follows the printout):
LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(49152, 576)
(layers): ModuleList(
(0-29): 30 x LlamaDecoderLayer(
(self_attn): LlamaSdpaAttention(
(q_proj): Linear(in_features=576, out_features=576, bias=False)
(k_proj): Linear(in_features=576, out_features=192, bias=False)
(v_proj): Linear(in_features=576, out_features=192, bias=False)
(o_proj): Linear(in_features=576, out_features=576, bias=False)
(rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
(gate_proj): Linear(in_features=576, out_features=1536, bias=False)
(up_proj): Linear(in_features=576, out_features=1536, bias=False)
(down_proj): Linear(in_features=1536, out_features=576, bias=False)
(act_fn): SiLU()
)
(input_layernorm): LlamaRMSNorm((576,), eps=1e-05)
(post_attention_layernorm): LlamaRMSNorm((576,), eps=1e-05)
)
)
(norm): LlamaRMSNorm((576,), eps=1e-05)
(rotary_emb): LlamaRotaryEmbedding()
)
(lm_head): Linear(in_features=576, out_features=49152, bias=False)
)
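For reference, here is a sketch of the params these shapes imply, written as a Python dict so the derivation can be annotated. The head_dim of 64 is an assumption taken from the SmolLM2-135M model card; everything else is read directly off the printout above.

# Hedged sketch: config values implied by the printed module shapes (not the final config).
smollm2_135m_params = {
    "dim": 576,           # embed_tokens / q_proj width
    "hidden_dim": 1536,   # gate_proj/up_proj out_features, i.e. the FFN intermediate size
    "n_layers": 30,       # ModuleList (0-29)
    "n_heads": 9,         # q_proj out_features 576 / head_dim 64 (head_dim assumed)
    "n_kv_heads": 3,      # k_proj/v_proj out_features 192 / head_dim 64
    "vocab_size": 49152,  # embed_tokens rows / lm_head out_features
    "norm_eps": 1e-05,    # LlamaRMSNorm eps
    "rope_theta": 10000.0,
    "use_scaled_rope": False,
}

With hidden_dim set to the FFN intermediate size (1536) rather than to dim (576), the w1/w2/w3 shapes line up with the checkpoint and the size mismatch above should go away.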
checkpoint_dir=args.input_dir,
checkpoint_files=["model.safetensors"],
output_dir=".",
model_type="MISTRAL",
Change to Llama
Thank you, updated!
converted_state_dict[new_key] = value

# Input and output embeddings are tied.
converted_state_dict["output.weight"] = converted_state_dict[
This might be the cause: input and output embeddings are not shared in the Llama architecture this model is based on.
Makes sense, removed this.
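A minimal sketch of what the conversion loop looks like after this change, assuming a torchtune-style key-mapping helper like the other examples.models.* convert_weights scripts use; the _FROM_HF entries shown are an illustrative subset, not the full table from this PR.

from torchtune.models.convert_weights import get_mapped_key  # assumed available, as in the other convert scripts

# Illustrative subset of the HF -> Meta key mapping (the real script has the full table).
_FROM_HF = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.layers.{}.self_attn.q_proj.weight": "layers.{}.attention.wq.weight",
    "model.norm.weight": "norm.weight",
    "lm_head.weight": "output.weight",
}

def smollm2_hf_to_meta(state_dict):
    converted_state_dict = {}
    for key, value in state_dict.items():
        converted_state_dict[get_mapped_key(key, _FROM_HF)] = value
    # Per the review above: no "output.weight" <- "tok_embeddings.weight" tying;
    # the output projection keeps its own weight for this Llama-style model.
    return converted_state_dict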
@@ -94,6 +94,7 @@
    "static_llama",
    "qwen2_5",
    "phi-4-mini",
    "smollm",
Rename this and the directory to smolllm2
Thanks! Should it be smollm2 or smolllm2?
Ah - it should be smollm2*
Updated from 52e68fc to 34c5dee.
{
    "dim": 576,
    "ffn_dim_multiplier": 1,
    "hidden_dim": 576,
Thank you! I mixed it up with hidden_size 😄
"rope_theta": 10000.0, | ||
"use_scaled_rope": false, | ||
"vocab_size": 49152, | ||
"use_hf_rope": true, |
this should be false
Thank you!! Updated
Summary
Add the SmolLM2 135M model (smollm2) for issue #9324. The exported model's output agrees with the eager model's output:

Test plan
1. Convert to meta format (a hedged command sketch follows the run command below)
2. Run export (likewise sketched below)
3. Run test:
python -m examples.models.llama.runner.native --model smollm2 \
  --pte smollm2.pte \
  --tokenizer /Users/danqingwang/tmp/snapshots/1d461723eec654e65efdc40cf49301c89c0c92f4/tokenizer.json \
  --tokenizer_config /Users/danqingwang/tmp/snapshots/1d461723eec654e65efdc40cf49301c89c0c92f4/tokenizer_config.json \
  --prompt "What ingredients are in a California roll?" \
  --params examples/models/smollm2/135M_config.json --max_len 64 \
  --temperature 0 -kv
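For completeness, hedged sketches of the commands for steps 1 and 2. The examples.models.smollm2.convert_weights module path and the exact export flags are assumptions modeled on how the neighboring models (e.g. qwen2_5) are wired up, not confirmed from this PR:

# Step 1 (assumed interface): convert the HF safetensors checkpoint to Meta format.
python -m examples.models.smollm2.convert_weights /path/to/hf/snapshot smollm2.pth

# Step 2 (flags assumed from the generic export_llama flow): export to a .pte file.
python -m examples.models.llama.export_llama --model smollm2 \
  -c smollm2.pth \
  -p examples/models/smollm2/135M_config.json \
  -kv -X -d fp32 \
  --output_name smollm2.pte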