Move tokenizer information into pte to reduce ExecuTorch runner args #1484
Labels
actionable
Items in the backlog waiting for an appropriate impl/fix
enhancement
New feature or request
ExecuTorch
Issues related to ExecuTorch installation, export, or build. Mobile uses separate tags
good first issue
Good for newcomers
triaged
This issue has been looked at by a team member, triaged, and prioritized into an appropriate module
🚀 The feature, motivation and pitch
After an ExecuTorch model is exported to a pte file, tokenizer information must be passed to the runner as an argument (-l <#>). This can be avoided by writing the information into the pte file itself, since the tokenizer is known at export time (sentencepiece => 2, tiktoken => 3). Tokenizer information can be stored during export as a constant_method. For example: https://github.com/pytorch/torchchat?tab=readme-ov-file#deploy-and-run-on-android
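As a rough sketch of the idea (the helper name and the exact constant_methods usage below are assumptions for illustration, not a confirmed design from this issue), the export-time mapping could look like:

```python
# Tokenizer name -> integer ID, per the issue: sentencepiece => 2, tiktoken => 3.
TOKENIZER_IDS = {"sentencepiece": 2, "tiktoken": 3}

def tokenizer_type_id(name: str) -> int:
    """Return the numeric tokenizer ID to embed in the .pte file."""
    try:
        return TOKENIZER_IDS[name]
    except KeyError:
        raise ValueError(f"unknown tokenizer: {name!r}")

# At export time, the ID could be baked into the program as a constant method
# (hypothetical usage of executorch.exir.to_edge's constant_methods parameter):
#
#   edge = to_edge(
#       exported_program,
#       constant_methods={"tokenizer_type": tokenizer_type_id(tokenizer_name)},
#   )
#
# The runner would then read "tokenizer_type" back from the pte instead of
# requiring the -l <#> argument.
```

This keeps the runner interface simpler: the source of truth for the tokenizer lives in the artifact itself rather than in a flag the user must remember.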
Task:
For a similar optimization made for AOTI, see #1159.
See #1439 for more context and conversation.
Alternatives
Continue to pass tokenizer arguments to the runner
Additional context
No response
RFC (Optional)
No response