Hi, thanks for trying to migrate it to DeepSpeed. However, maintaining a DeepSpeed version of the code is not on our roadmap; we do plan to support Megatron-LM.
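Thanks for open-sourcing such an excellent model as Moonlight. Building on the implementation in examples/toy_train.py, swift has integrated the Muon optimizer: modelscope/ms-swift#3234
For the shell script, see: https://github.com/modelscope/ms-swift/blob/main/examples/train/optimizer/muon.sh
However, training with DeepSpeed ZeRO-2/ZeRO-3 fails with the error KeyError: 'use_muon'. Will ZeRO-2/ZeRO-3 training be supported later on?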
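One possible explanation, offered only as a guess and not confirmed by the maintainers: if the optimizer records a `use_muon` flag in its per-parameter state at construction time, and a ZeRO-style wrapper later replaces the original parameters with new flattened objects, the lookup on the new objects would find an empty state entry. The sketch below reproduces that failure mode in plain Python. `Param`, `muon_params`, and the `flat_partition` object are hypothetical stand-ins, not code from Moonlight, ms-swift, or DeepSpeed.

```python
from collections import defaultdict


class Param:
    """Hypothetical stand-in for a tensor parameter, hashed by identity."""

    def __init__(self, name):
        self.name = name


# Per-parameter state kept in a defaultdict(dict), the same container type
# that torch.optim.Optimizer uses for its `state` attribute.
state = defaultdict(dict)

weight = Param("weight")
state[weight]["use_muon"] = True  # flag written once, when the optimizer is built


def muon_params(group):
    """Select the parameters flagged for Muon by reading the per-parameter state."""
    return [p for p in group if state[p]["use_muon"]]


# Assumption: a sharding wrapper swaps the original parameter for a new object.
flat = Param("flat_partition")

try:
    muon_params([flat])
    error = None
except KeyError as exc:
    # state[flat] is a fresh empty dict, so the flag is missing.
    error = exc.args[0]

print(error)
```

If this is indeed the cause, keeping the flag in the parameter group (for example as a `use_muon` entry in each group dict) instead of in the per-parameter state might survive the parameter swap, though that has not been verified against DeepSpeed here.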
Reproduction script: