Support Phi-3.5 MoE #2457

Open
maziyarpanahi opened this issue Aug 25, 2024 · 1 comment
Labels
new model: Request for integration of new model

Comments

@maziyarpanahi
Contributor

Feature request

Add support for microsoft/Phi-3.5-MoE-instruct, which uses the PhiMoEForCausalLM architecture.

Motivation

Launching the model with sharding currently fails with the following error:

2024-08-25 21:25:51.891 | INFO     | text_generation_server.utils.import_utils:<module>:75 - Detected system cuda
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Traceback (most recent call last):

  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 118, in serve
    server.serve(

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 297, in serve
    asyncio.run(

  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)

  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 231, in serve_inner
    model = get_model(

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 1064, in get_model
    raise NotImplementedError("sharded is not supported for AutoModel")

NotImplementedError: sharded is not supported for AutoModel
 rank=3
2024-08-25T21:25:56.550031Z ERROR text_generation_launcher: Shard 3 failed to start
2024-08-25T21:25:56.550058Z  INFO text_generation_launcher: Shutting down shards
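
For readers unfamiliar with this failure mode, the traceback is consistent with a dispatch pattern like the one sketched below. This is a hypothetical illustration, not the actual text_generation_server code: the `SUPPORTED_ARCHITECTURES` set and the `pick_loader` function are invented names, and the idea that an unrecognized architecture falls back to a generic AutoModel path is an assumption. Only the error message is taken from the log.

```python
# Hypothetical sketch of architecture dispatch with a generic fallback.
# SUPPORTED_ARCHITECTURES and pick_loader are illustrative names, not TGI APIs.
SUPPORTED_ARCHITECTURES = {"LlamaForCausalLM", "MistralForCausalLM"}


def pick_loader(architecture: str, sharded: bool) -> str:
    """Return a label for the loader that would handle this architecture."""
    if architecture in SUPPORTED_ARCHITECTURES:
        # A dedicated implementation exists, so sharding can be handled.
        return "custom"
    if sharded:
        # Assumed behavior: the generic fallback cannot split a model across GPUs.
        # The message matches the one in the traceback.
        raise NotImplementedError("sharded is not supported for AutoModel")
    return "auto"


try:
    pick_loader("PhiMoEForCausalLM", sharded=True)
except NotImplementedError as err:
    print(f"NotImplementedError: {err}")
```

Under that assumption, the same architecture would load through the generic path on a single shard, and sharded serving would need a dedicated PhiMoEForCausalLM implementation.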

Your contribution

I can test any PR.

@ErikKaum
Member

Thanks for reporting this @maziyarpanahi 👍

We don't have a lot of extra bandwidth at the moment, but we might prioritize adding this model.

Also, as a note: thumbs-ups or similar reactions on your issue show demand for a model and are a signal for us to prioritize it :)

@drbh added the "new model" (Request for integration of new model) label on Aug 29, 2024