[Model] Ultravox Model: Support v0.5 Release #12912

farzadab · 2025-02-07T17:43:35Z

Hi there!

We (cc @petersalas) are preparing for the v0.5 release of the Ultravox audio/text-to-text model. This release makes a minor adjustment to the architecture: a layer norm moves to just before the last layer of the projector.
Previous versions remain supported through the config parameter projector_ln_mid.
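
To make the layout change concrete, here is a rough sketch of how the flag could select the order of projector sublayers. Only projector_ln_mid comes from this PR; ProjectorConfig, projector_layer_order, the sublayer names, the False default, and the mapping of True to the v0.5 layout are all assumptions for illustration, not the code in ultravox.py.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProjectorConfig:
    # Hypothetical config container. Only `projector_ln_mid` is named in the PR;
    # the default of False (for older checkpoints) is an assumption.
    projector_ln_mid: bool = False


def projector_layer_order(config: ProjectorConfig) -> List[str]:
    """Return the assumed order of projector sublayers for a given config."""
    if config.projector_ln_mid:
        # Assumed v0.5 layout: the layer norm sits before the last linear layer.
        return ["linear_1", "activation", "layer_norm", "linear_2"]
    # Assumed layout for earlier releases: the layer norm follows the last linear layer.
    return ["linear_1", "activation", "linear_2", "layer_norm"]


if __name__ == "__main__":
    for flag in (False, True):
        order = projector_layer_order(ProjectorConfig(projector_ln_mid=flag))
        print(f"projector_ln_mid={flag}: {' -> '.join(order)}")
```

Keeping the choice behind a single boolean means older checkpoints that do not set the field would keep their original layout, assuming the default shown here.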

I'm gonna mark this as draft for now until I can verify everything. I'm also gonna add a test for v0.5 going forward once it becomes public.

github-actions · 2025-02-07T17:43:49Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

ywang96 · 2025-02-07T19:11:10Z

@farzadab @petersalas Thanks for the contribution! Feel free to ping me here or on Slack whenever this is ready for review!

petersalas · 2025-02-07T19:21:17Z

> I'm also gonna add a test for v0.5 going forward once it becomes public.

I suspect we should probably just update the test to use the newest model (after smoke testing that the new code still works on 0.3 and 0.4.1)? And maybe switch to using the 1B model as long as it can demonstrate some understanding of the sample audios.

farzadab marked this pull request as draft February 7, 2025 17:44

update ultravox model to support v0.5 release

35c3e17

Signed-off-by: Farzad Abdolhosseini <[email protected]>

farzadab force-pushed the farzad-ultravox-v05 branch from ec6e6c0 to 35c3e17 Compare February 7, 2025 17:50

petersalas reviewed Feb 7, 2025

vllm/model_executor/models/ultravox.py (outdated, resolved)

revert to using MulAndSilu instead of FlippedSiluAndMul

323abb0

Signed-off-by: Farzad Abdolhosseini <[email protected]>

ywang96 self-assigned this Feb 7, 2025
