[Frontend] Add /v1/audio/transcriptions OpenAI API endpoint #12909

Open
wants to merge 10 commits into
base: main


NickLucche
Contributor

@NickLucche NickLucche commented Feb 7, 2025

Follow-up to the great work by @robertgshaw2-redhat and @varun-sundar-rabindranath in #12458.

This PR adds the /v1/audio/transcriptions OpenAI API endpoint.

Basic example (start server with vllm serve openai/whisper-large-v3):

from openai import OpenAI
from vllm.assets.audio import AudioAsset

mary_had_lamb = AudioAsset('mary_had_lamb').get_asset_path()

openai_api_base = "http://localhost:8000/v1"
client = OpenAI(
    api_key="EMPTY",
    base_url=openai_api_base,
)
with open(str(mary_had_lamb), "rb") as f:
    transcription = client.audio.transcriptions.create(
        file=f,
        model="openai/whisper-large-v3",
        language="en",
        response_format="text",
        temperature=0.0)
    print("transcription:", transcription)

I am also adding a correctness test that computes the word error rate (WER) on a subset of a dataset from https://huggingface.co/datasets/open-asr-leaderboard/datasets and compares it against the transformers baseline.
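For readers unfamiliar with the metric, here is a minimal sketch of how WER can be computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. This is an illustration only, not the test code from this PR; the function name `word_error_rate` and the lowercase/whitespace normalization are assumptions, and the actual test may rely on an existing metrics library instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; normalization here is only
    lowercasing and whitespace splitting (an assumption).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A correctness check in this style would compute the WER of the server's transcriptions and of the baseline's transcriptions on the same samples, then assert that the two stay close.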

Note that this is all currently somewhat tailored to Whisper, since it is the only model we support. This applies in particular to the way special tokens (<|startoftranscript|>, <|transcribe|>, ...) are sent as input, which would require a more general design. The same goes for validating the supported languages and the audio duration limit.
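To make the Whisper-specific part concrete, here is a rough sketch of how a decoder prompt can be assembled from the request's language and task. The helper `build_whisper_prompt` and the small `SUPPORTED_LANGUAGES` set are hypothetical and not the code in this PR; only the special-token strings themselves are Whisper's.

```python
# Hypothetical subset for illustration; Whisper supports many more languages.
SUPPORTED_LANGUAGES = {"en", "it", "de", "fr"}
SUPPORTED_TASKS = {"transcribe", "translate"}


def build_whisper_prompt(language: str,
                         task: str = "transcribe",
                         timestamps: bool = False) -> str:
    """Concatenate Whisper's special tokens into a decoder prompt string."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"Unsupported task: {task!r}")

    parts = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Ask the model for plain text without timestamp tokens.
        parts.append("<|notimestamps|>")
    return "".join(parts)
```

Because the token layout is specific to Whisper, a more general design would probably move this step behind a per-model hook rather than keeping it in the endpoint itself.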

Let me know if I overlooked something in this PR.

TODOs (next PRs):

  • verbose_json response format
  • Group API validation (the audio length limit above all) by model so we can extend to other models.
  • OpenAI has no "streaming" support for this endpoint; perhaps we could add it as a custom extension (it shouldn't be too different from what we already have).
  • Translation API (more of the same, with a different task token)
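As one possible shape for the per-model validation item above, the limits could live in a small registry keyed by model name. Everything in this sketch is hypothetical: `TranscriptionLimits`, `LIMITS_BY_MODEL`, `validate_request`, the 30-second value, and the language subset are assumptions for illustration, not what this PR implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionLimits:
    """Per-model request limits (hypothetical structure)."""
    max_audio_seconds: float
    languages: frozenset[str]


# Assumed values for illustration only.
LIMITS_BY_MODEL = {
    "openai/whisper-large-v3": TranscriptionLimits(
        max_audio_seconds=30.0,
        languages=frozenset({"en", "it", "de", "fr"}),
    ),
}


def validate_request(model: str, language: str, audio_seconds: float) -> None:
    """Raise ValueError if the request violates the model's limits."""
    limits = LIMITS_BY_MODEL.get(model)
    if limits is None:
        raise ValueError(f"Model {model!r} does not support transcription")
    if language not in limits.languages:
        raise ValueError(f"Language {language!r} is not supported by {model!r}")
    if audio_seconds > limits.max_audio_seconds:
        raise ValueError(
            f"Audio is {audio_seconds:.1f}s long; {model!r} accepts at most "
            f"{limits.max_audio_seconds:.1f}s")
```

Supporting another model would then mean adding a registry entry rather than touching the endpoint logic.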


github-actions bot commented Feb 7, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the documentation, ci/build, and frontend labels Feb 7, 2025
Varun Sundar Rabindranath and others added 10 commits February 7, 2025 17:37
Signed-off-by: NickLucche <[email protected]>
@NickLucche
Contributor Author

Had to rebase and force-push to make some very old commits compliant with DCO and pre-commit.

Labels
ci/build, documentation, frontend

2 participants