[Doc] Documentation on how to run infinity on AMD GPU #400
Comments
@tjtanaa It works pretty much out of the box. Not many providers (Modal, Azure, ...) offer ROCm, and I don't have a local development setup. I would be glad if there
```
infinity_emb v2 --model-id mixedbread-ai/mxbai-rerank-base-v1
```
@michaelfeil Thank you. It works now, though the Python environment is a bit bloated with a lot of NVIDIA packages. I also had to add the --no-bettertransformers flag, as optimum.bettertransformers throws an error that mha_var_len is not supported on ROCm.
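Putting the two comments above together, the full working invocation would look something like the sketch below. This is not an official command from the project docs; the `--no-bettertransformers` flag name is taken verbatim from the comment and may be spelled differently in your infinity version.

```shell
# Sketch: serve the reranker on a ROCm GPU, disabling the BetterTransformer
# path (optimum's BetterTransformer reportedly fails on ROCm with an
# mha_var_len error, per the comment above).
infinity_emb v2 \
  --model-id mixedbread-ai/mxbai-rerank-base-v1 \
  --no-bettertransformers
```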
Working command:

I have a question: the benchmark found in the documentation seems to state that it was run without
@tjtanaa On my ROCm device/setup I saw the model segfault with torch.compile. I am unsure if it's identical to the above error. I think a Docker image with these default settings baked in (similar to vLLM) would be great to look into. Happy to collaborate here if you are interested @tjtanaa
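A minimal sketch of what building and running such a ROCm image could look like, assuming a hypothetical `Dockerfile.rocm` built on a ROCm PyTorch base image (the image tag and Dockerfile name are assumptions, not part of the project):

```shell
# Hypothetical build: a Dockerfile.rocm layered on a ROCm PyTorch base image.
docker build -t infinity-rocm -f Dockerfile.rocm .

# ROCm containers need the kfd and dri devices passed through to the container;
# 7997 is infinity's default HTTP port.
docker run --device=/dev/kfd --device=/dev/dri -p 7997:7997 infinity-rocm \
  v2 --model-id mixedbread-ai/mxbai-rerank-base-v1
```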
Feature request
Could we get documentation on how to run infinity on an AMD GPU? I could only find a benchmark and a description stating that infinity can be run with the ROCm backend.
Motivation
Easier setup of infinity on the AMD platform.
Your contribution
Raise awareness of the demand to run infinity on AMD GPUs. Thank you.