Llava model quantization seems not to be supported #73

Open
caojinpei opened this issue Aug 10, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@caojinpei

Describe the bug
When I use llm-compressor to quantize a LLaVA model, it fails right at the beginning with: Unrecognized configuration class 'transformers.models.llava.configuration_llava.LlavaConfig'.

Expected behavior
I hope llm-compressor can support the LLaVA model.

To Reproduce
from llmcompressor.transformers import SparseAutoModelForCausalLM, oneshot

MODEL_ID = "/home/models/llava-v1.6-vicuna-7b"

# Fails here, before any quantization recipe is applied.
model = SparseAutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    trust_remote_code=True,
)

Errors
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel

Hope to get your reply, thanks.
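
For context, the failure can be reproduced with plain transformers: SparseAutoModelForCausalLM appears to delegate to AutoModelForCausalLM, which has no mapping for LlavaConfig. A minimal sketch, assuming the checkpoint resolves to the built-in LlavaConfig as the traceback shows:

from transformers import AutoModelForCausalLM

MODEL_ID = "/home/models/llava-v1.6-vicuna-7b"  # same local path as above

# AutoModelForCausalLM has no entry for LlavaConfig in its model mapping,
# so this raises the same ValueError as the llm-compressor call above.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)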

caojinpei added the bug label Aug 10, 2024
@robertgshaw2-neuralmagic
Sponsor Collaborator

Hey @caojinpei - right now we only support models with XXXForCausalLM, which LLaVA is not.

I have added support for vision-language models and general XXXForConditionalGeneration to our roadmap. If you have any capacity to contribute a feature, we'd be happy to give you some pointers to get started! Let me know!
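
For reference, below is a minimal sketch of the one-shot GPTQ flow that works today for plain XXXForCausalLM checkpoints, loosely following the llm-compressor examples. The model id, calibration dataset name, and the W8A16 preset are illustrative assumptions, not something verified for LLaVA, which fails at from_pretrained before any recipe runs.

from llmcompressor.transformers import SparseAutoModelForCausalLM, oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# Any text-only *ForCausalLM checkpoint; illustrative stand-in, not LLaVA.
MODEL_ID = "lmsys/vicuna-7b-v1.5"

model = SparseAutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Weight-only W8A16 GPTQ on all Linear layers, leaving the output head alone.
recipe = GPTQModifier(targets="Linear", scheme="W8A16", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="open_platypus",          # assumed built-in calibration dataset
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained("vicuna-7b-v1.5-W8A16")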

@caojinpei
Author

caojinpei commented Aug 12, 2024


Hi, @robertgshaw2-neuralmagic

I am glad to get your reply, and thanks for sharing the roadmap. I want to quantize the LLaVA-v1.6 model, whose architecture is LlavaLlamaForCausalLM (is that an XXXForCausalLM?), to W8A16 using GPTQ within llm-compressor.
Could you give me some detailed pointers on how to do it? Is it very hard to implement? I am also wondering: since the LLaVA model includes both a vision model and a language model, will quantizing all of them hurt LLaVA's accuracy a lot?
By the way, if I just want to quantize the language model in LLaVA-v1.6 using llm-compressor, do you have any suggestions?

Looking forward to your reply, thanks.

*Model link: https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b

@caojinpei
Author

Hi @robertgshaw2-neuralmagic
By the way, I am wondering whether llm-compressor supports llava-hf/llava-v1.6-vicuna-7b-hf, whose architecture is LlavaNextForConditionalGeneration. Can you help me check this?

*Model link: https://huggingface.co/llava-hf/llava-v1.6-vicuna-7b-hf
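
For what it's worth, that -hf checkpoint does load in plain transformers through its dedicated class, but per the reply below that alone does not make it quantizable with llm-compressor yet. A sketch, assuming a transformers version that ships LlavaNextForConditionalGeneration (>= 4.39):

from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

MODEL_ID = "llava-hf/llava-v1.6-vicuna-7b-hf"

# Loads with the dedicated LLaVA-NeXT class, since this repo is in HF-native format.
processor = LlavaNextProcessor.from_pretrained(MODEL_ID)
model = LlavaNextForConditionalGeneration.from_pretrained(MODEL_ID, device_map="auto")

# SparseAutoModelForCausalLM would still reject it, because it resolves through
# AutoModelForCausalLM, which has no mapping for LlavaNextConfig either.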

@robertgshaw2-neuralmagic
Sponsor Collaborator

robertgshaw2-neuralmagic commented Sep 9, 2024

@caojinpei apologies for the delay. Supporting vision-language models is on our roadmap but not yet available. We would definitely welcome a PR or an example though!
