Llava model quantization seems not to be supported #73

Open
caojinpei opened this issue Aug 10, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@caojinpei

Describe the bug
When I use llm-compressor to quantize a LLaVA model, it fails right at the beginning with: Unrecognized configuration class 'transformers.models.llava.configuration_llava.LlavaConfig'.

Expected behavior
I hope llm-compressor can support the LLaVA model.

To Reproduce
from llmcompressor.transformers import SparseAutoModelForCausalLM, oneshot

MODEL_ID = "/home/models/llava-v1.6-vicuna-7b"

# Fails here, before any quantization recipe is applied.
model = SparseAutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    trust_remote_code=True,
)

Errors
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel

Hope to get your reply, thanks.
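
For context, the failure can be reproduced with plain transformers: SparseAutoModelForCausalLM appears to delegate to AutoModelForCausalLM, which has no mapping for LlavaConfig. A minimal sketch, assuming the checkpoint resolves to the built-in LlavaConfig as the traceback shows:

from transformers import AutoModelForCausalLM

MODEL_ID = "/home/models/llava-v1.6-vicuna-7b"  # same local path as above

# AutoModelForCausalLM has no entry for LlavaConfig in its model mapping,
# so this raises the same ValueError as the llm-compressor call above.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)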

caojinpei added the bug label Aug 10, 2024
@robertgshaw2-neuralmagic
Sponsor Collaborator

Hey @caojinpei - right now we only support models with XXXForCausalLM, which LLaVA is not.

I have added support for vision-language models and general XXXForConditionalGeneration to our roadmap. If you have any capacity to contribute a feature, we'd be happy to give you some pointers to get started! Let me know!
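
For reference, below is a minimal sketch of the one-shot GPTQ flow that works today for plain XXXForCausalLM checkpoints, loosely following the llm-compressor examples. The model id, calibration dataset name, and the W8A16 preset are illustrative assumptions, not something verified for LLaVA, which fails at from_pretrained before any recipe runs.

from llmcompressor.transformers import SparseAutoModelForCausalLM, oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# Any text-only *ForCausalLM checkpoint; illustrative stand-in, not LLaVA.
MODEL_ID = "lmsys/vicuna-7b-v1.5"

model = SparseAutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Weight-only W8A16 GPTQ on all Linear layers, leaving the output head alone.
recipe = GPTQModifier(targets="Linear", scheme="W8A16", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="open_platypus",          # assumed built-in calibration dataset
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained("vicuna-7b-v1.5-W8A16")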

@caojinpei
Author

caojinpei commented Aug 12, 2024


Hi, @robertgshaw2-neuralmagic

I am glad to get your reply, and thanks for sharing the roadmap. I want to quantize the LLaVA-v1.6 model, whose architecture is LlavaLlamaForCausalLM (is that an XXXForCausalLM?), to W8A16 using GPTQ within llm-compressor.
Could you give me some detailed pointers on how to do it? Is it very hard to implement? I am also wondering: since the LLaVA model includes both a vision model and a language model, will quantizing all of them hurt LLaVA's accuracy a lot?
By the way, if I just want to quantize the language model in LLaVA-v1.6 using llm-compressor, do you have any suggestions?

Looking forward to your reply, thanks.

*Model link: https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b

@caojinpei
Author

Hi @robertgshaw2-neuralmagic
By the way, I am wondering whether llm-compressor supports llava-hf/llava-v1.6-vicuna-7b-hf, whose architecture is LlavaNextForConditionalGeneration. Can you help me check this?

*Model link: https://huggingface.co/llava-hf/llava-v1.6-vicuna-7b-hf
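
For what it's worth, that -hf checkpoint does load in plain transformers through its dedicated class, but per the reply below that alone does not make it quantizable with llm-compressor yet. A sketch, assuming a transformers version that ships LlavaNextForConditionalGeneration (>= 4.39):

from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

MODEL_ID = "llava-hf/llava-v1.6-vicuna-7b-hf"

# Loads with the dedicated LLaVA-NeXT class, since this repo is in HF-native format.
processor = LlavaNextProcessor.from_pretrained(MODEL_ID)
model = LlavaNextForConditionalGeneration.from_pretrained(MODEL_ID, device_map="auto")

# SparseAutoModelForCausalLM would still reject it, because it resolves through
# AutoModelForCausalLM, which has no mapping for LlavaNextConfig either.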

@robertgshaw2-neuralmagic
Sponsor Collaborator

robertgshaw2-neuralmagic commented Sep 9, 2024

@caojinpei apologies for the delay. Supporting vision-language models is on our roadmap but not yet available. We would definitely welcome a PR or an example though!
