
convert model to FP8 error #110

Open
kuangdao opened this issue Aug 26, 2024 · 1 comment
Labels: bug (Something isn't working)

kuangdao commented:

Describe the bug
Running the FP8 dynamic quantization example against a local model fails with a runtime error; the traceback is in the screenshot under Errors.

Expected behavior
oneshot applies the FP8_DYNAMIC recipe and the quantized model is saved without errors.

Environment
Include all relevant environment information:

  1. OS: Ubuntu 20.04
  2. Python version: 3.10.12
  3. LLM Compressor version or commit hash: 0.1.0
  4. ML framework (torch) version: 2.4.0
  5. Other Python package versions (e.g. vLLM, compressed-tensors, numpy, ONNX): not provided (a quick way to collect these is sketched below)
  6. Other relevant environment information (e.g. hardware, CUDA version): not provided
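
A quick way to collect the package versions in item 5 (a minimal sketch; the package names are just the ones the template lists):

import importlib.metadata as md

# Print the installed version of each package the issue template mentions.
for pkg in ("vllm", "compressed-tensors", "numpy", "onnx"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")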

To Reproduce
Run the following script:

from llmcompressor.transformers import SparseAutoModelForCausalLM
from transformers import AutoTokenizer

MODEL_ID = "/data/models/deepseek-coder-6.7b-base/"

model = SparseAutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)


from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# Configure the simple PTQ quantization: dynamic FP8 on all Linear
# layers, leaving lm_head unquantized.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

# Apply the quantization algorithm.
oneshot(model=model, recipe=recipe)

# Save the model. With a path like "/data/models/<name>/",
# split("/")[1] is "data", so take the last non-empty component instead.
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-FP8-Dynamic"
model.save_pretrained(SAVE_DIR)
tokenizer.save_pretrained(SAVE_DIR)

Errors

[Screenshot 2024-08-26 10:49:07: error traceback]

Additional context
None provided.

kuangdao added the bug (Something isn't working) label on Aug 26, 2024
robertgshaw2-neuralmagic (Sponsor, Collaborator) commented:

Can you share:

  • your torch version
  • whether you are running on CPU or GPU?

It looks like the max operation for FP8 is not supported on your torch version.
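
A quick way to check both of those, and whether this torch build implements a max reduction on FP8 tensors (a minimal sketch, assuming, per the comment above, that the failing op is a max-style reduction on a float8 tensor):

import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

try:
    # Cast a small tensor to FP8 (e4m3) and take a max reduction over it;
    # FP8 quantization needs ops like this, and older torch builds or
    # CPU-only paths may not implement them.
    f8 = torch.randn(4, 4).to(torch.float8_e4m3fn)
    print("max over an fp8 tensor:", torch.max(f8))
except (RuntimeError, AttributeError) as e:
    print("FP8 max not supported on this build:", e)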
