Abstraction for `resolve_torch_dtype_device(dtype: Dtype, device: Device) -> tuple[quantization_type, torch.device, torch.dtype]` #424

michaelfeil · 2024-10-14T15:48:25Z

Feature request

There is too much boilerplate around dtype and device handling.

Proposed: a single helper that resolves the loading dtype, the quantization, and the device.

E.g.:
device: auto -> cuda if torch.cuda.is_available(), otherwise mps
dtype: float32 -> float32, no quantization
dtype: float16 -> float16, no quantization
dtype: bfloat16 -> bfloat16, no quantization
dtype: auto -> (bfloat16 if possible else float16) if device is cuda else float32, no quantization
dtype: int8 -> float32, int8 quantization
dtype: fp8 -> float32, fp8 quantization
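
The rules above could be sketched roughly as follows. This is only an illustration, not code from infinity: the `Dtype` and `Device` enums are hypothetical stand-ins, `Resolved` and the keyword flags are invented for the example, and the function returns plain strings instead of `torch.device` / `torch.dtype` so the logic can be tested without torch installed. Two points are assumptions on top of the list: `auto` falls back to `cpu` when neither cuda nor mps is available, and `bfloat16` resolves to `bfloat16`.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Dtype(str, Enum):  # hypothetical stand-in for the project's Dtype
    auto = "auto"
    float32 = "float32"
    float16 = "float16"
    bfloat16 = "bfloat16"
    int8 = "int8"
    fp8 = "fp8"


class Device(str, Enum):  # hypothetical stand-in for the project's Device
    auto = "auto"
    cpu = "cpu"
    cuda = "cuda"
    mps = "mps"


class Resolved(NamedTuple):  # hypothetical result container
    quantization: Optional[str]  # None, "int8" or "fp8"
    device: str  # name to pass to torch.device(...)
    dtype: str  # name to look up with getattr(torch, ...)


def resolve_torch_dtype_device(
    dtype: Dtype,
    device: Device,
    *,
    cuda_available: bool = False,  # e.g. torch.cuda.is_available()
    mps_available: bool = False,  # e.g. torch.backends.mps.is_available()
    bf16_supported: bool = False,  # e.g. torch.cuda.is_bf16_supported()
) -> Resolved:
    # device: auto -> cuda if available, else mps.
    # The final cpu fallback is an assumption, not part of the list above.
    if device is Device.auto:
        if cuda_available:
            device = Device.cuda
        elif mps_available:
            device = Device.mps
        else:
            device = Device.cpu

    # int8 / fp8: load as float32 and report the matching quantization.
    if dtype in (Dtype.int8, Dtype.fp8):
        return Resolved(dtype.value, device.value, "float32")

    if dtype is Dtype.auto:
        # auto: (bfloat16 if possible else float16) on cuda, float32 elsewhere.
        if device is Device.cuda:
            torch_dtype = "bfloat16" if bf16_supported else "float16"
        else:
            torch_dtype = "float32"
    else:
        # Explicit float dtypes map to themselves, no quantization.
        torch_dtype = dtype.value

    return Resolved(None, device.value, torch_dtype)
```

A caller with torch available would fill the flags from the probes named in the comments and convert the result with `torch.device(result.device)` and `getattr(torch, result.dtype)`. Passing the probes in as arguments is a design choice for the sketch: it keeps the mapping a pure function that is easy to unit-test on machines without a GPU.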

Motivation

Your contribution

EricLiclair · 2024-10-22T14:44:04Z

@michaelfeil I believe this should be a method of `__Infinity_EnvManager`:

infinity/libs/infinity_emb/infinity_emb/env.py

Line 24 in 62a07c9

class __Infinity_EnvManager:

Or should it be a standalone function in the same file?
