Vec2Box mismatching tensor devices on v9-m and v9-s only #145

Open
jnt0rrente opened this issue Jan 1, 2025 · 2 comments
Labels: bug (Something isn't working)

Comments


jnt0rrente commented Jan 1, 2025

Describe the bug

When creating a converter by following the hf_demo example, behavior differs depending on model size. For some sizes this raises:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor.

This happens when DEFAULT_MODEL is v9-m or v9-s, and does not happen with either v9-c or v7. Tracing the problem through the yolo code yielded no results, so I am asking here in case this is an oversight on my part.

To Reproduce

This is the relevant code; no other part of my code interfaces with the yolo module. I believe copying and pasting this will reproduce the exact same result.

# Imports assumed from the hf_demo example: torch, OmegaConf, and the
# yolo package's public helpers.
import torch
from omegaconf import OmegaConf

from yolo import (
    AugmentationComposer,
    NMSConfig,
    PostProcess,
    create_converter,
    create_model,
)

DEFAULT_MODEL = "v9-m"
IMAGE_SIZE = (640, 640)

def load_model(model_name, device):
    model_cfg = OmegaConf.load(f"./modules/yolo_config/model/{model_name}.yaml")
    model_cfg.model.auxiliary = {}  # drop the auxiliary (training-only) branch
    model = create_model(model_cfg, True)
    model.to(device).eval()
    return model, model_cfg

class YoloTracker:
    component_name = 'yolo_tracker'

    def __init__(self):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model, self.model_cfg = load_model(DEFAULT_MODEL, self.device)
        # The converter is created after the model has already been moved to CUDA
        self.converter = create_converter(self.model_cfg.name, self.model, self.model_cfg.anchor, IMAGE_SIZE, self.device)
        self.class_list = ['Person']  # OmegaConf.load("./modules/yolo_config/dataset/coco.yaml").class_list
        self.transform = AugmentationComposer([])

        nms_confidence = 0.5
        nms_iou = 0.5
        max_bbox = 100

        nms_config = NMSConfig(nms_confidence, nms_iou, max_bbox)
        self.post_process = PostProcess(self.converter, nms_config)

Expected behavior

I would expect the issue to arise either on all model sizes or on none of them.

System Info:

  • OS: Arch 6.12.7-arch1-1
  • Python Version: 3.10.16
  • PyTorch Version: 2.5.1+cu124
  • CUDA/cuDNN/MPS Version: 12.7
  • YOLO Model Version: not applicable
jnt0rrente added the bug label on Jan 1, 2025
jnt0rrente (Author) commented

On further investigation, I have traced the issue to the dummy forward pass used for auto-anchor size inference, which runs because the faulty model configurations do not provide the stride parameter.
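
A minimal sketch of that failure mode (a hypothetical standalone reduction, not the actual yolo internals): a dummy input created on CPU is fed through weights that already live on CUDA.

import torch
import torch.nn as nn

# Hypothetical reduction of the mismatch: CUDA weights, CPU dummy input.
model = nn.Conv2d(3, 16, kernel_size=3).cuda()   # weight type: torch.cuda.FloatTensor
dummy = torch.zeros(1, 3, 640, 640)              # input type:  torch.FloatTensor (CPU)
model(dummy)  # raises the RuntimeError quoted above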

henrytsui000 (Member) commented

Hi,

Thank you for reporting this issue and providing detailed environment information.

The bug has been fixed in commit 5f0e785. The issue was caused by the way we set up the converter; it has been resolved by moving the model to the correct device only after building the converter.
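
If you cannot update yet, here is a minimal workaround sketch along the same lines; it reuses the names from the reproduction code above, so treat the exact signatures as an assumption rather than the library's canonical API:

def load_model_and_converter(model_name, device):
    model_cfg = OmegaConf.load(f"./modules/yolo_config/model/{model_name}.yaml")
    model_cfg.model.auxiliary = {}
    model = create_model(model_cfg, True)  # model is still on CPU here
    # Build the converter first: its stride-inferring dummy pass now runs
    # with the model and the dummy input on the same (CPU) device.
    converter = create_converter(model_cfg.name, model, model_cfg.anchor, IMAGE_SIZE, device)
    model.to(device).eval()  # move to the target device only afterwards
    return model, model_cfg, converter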

Best regards,
Henry Tsui
