
Validation fails when image names are not int convertible. #67

Open
Abdul-Mukit opened this issue Aug 13, 2024 · 11 comments · May be fixed by #79
Labels
bug Something isn't working

Comments

@Abdul-Mukit
Contributor

Abdul-Mukit commented Aug 13, 2024

First of all, thank you very much for this much-needed initiative.

Describe the bug

This repo assumes that image file names are always convertible to int. That is not true in general. If the image names in the validation split are not int convertible, validation fails. For example, if tests/data/images/val/000000151480.jpg were renamed to tests/data/images/val/000000151480_.jpg and instances_val.json were updated accordingly, validation would fail.

This is the initial error message:

Error executing job with overrides: ['task=train', 'model=v9-s', 'dataset=mock', 'task.epoch=1']
Traceback (most recent call last):
  File "/home/abdul/projects/YOLO/yolo/lazy.py", line 39, in main
    solver.solve(dataloader)
  File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 148, in solve
    mAPs = self.validator.solve(self.validation_dataloader, epoch_idx=epoch_idx)
  File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 252, in solve
    predict_json.extend(predicts_to_json(img_paths, predicts, rev_tensor))
  File "/home/abdul/projects/YOLO/yolo/utils/model_utils.py", line 154, in predicts_to_json
    "image_id": int(Path(img_path).stem),
ValueError: invalid literal for int() with base 10: '000000151480_'

I am open to making PRs and would appreciate input on how to solve this.

To Reproduce

Steps to reproduce the behavior:

  1. Rename tests/data/images/val/000000151480.jpg to tests/data/images/val/000000151480_.jpg.
  2. Edit tests/data/annotations/instances_val.json to change "file_name": "000000151480.jpg", to "file_name": "000000151480_.jpg".
  3. Try training using python yolo/lazy.py task=train model=v9-s dataset=mock task.epoch=1.
  4. See error

Expected behavior

In the */annotations/*.json files of COCO datasets, images are supposed to have int ids and string file_names. A file name can contain any character, and the logic of this codebase should accommodate that. Instead of using the filename as an id, it should use the actual id of the image as given in the .json file.
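
For illustration, here is a minimal entry from a COCO-style instances file (the field values are hypothetical); note that id and file_name are independent:

{
  "images": [
    {"id": 151480, "file_name": "000000151480_.jpg", "width": 640, "height": 480}
  ]
}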

Screenshots

The problem is that "image_id": int(Path(img_path).stem) assumes the stem will always be convertible to int.

Even if the int() is removed to make it "image_id": Path(img_path).stem, the solution is fundamentally wrong, because image_id is not the filename. In instances_val.json that particular image already has an int id assigned to it. The id (int) and the filename (string) can both be completely arbitrary and need not resemble each other.
[screenshot: the instances_val.json entry showing the image's int id alongside its string file_name]

Instead of taking the correct id, the file name was used as the id. This was done because ModelValidator.solve()'s dataloader doesn't return the actual ids, only the img_paths.
[screenshot: predicts_to_json in model_utils.py using the filename stem as image_id]

Thus we run into another assertion error:

Error executing job with overrides: ['task=train', 'model=v9-s', 'dataset=mock', 'task.epoch=1']
Traceback (most recent call last):
  File "/home/abdul/projects/YOLO/yolo/lazy.py", line 39, in main
    solver.solve(dataloader)
  File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 148, in solve
    mAPs = self.validator.solve(self.validation_dataloader, epoch_idx=epoch_idx)
  File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 260, in solve
    result = calculate_ap(self.coco_gt, predict_json)
  File "/home/abdul/projects/YOLO/yolo/utils/solver_utils.py", line 12, in calculate_ap
    coco_dt = coco_gt.loadRes(pd_path)
  File "/home/abdul/projects/YOLO/.venv/lib/python3.10/site-packages/pycocotools/coco.py", line 327, in loadRes
    assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
AssertionError: Results do not correspond to current coco set

System Info (please complete the following information):

  • OS: [e.g. Ubuntu 22.04]
  • Python Version: [3.10]
  • PyTorch Version: [2.4.0]
  • CUDA/cuDNN/MPS Version: []
  • YOLO Model Version: [e.g. YOLOv9-s]

Additional context

My dataset was produced directly from CVAT, with int image_ids and random strings for filenames. I have used these datasets with other DNN libraries without hitting this issue. I am confident that I am not breaking any of COCO's dataset conventions. The assumption that the filename is an int and the same as the id is very limiting. In many industry applications, images are timestamped, like <time>_<date>_<location>.jpg. Moreover, when editing datasets with libraries like datumaro, the image ids in the .json can change at any time.

@Abdul-Mukit added the bug label Aug 13, 2024
@yjmm10

yjmm10 commented Aug 13, 2024

I have encountered this case too, and I tried to solve it by matching the filename against the img_path.
You can try it with this code in solver_utils.py:

from pathlib import Path

def calculate_ap(coco_gt: COCO, pd_path):
    # At the top of calculate_ap, remap each prediction's filename-stem
    # "image_id" back to the real COCO image id before loadRes is called.
    for idx, pd in enumerate(pd_path):
        for img_id, img in coco_gt.imgs.items():  # coco_gt.imgs is keyed by image id
            if Path(img["file_name"]).stem == pd["image_id"]:
                pd_path[idx]["image_id"] = img_id
    # ... rest of calculate_ap unchanged

Did you train the model successfully? I trained the model but I can't get an AP above 0. Do you know how to solve it?
If you have any questions, feel free to talk to me.

@Abdul-Mukit
Contributor Author

Abdul-Mukit commented Aug 14, 2024

@yjmm10 thank you. I appreciate your input. I want to share my thoughts about the code snippet you shared.

The solution will work but will be extremely expensive. It has a time complexity of O(m*n): you have m bounding box detections in total (the length of pd_path in your code) and n images (the length of coco_gt.imgs). For every detection, you go through every image and do a string comparison. If you had 1,000 images, each with on average 10 detections, you would be doing 1,000 * 1,000 * 10 = 10,000,000 string comparisons just to fix a mistake some other function made before the call to calculate_ap.
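
As a cheaper alternative, here is a minimal sketch of the same remapping built around a precomputed lookup table (remap_image_ids is a hypothetical helper name; it assumes the same pd_path structure as in your snippet):

from pathlib import Path
from pycocotools.coco import COCO

def remap_image_ids(coco_gt: COCO, predictions: list) -> None:
    # Build the filename-stem -> image id lookup once: O(n).
    stem_to_id = {Path(img["file_name"]).stem: img_id for img_id, img in coco_gt.imgs.items()}
    # Each prediction then costs a single dict lookup: O(m) overall.
    for pd in predictions:
        pd["image_id"] = stem_to_id.get(pd["image_id"], pd["image_id"])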

I think it would benefit us all if the design of this repo were improved. The dataloader should return the actual image_id to begin with, and predicts_to_json should be changed to accept a list of ints as its first argument. The first argument should be image_id_list, not img_paths. It may look something like this:

def predicts_to_json(image_id_list: list[int], predicts, rev_tensor):
    """
    Convert a batch of predictions (n x 6 per image) into a list of
    COCO-style detection dicts, using the real image ids.
    """
    batch_json = []
    for image_id, bboxes, box_reverse in zip(image_id_list, predicts, rev_tensor):
        # Undo the resize/shift transform, then convert xyxy -> xywh for COCO.
        scale, shift = box_reverse.split([1, 4])
        bboxes[:, 1:5] = (bboxes[:, 1:5] - shift[None]) / scale[None]
        bboxes[:, 1:5] = transform_bbox(bboxes[:, 1:5], "xyxy -> xywh")
        for cls, *pos, conf in bboxes:
            bbox = {
                "image_id": image_id,
                "category_id": IDX_TO_ID[int(cls)],
                "bbox": [float(p) for p in pos],
                "score": float(conf),
            }
            batch_json.append(bbox)
    return batch_json
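
A hypothetical call site in ModelValidator.solve(), assuming the validation dataloader is extended to yield image_ids instead of img_paths:

# image_ids comes from the modified dataloader, not from parsing file names:
predict_json.extend(predicts_to_json(image_ids, predicts, rev_tensor))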

@yjmm10

yjmm10 commented Aug 14, 2024

@Abdul-Mukit You are right. I tried it on my small dataset, but on a big dataset it is slow.

I will try it and open a PR.

Did you run into other issues during training? If you get it working, would you share how you did it?

@Abdul-Mukit
Contributor Author

@yjmm10 No, I have not trained with this model yet. I got stuck on this bug.

@Abdul-Mukit
Contributor Author

Abdul-Mukit commented Aug 14, 2024

I am actually a bit hesitant about the design choices in the repository. For example, the function calculate_ap(coco_gt: COCO, pd_path) requires a COCO object. By contrast, look at the mean average precision function from torchmetrics: https://lightning.ai/docs/torchmetrics/stable/detection/mean_average_precision.html
I think the design choices of torchmetrics are much better: easier to understand and debug for devs.

@yjmm10

yjmm10 commented Aug 14, 2024

I am actually a bit hesitant about the design choices in the repository. For example, the function calculate_ap(coco_gt: COCO, pd_path) requires a COCO object. By contrast, look at the mean average precision function from torchmetrics: https://lightning.ai/docs/torchmetrics/stable/detection/mean_average_precision.html I think the design choices of torchmetrics are much better: easier to understand and debug for devs.

I agree that the calculate_ap function is not well designed.

  1. The repository currently favors COCO and YOLO formats. However, when using YOLO, the AP calculation requires a JSON file, which is not an ideal approach.
  2. The dataset format should be standardized and loaded through a dataloader. Additionally, the AP calculation could use torchmetrics for a more convenient and streamlined process (see the sketch below).
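
For reference, a minimal sketch of the torchmetrics interface being suggested (the tensors are made-up values; this is not the repo's current API):

import torch
from torchmetrics.detection import MeanAveragePrecision

preds = [{
    "boxes": torch.tensor([[10.0, 20.0, 50.0, 80.0]]),
    "scores": torch.tensor([0.9]),
    "labels": torch.tensor([0]),
}]
targets = [{
    "boxes": torch.tensor([[12.0, 18.0, 48.0, 82.0]]),
    "labels": torch.tensor([0]),
}]

metric = MeanAveragePrecision(box_format="xyxy")
metric.update(preds, targets)
print(metric.compute()["map"])  # no JSON round-trip or COCO object needed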

@Abdul-Mukit
Contributor Author

@yjmm10 I am working on a fix. I saw your MR where you edited load_data in #72. I'll ping you again on that MR if needed. From my investigation, I think my fix will land somewhere in that area, so it would be good to stay in sync with you.

@yjmm10

yjmm10 commented Aug 15, 2024

@yjmm10 I am working on a fix. I saw your MR where you edited load_data in #72. I'll ping you again on that MR if needed. From my investigation, I think my fix will land somewhere in that area, so it would be good to stay in sync with you.

OK. I have now completed simple validation for the val dataset. The image_id will be handled in another MR. Feel free to talk to me, or contribute your code to YOLO.

@Abdul-Mukit
Contributor Author

Found the problem, I think:

import json
from pathlib import Path
from typing import Dict, List, Tuple

# discretize_categories and organize_annotations_by_image are helpers defined elsewhere in the repo.

def create_image_metadata(labels_path: str) -> Tuple[Dict[int, List], Dict[str, Dict]]:
    """
    Create a dictionary containing image information and annotations indexed by image ID.

    Args:
        labels_path (str): The path to the annotation json file.

    Returns:
        - annotations_index: A dictionary where keys are image IDs and values are lists of annotations.
        - image_info_dict: A dictionary where keys are image file names without extension and values are image information dictionaries.
    """
    with open(labels_path, "r") as file:
        labels_data = json.load(file)
        id_to_idx = discretize_categories(labels_data.get("categories", [])) if "categories" in labels_data else None
        annotations_index = organize_annotations_by_image(labels_data, id_to_idx)  # keyed by the JSON image id
        image_info_dict = {Path(img["file_name"]).stem: img for img in labels_data["images"]}
        return annotations_index, image_info_dict

annotations_index uses the image_id stored in the .json file as its keys, while image_info_dict uses an image_id derived from the file name. Both should key off the image_id derived from the file name only. Will be making the MR shortly. Thanks.
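
A minimal sketch of the re-keying being proposed, assuming annotations_index is keyed by the JSON image id as in the snippet above (labels_data and annotations_index are the variables from that snippet):

from pathlib import Path

# Map each JSON image id to its filename stem, then re-key annotations_index
# so both dictionaries share the same filename-stem keys:
id_to_stem = {img["id"]: Path(img["file_name"]).stem for img in labels_data["images"]}
annotations_index = {id_to_stem[img_id]: anns for img_id, anns in annotations_index.items()}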

@yjmm10

yjmm10 commented Aug 15, 2024

[screenshot: image_info_dict and its keys in the debugger]
The image_info_dict variable is keyed by image_id. If we use image_id as the key, it is also troublesome to obtain the original path of the image when we open it.

@Abdul-Mukit
Contributor Author

The image_info_dict variable is keyed by image_id. If we use image_id as the key, it is also troublesome to obtain the original path of the image when we open it.

We already have the dataset_path and phase_name variables available to us, and the "images" folder name is common to both COCO and YOLO datasets. That means we can reconstruct the original path of the image easily:
image_path = dataset_path / "images" / phase_name / f"{image_id}.jpg"

The only concerning part is that .jpg. Under the proposal of using image_id as our key we would be getting rid of the extension, and images can also be in .png format. So I guess a better key would be the image's file name with its extension but without any path info.
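
A minimal sketch of that reconstruction, using the file name (with extension) as the key; resolve_image_path is a hypothetical helper, and dataset_path / phase_name are the existing variables mentioned above:

from pathlib import Path

def resolve_image_path(dataset_path: Path, phase_name: str, file_name: str) -> Path:
    # file_name keeps its extension (.jpg, .png, ...) but carries no directory info,
    # so the full path can be rebuilt from the known dataset layout.
    return dataset_path / "images" / phase_name / file_name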
