Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

modify default value of stride to 64 in the function letterbox #13150

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

TommeyChang
Copy link

@TommeyChang TommeyChang commented Jun 29, 2024

When the default value of stride is 32, it confronts an error when the width or height of the image is resized to 864.
I fix the bug from setting the stride to 64. Thus, I suggest to change the default value.


I have read the CLA Document and I sign the CLA

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Small update to the image preprocessing in YOLOv5 to allow more flexible stride options.

📊 Key Changes

  • Modified the letterbox function's stride parameter from a default of 32 to 64.

🎯 Purpose & Impact

  • Improved Flexibility: By increasing the default stride value, images can be resized and padded with more flexible stride options.
  • Potential Performance Impact: This may affect model inference times or performance based on how images are processed, possibly improving efficiency for certain applications.
  • User Convenience: Users may experience improved outcomes in training and inference without needing to manually adjust stride settings.

Overall, this change aims to optimize image preprocessing, benefiting both developers and end-users by making the model slightly more adaptable and efficient. 📈✨

Copy link
Contributor

github-actions bot commented Jun 29, 2024

All Contributors have signed the CLA. ✅
Posted by the CLA Assistant Lite bot.

@TommeyChang
Copy link
Author

I have read the CLA Document and I sign the CLA

@TommeyChang
Copy link
Author

recheck

@glenn-jocher
Copy link
Member

@TommeyChang hello,

Thank you for bringing this to our attention! To help us investigate the issue further, could you please provide a minimum reproducible code example? This will allow us to better understand the problem and work towards a solution. You can refer to our guidelines on creating a minimum reproducible example here: Minimum Reproducible Example. 🛠️

Additionally, please ensure that you are using the latest versions of torch and the YOLOv5 repository from Ultralytics. Sometimes, updating to the latest versions can resolve unexpected issues.

Looking forward to your response so we can assist you further!

@TommeyChang
Copy link
Author

TommeyChang commented Jul 12, 2024

When using the letterbox to resize an image of (1280, 1964, 3) into (1280, 1280) with stride=32, one dimension of the image is changed into 864.
Then, the following error raises: Sizes of tensors must match except in dimension 1. Expected size 28 but got size 27 for tensor number 1 in the list.
This error can be fixed set the stride as 64.

The example codes are:

img = cv2.imread(img_path)
# check the size of original image
print(img_2_tensor.shape)

img_2_tensor = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_2_tensor = letterbox(img_2_tensor, new_shape=(1280, 1280), stride=32)[0]
# check the size of resized image
print(img_2_tensor.shape)

img_2_tensor = torch.from_numpy(img_2_tensor.transpose((2, 0, 1))).float().to(device) / 255.0
if img_2_tensor.ndimension() == 3:
    img_2_tensor = img_2_tensor.unsqueeze(0)

pred = model(img_2_tensor)[0]

@glenn-jocher
Copy link
Member

Hello @TommeyChang,

Thank you for providing a detailed description of the issue and the example code! This is very helpful. 😊

To address your concern, it seems like the error arises due to the mismatch in tensor sizes when the image is resized with a stride of 32. Changing the stride to 64 appears to resolve the issue in your case.

Before we proceed further, could you please confirm the following:

  1. Minimum Reproducible Example: Ensure that the provided code snippet is a complete, minimal example that reproduces the issue. This helps us investigate the problem more effectively. You can refer to our guidelines on creating a minimum reproducible example here: Minimum Reproducible Example.
  2. Latest Versions: Verify that you are using the latest versions of torch and the YOLOv5 repository from Ultralytics. Sometimes, updating to the latest versions can resolve unexpected issues.

If the issue persists after these checks, we can further investigate the possibility of modifying the default stride value or providing additional flexibility in the letterbox function.

Thank you for your cooperation and for being an active member of the YOLO community! 🌟

@TommeyChang
Copy link
Author

TommeyChang commented Jul 14, 2024

Bug description:

When the default value of stride is 32, it confronts an tensor size mismatch error when using letterbox to resize an image of (1280, 1964, 3) into (1280, 1280) with stride=32.

MRE

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install
import torch
import numpy as np
import cv2

from models.common import DetectMultiBackend
from utils.augmentations import letterbox

model_path = 'checkpoints/yolov5m6.pt'

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = DetectMultiBackend(model_path, device=device)
model.eval()

img = np.zeros((1280, 1964, 3), dtype=np.uint8)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

img = letterbox(img, new_shape=(1280, 1280), stride=32)[0]
# check the size of resized image, (864, 1280, 3)
assert img.shape == (864, 1280, 3)

img_2_tensor = torch.from_numpy(img.transpose((2, 0, 1))).float().to(device) / 255.0
img_2_tensor = img_2_tensor.unsqueeze(0)

pred = model(img_2_tensor)[0]

Error message:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 28 but got size 27 for tensor number 1 in the list.

Dependencies:

  • torch==2.2.2
  • ultralytics==8.2.34

@glenn-jocher
Copy link
Member

Hello @TommeyChang,

Thank you for providing a detailed bug report and the minimum reproducible example (MRE). This is very helpful for diagnosing the issue. 😊

The tensor size mismatch error you're encountering when using letterbox with a stride of 32 is indeed concerning. Here are a few steps to help address this issue:

  1. Verify Latest Versions: Ensure you are using the latest versions of torch and the YOLOv5 repository. Sometimes, issues are resolved in newer releases. You can update your packages using:

    pip install --upgrade torch ultralytics
  2. Adjusting Stride: As you mentioned, setting the stride to 64 resolves the issue. You can modify the letterbox function call in your code to use this stride:

    img = letterbox(img, new_shape=(1280, 1280), stride=64)[0]
  3. Customizing letterbox Function: If you frequently encounter this issue, you might want to customize the letterbox function to handle different stride values more gracefully. Here’s an example of how you can modify the function:

    def letterbox(img, new_shape=(1280, 1280), color=(114, 114, 114), stride=32):
        # Your existing letterbox code here
        # Ensure the new_shape dimensions are divisible by the stride
        new_shape = [math.ceil(x / stride) * stride for x in new_shape]
        # Rest of the function
  4. Community Feedback: If this is a recurring issue for many users, we can consider updating the default stride value or adding more flexibility in future releases. Your feedback is valuable, and we encourage you to open an issue or a pull request on our GitHub repository to discuss this further with the community.

Thank you for your contribution and for being an active member of the YOLO community! If you have any further questions or need additional assistance, feel free to ask. 🌟

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants