tellarin/sea-vl-image-collection

 
 


Image Bulk Submission Guide

If you have many images to contribute to SEA-VL, this guide explains how to bulk upload them to the SEA-VL image collection GitHub repository.

UI Tool-based Upload

Tip

We now have a UI tool that can make the process of adding the image descriptions and generating the required CSVs much easier!

  1. Navigate into the batch_uploader folder. Fill out your details in the contributor_details.yaml file. You only need to do this the first time.
  2. Place images in the ./to_upload folder as follows:
    • Any individual images may be kept directly in the ./to_upload folder
    • If submitting multiple images that are very closely related to each other (e.g. the same object photographed from multiple angles or with different levels of zoom), place all such images in a single sub-folder within the ./to_upload folder
  3. Run process_and_label.py:
    • If the CSV (seavl_batch_labels - YOUR NAME.csv) already exists, the script will warn you. Type y to continue, and the script will append to the existing CSV. IMPORTANT: if you have already submitted the CSV, type n, manually delete the CSV, and restart the script; otherwise, duplicate entries will be created in subsequent steps of the SEA-VL pipeline.
    • For each image shown, fill in the English and native language boxes, and click next. The following will then happen:
      • A new line will be added to the CSV.
      • The image (or all the images, if the image shown was part of a sub-folder) will be processed (resized and renamed as required) and added to the ../data folder.
      • The image or sub-folder will be moved into the ./processing_complete folder.
  4. Submit your hard work:
    • Raise a PR (to the main branch) with the new images added to the ../data/ folder.
    • Send us the seavl_batch_labels - YOUR NAME.csv file through Discord / email for review.
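
To check how the ./to_upload folder will be read before running the tool, the layout rules in step 2 can be sketched as below. This is only an illustration, not the code in process_and_label.py: the function name plan_entries and the accepted extension list are assumptions, and so is the reading that each sub-folder counts as a single entry.

```python
from pathlib import Path

# Assumed set of image extensions; the real tool may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def plan_entries(to_upload: Path) -> list[list[Path]]:
    """Group the contents of to_upload into entries (hypothetical helper).

    A top-level image is an entry of its own; all images inside one
    sub-folder form a single entry of closely related images.
    """
    entries = []
    for item in sorted(to_upload.iterdir()):
        if item.is_file() and item.suffix.lower() in IMAGE_EXTENSIONS:
            entries.append([item])
        elif item.is_dir():
            group = sorted(
                p for p in item.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
            if group:  # skip sub-folders that contain no images
                entries.append(group)
    return entries


if __name__ == "__main__":
    root = Path("./to_upload")
    if root.is_dir():
        for entry in plan_entries(root):
            print(len(entry), "image(s):", ", ".join(p.name for p in entry))
```

Running it from the batch_uploader folder prints one line per entry, which makes it easy to spot a related image that was left outside its sub-folder.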

Manual Upload

  1. Upload the images to the data folder in this repo.
    • ⚠️ Please follow the current naming convention: name each photo file {file_name} - {your_name}.{extension}.

    • ⚠️ Keep the maximum image width/height at 2000px. If an image is larger, resize it, for example with the following code:

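      from PIL import Image

      # input_path and output_path are placeholders: set them to your own file paths.
      image = Image.open(input_path)
      (w, h) = image.size
      # Only resize when a side exceeds 2000px; keep the aspect ratio.
      if w > 2000 or h > 2000:
          if w > h:
              h = int(h * 2000. / w)
              w = 2000
          else:
              w = int(w * 2000. / h)
              h = 2000
          image = image.resize((w, h), Image.BILINEAR)
      image.save(output_path)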
      
  2. Use the CSV sample (link) to provide more details on the submitted photos.
    • ⚠️ Leave the empty column empty.
    • ⚠️ For the email column, just enter [email protected].
    • ⚠️ For the image column, enter https://raw.githubusercontent.com/SEACrowd/sea-vl-image-collection/refs/heads/main/data/{file_name}, where {file_name} is exactly the file name you submitted to the repository.
  3. As the last step, send us the CSV through Discord / email ([email protected] or [email protected]), and we will review the CSV and images. If there are no issues, we will push the photos to the annotation platform.
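
The naming convention, the 2000px limit, and the image column value from the manual steps can be put together in a short sketch. The helper names (submission_name, fit_within, image_url) are hypothetical and not part of this repo. The percent_encode option is an assumption: the guide gives the URL with the file name as-is, so encoding is off by default.

```python
from pathlib import Path
from urllib.parse import quote

# Base URL copied from the guide; the file name is appended to it.
RAW_BASE = (
    "https://raw.githubusercontent.com/SEACrowd/"
    "sea-vl-image-collection/refs/heads/main/data/"
)
MAX_SIDE = 2000  # maximum width/height in pixels, per the guide


def submission_name(original: str, contributor: str) -> str:
    """Build '{file_name} - {your_name}.{extension}' from an original name."""
    p = Path(original)
    return f"{p.stem} - {contributor}{p.suffix}"


def fit_within(w: int, h: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Return the size after scaling the longer side down to max_side."""
    if max(w, h) <= max_side:
        return w, h  # already small enough, no resize needed
    if w > h:
        return max_side, int(h * max_side / w)
    return int(w * max_side / h), max_side


def image_url(file_name: str, percent_encode: bool = False) -> str:
    """Value for the CSV image column (encoding spaces is an assumption)."""
    return RAW_BASE + (quote(file_name) if percent_encode else file_name)


if __name__ == "__main__":
    name = submission_name("sunset.jpg", "Jane Doe")
    print(name)
    print(fit_within(4000, 3000))
    print(image_url(name))
```

The size returned by fit_within can be passed straight to Pillow's image.resize, and the name returned by submission_name is the same string that goes into both the data folder and the image column.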

Any questions? Message @rc.z or @samuelcahyawijaya on Discord!
