If you have many images to contribute to SEA-VL, this guide explains how to bulk-upload them to the SEA-VL image collection GitHub repo.
Tip
We now have a UI tool that can make the process of adding the image descriptions and generating the required CSVs much easier!
- Navigate into the `batch_uploader` folder. Fill out your details in the `contributor_details.yaml` file. You only need to do this the first time.
- Place images in the `./to_upload` folder as follows:
  - Any individual images may be kept directly in the `./to_upload` folder.
  - If submitting multiple images that are very closely related to each other (e.g. the same object photographed from multiple angles or with different levels of zoom), place all such images in a single sub-folder within the `./to_upload` folder.
- Run `process_and_label.py`:
  - If the CSV (`seavl_batch_labels - YOUR NAME.csv`) already exists, the script will warn you. Type `y` to continue, and the script will append to the existing CSV. IMPORTANT: if you have already submitted the CSV, type `n`, manually delete the CSV, and restart the script; otherwise, duplicate entries will be created in subsequent steps of the SEA-VL pipeline.
  - For each image shown, fill in the English and native language boxes, and click next. The following will happen:
    - A new line will be added to the CSV.
    - The image (or all the images, if the image shown was part of a sub-folder) will be added to the `../data` folder after being processed (resized and renamed as required).
    - The image or sub-folder will be moved into the `./processing_complete` folder.
- Submit your hard work:
  - Raise a PR (to the `main` branch) with the new images added into the `../data/` folder.
  - Send us the `seavl_batch_labels - YOUR NAME.csv` CSV through Discord / email for review.
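
Before running the script, it can help to check that `./to_upload` is laid out as described above: loose images as individual entries, and closely related images grouped in a sub-folder. The sketch below is only a sanity check and is not part of `process_and_label.py`; the function name `list_submission_units` and the set of accepted extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: the extensions treated as images here are illustrative;
# the actual tool may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_submission_units(to_upload):
    """Return (name, [image file names]) pairs.

    Each loose image in the folder is its own unit; each sub-folder
    containing at least one image is a single unit holding all of them.
    """
    units = []
    for entry in sorted(Path(to_upload).iterdir()):
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS:
            units.append((entry.name, [entry.name]))
        elif entry.is_dir():
            images = sorted(
                p.name
                for p in entry.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
            if images:
                units.append((entry.name, images))
    return units


if __name__ == "__main__":
    root = Path("./to_upload")
    if root.is_dir():
        for name, images in list_submission_units(root):
            print(f"{name}: {len(images)} image(s)")
```

Each printed line should correspond to one image or image group that you expect to describe once in the UI.
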
- Upload the images to the `data` folder in this repo.
  - ⚠️ Please follow the current naming convention: each photo file should be named `{file_name} - {your_name}.{extension}`.
  - ⚠️ Keep the maximum image width/height to 2000px. You may need to resize some images; you can use the following code to do so.

        from PIL import Image

        # input_path and output_path are placeholders for your own file paths.
        image = Image.open(input_path)
        (w, h) = image.size
        if w > 2000 or h > 2000:
            # Scale the longer side down to 2000px, keeping the aspect ratio.
            if w > h:
                h = int(h * 2000. / w)
                w = 2000
            else:
                w = int(w * 2000. / h)
                h = 2000
            image = image.resize((w, h), Image.BILINEAR)
        # Images already within the limit are saved unchanged.
        image.save(output_path)
- Use the CSV sample (link) to provide more details on the submitted photos.
  - ⚠️ Keep the empty column empty.
  - ⚠️ For the `email` column, just fill it with [email protected].
  - ⚠️ For the `image` column, please fill it with `https://raw.githubusercontent.com/SEACrowd/sea-vl-image-collection/refs/heads/main/data/{file_name}`. `file_name` should be the same as the one you submitted to the repository.
- As the last step, you can send us the CSV through Discord / email ([email protected] or [email protected]) and we will review the CSV and images. If there is no issue, we will push the photos to the annotation platform.
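
The naming convention and the `image` column value described above can be produced with a small helper, sketched here. The function names `submission_name` and `image_url` and the contributor name in the usage example are hypothetical. Percent-encoding the spaces in the file name is an assumption; check the CSV sample for the form the maintainers expect.

```python
from pathlib import Path
from urllib.parse import quote

# Base URL taken from the instructions for the `image` column.
RAW_BASE = (
    "https://raw.githubusercontent.com/SEACrowd/"
    "sea-vl-image-collection/refs/heads/main/data/"
)


def submission_name(original, contributor):
    """Build '{file_name} - {your_name}.{extension}' from an existing file name."""
    path = Path(original)
    return f"{path.stem} - {contributor}{path.suffix}"


def image_url(file_name):
    """Build the raw GitHub URL for a file as named in the data folder.

    Assumption: spaces and other special characters are percent-encoded.
    """
    return RAW_BASE + quote(file_name)
```

For example, `submission_name("nasi lemak.jpg", "Jane Doe")` gives `nasi lemak - Jane Doe.jpg`, and passing that result to `image_url` appends `nasi%20lemak%20-%20Jane%20Doe.jpg` to the base URL.
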
Any questions? Message @rc.z or @samuelcahyawijaya on Discord!