Ref-Youtube-VOS

Model Zoo

To evaluate the results, please upload the zip file to the competition server.

| Backbone | J&F | CFBI J&F | Pretrain | Model | Submission | CFBI Submission |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 | 55.6 | 59.4 | weight | model | link | link |
| ResNet-101 | 57.3 | 60.3 | weight | model | link | link |
| Swin-T | 58.7 | 61.2 | weight | model | link | link |
| Swin-L | 62.4 | 63.3 | weight | model | link | link |
| Video-Swin-T* | 55.8 | - | - | model | link | - |
| Video-Swin-T | 59.4 | - | weight | model | link | - |
| Video-Swin-S | 60.1 | - | weight | model | link | - |
| Video-Swin-B | 62.9 | - | weight | model | link | - |

* indicates the model is trained from scratch.

The following models are jointly trained with the Ref-COCO/+/g datasets.

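| Backbone | J&F | J | F | Model | Submission |
| --- | --- | --- | --- | --- | --- |
| ResNet-50 | 58.7 | 57.4 | 60.1 | model | link |
| ResNet-101 | 59.3 | 58.1 | 60.4 | model | link |
| Swin-L | 64.2 | 62.3 | 66.2 | model | link |
| Video-Swin-T | 62.6 | 59.9 | 63.3 | model | link |
| Video-Swin-S | 63.3 | 61.4 | 65.2 | model | link |
| Video-Swin-B | 64.9 | 62.8 | 67.0 | model | link |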

Inference & Evaluation

First, run inference with the trained model. The first command shows the general form and the second is a concrete example for Swin-Tiny.

python3 inference_ytvos.py --with_box_refine --binary --freeze_text_encoder --output_dir=[/path/to/output_dir] --resume=[/path/to/model_weight] --backbone [backbone] 
python3 inference_ytvos.py --with_box_refine --binary --freeze_text_encoder --output_dir=ytvos_dirs/swin_tiny --resume=ytvos_swin_tiny.pth --backbone swin_t_p4w7

To visualize the predicted masks, add --visualize to the above command.
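
When evaluating several checkpoints, it can be convenient to assemble the inference command programmatically. The following is a minimal sketch, assuming it is run from the repository root; the helper name `build_inference_cmd` is hypothetical, and it uses only the flags shown in the commands above.

```python
import shlex
from typing import List


def build_inference_cmd(output_dir: str, resume: str, backbone: str,
                        visualize: bool = False) -> List[str]:
    """Assemble the argument list for inference_ytvos.py (hypothetical helper)."""
    cmd = [
        "python3", "inference_ytvos.py",
        "--with_box_refine", "--binary", "--freeze_text_encoder",
        f"--output_dir={output_dir}",
        f"--resume={resume}",
        "--backbone", backbone,
    ]
    if visualize:
        # Optional flag mentioned above for dumping the predicted masks.
        cmd.append("--visualize")
    return cmd


if __name__ == "__main__":
    cmd = build_inference_cmd("ytvos_dirs/swin_tiny", "ytvos_swin_tiny.pth", "swin_t_p4w7")
    # Print the command as a shell string; to launch it, pass the list to
    # subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell-quoting problems if a path contains spaces.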

Then, enter the output_dir and rename the folder valid to Annotations. Use the following command to zip the folder:

zip -q -r submission.zip Annotations

To evaluate the results, please upload the zip file to the competition server.
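
The rename-and-zip step can also be scripted. The sketch below is one possible way to do it with the Python standard library, assuming the inference output sits in a `valid` folder directly under the output directory as described above; the function name `package_submission` is hypothetical and not part of the repository.

```python
import zipfile
from pathlib import Path


def package_submission(output_dir: str) -> Path:
    """Rename <output_dir>/valid to Annotations and write submission.zip (hypothetical helper)."""
    out = Path(output_dir)
    valid = out / "valid"
    annotations = out / "Annotations"

    # Rename only if it has not been done already, so the function can be re-run.
    if valid.is_dir() and not annotations.exists():
        valid.rename(annotations)
    if not annotations.is_dir():
        raise FileNotFoundError(f"no 'valid' or 'Annotations' folder in {out}")

    zip_path = out / "submission.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(annotations.rglob("*")):
            # Store paths relative to output_dir so every entry starts with "Annotations/".
            zf.write(path, path.relative_to(out).as_posix())
    return zip_path
```

Calling `package_submission("ytvos_dirs/swin_tiny")` would leave `submission.zip` inside that directory, with `Annotations` as the top-level folder in the archive.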

Training

  • Finetune

The following command includes the training and inference stages.

./scripts/dist_train_test_ytvos.sh [/path/to/output_dir] [/path/to/pretrained_weight] --backbone [backbone] 

For example, to train the Video-Swin-Tiny model, run the following command:

./scripts/dist_train_test_ytvos.sh ytvos_dirs/video_swin_tiny pretrained_weights/video_swin_tiny_pretrained.pth --backbone video_swin_t_p4w7

  • Train from scratch

The following command includes the training and inference stages.

./scripts/dist_train_test_ytvos_scratch.sh [/path/to/output_dir] --backbone [backbone] --backbone_pretrained [/path/to/backbone_pretrained_weight] [other args]

For example, to train the Video-Swin-Tiny model from scratch, run the following command:

./scripts/dist_train_test_ytvos_scratch.sh ytvos_dirs/video_swin_tiny_scratch --backbone video_swin_t_p4w7 --backbone_pretrained video_swin_pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth