Ref-Youtube-VOS

Model Zoo

To evaluate the results, please upload the zip file to the competition server.

Backbone	J&F	CFBI J&F	Pretrain	Model	Submission	CFBI Submission
ResNet-50	55.6	59.4	weight	model	link	link
ResNet-101	57.3	60.3	weight	model	link	link
Swin-T	58.7	61.2	weight	model	link	link
Swin-L	62.4	63.3	weight	model	link	link
Video-Swin-T*	55.8	-	-	model	link	-
Video-Swin-T	59.4	-	weight	model	link	-
Video-Swin-S	60.1	-	weight	model	link	-
Video-Swin-B	62.9	-	weight	model	link	-

* indicates the model is trained from scratch.

Results with joint training on the Ref-COCO/+/g datasets:

Backbone	J&F	J	F	Model	Submission
ResNet-50	58.7	57.4	60.1	model	link
ResNet-101	59.3	58.1	60.4	model	link
Swin-L	64.2	62.3	66.2	model	link
Video-Swin-T	62.6	59.9	63.3	model	link
Video-Swin-S	63.3	61.4	65.2	model	link
Video-Swin-B	64.9	62.8	67.0	model	link

Inference & Evaluation

First, run inference with the trained model:

python3 inference_ytvos.py --with_box_refine --binary --freeze_text_encoder --output_dir=[/path/to/output_dir] --resume=[/path/to/model_weight] --backbone [backbone]

For example, for the Swin-T model:

python3 inference_ytvos.py --with_box_refine --binary --freeze_text_encoder --output_dir=ytvos_dirs/swin_tiny --resume=ytvos_swin_tiny.pth --backbone swin_t_p4w7

To visualize the predicted masks, add --visualize to the above command.
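
When running inference for several checkpoints, it can be convenient to assemble the command programmatically. The following is a minimal sketch and not part of the repository: `build_inference_cmd` is a hypothetical helper that only reproduces the flags shown in the commands above, and the paths are the example values from this section.

```python
import shlex


def build_inference_cmd(output_dir, weights, backbone, visualize=False):
    """Return the argv list for inference_ytvos.py (hypothetical helper).

    Only flags that appear in the commands above are used.
    """
    cmd = [
        "python3", "inference_ytvos.py",
        "--with_box_refine", "--binary", "--freeze_text_encoder",
        f"--output_dir={output_dir}",
        f"--resume={weights}",
        "--backbone", backbone,
    ]
    if visualize:
        # Optional flag described above for saving mask visualizations.
        cmd.append("--visualize")
    return cmd


if __name__ == "__main__":
    cmd = build_inference_cmd(
        "ytvos_dirs/swin_tiny", "ytvos_swin_tiny.pth", "swin_t_p4w7",
        visualize=True,
    )
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.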

Then, enter the output directory and rename the folder valid to Annotations. Zip the folder with the following command:

zip -q -r submission.zip Annotations
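
The rename and zip steps can also be scripted. The sketch below is an illustration under stated assumptions, not code from the repository: `package_submission` and `top_level_entries` are hypothetical helper names, and the only layout assumed is the one described above (a `valid` folder inside the output directory). It uses the standard-library `shutil.make_archive` in place of the `zip` command.

```python
import shutil
import zipfile
from pathlib import Path


def package_submission(output_dir):
    """Rename `valid` to `Annotations` and zip it as submission.zip.

    Hypothetical helper mirroring the manual steps above.
    """
    output_dir = Path(output_dir)
    valid = output_dir / "valid"
    annotations = output_dir / "Annotations"
    if valid.is_dir() and not annotations.exists():
        valid.rename(annotations)
    if not annotations.is_dir():
        raise FileNotFoundError(f"no Annotations folder in {output_dir}")
    # make_archive appends ".zip"; root_dir/base_dir keep "Annotations/" as the
    # top-level entry, like `zip -q -r submission.zip Annotations`.
    archive = shutil.make_archive(
        str(output_dir / "submission"), "zip",
        root_dir=str(output_dir), base_dir="Annotations",
    )
    return Path(archive)


def top_level_entries(zip_path):
    """Return the set of top-level names inside the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return {name.split("/", 1)[0] for name in zf.namelist()}
```

For example, `package_submission("ytvos_dirs/swin_tiny")` would write `ytvos_dirs/swin_tiny/submission.zip`, and `top_level_entries` can be used to confirm that the archive contains only the `Annotations` folder before uploading.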

To evaluate the results, please upload the zip file to the competition server.

Training

Finetune

The following command includes the training and inference stages.

./scripts/dist_train_test_ytvos.sh [/path/to/output_dir] [/path/to/pretrained_weight] --backbone [backbone]

For example, to train the Video-Swin-Tiny model, run the following command:

./scripts/dist_train_test_ytvos.sh ytvos_dirs/video_swin_tiny pretrained_weights/video_swin_tiny_pretrained.pth --backbone video_swin_t_p4w7

Train from scratch

The following command includes the training and inference stages.

./scripts/dist_train_test_ytvos_scratch.sh [/path/to/output_dir] --backbone [backbone] --backbone_pretrained [/path/to/backbone_pretrained_weight] [other args]

For example, to train the Video-Swin-Tiny model, run the following command:

./scripts/dist_train_test_ytvos_scratch.sh ytvos_dirs/video_swin_tiny_scratch --backbone video_swin_t_p4w7 --backbone_pretrained video_swin_pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
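
The two training modes differ only in which script is called and how the pretrained weights are passed. As a rough sketch (the helper `build_train_cmd` is hypothetical; the script names and flags are taken from the two command templates in this section), the choice could be expressed as:

```python
def build_train_cmd(output_dir, backbone, pretrained_weight=None,
                    backbone_pretrained=None):
    """Return argv for the fine-tune or from-scratch script (hypothetical helper)."""
    if pretrained_weight is not None:
        # Fine-tune: the pretrained weight is the second positional argument.
        return ["./scripts/dist_train_test_ytvos.sh", output_dir,
                pretrained_weight, "--backbone", backbone]
    # Train from scratch: only the backbone may be initialized from a checkpoint.
    cmd = ["./scripts/dist_train_test_ytvos_scratch.sh", output_dir,
           "--backbone", backbone]
    if backbone_pretrained is not None:
        cmd += ["--backbone_pretrained", backbone_pretrained]
    return cmd
```

Passing `pretrained_weight` selects the fine-tune script; omitting it selects the from-scratch script, optionally with `--backbone_pretrained`.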
