feat: add longclip #20

bwanglzu · 2024-04-24T15:07:05Z

Support LongCLIP-style training via an additional longclip arg. During training, apply a non-zero PCA to image_features to obtain their principal components, and maintain a short_loss alongside the full-caption loss.
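For readers who want the idea in isolation, here is a minimal NumPy sketch of the two-loss setup described above. It is an illustration under assumptions, not the code in this PR: the PR's `PCA` helper is not shown here, so the sketch substitutes a plain SVD over the batch. The names `pca_reconstruct`, `clip_loss`, and `longclip_losses` are hypothetical, and the sketch reuses one set of image features for both losses, whereas the diff runs a separate forward pass for the short inputs.

```python
import numpy as np


def pca_reconstruct(features, k):
    """Keep only the top-k principal components of `features` (n x d).

    Stand-in for the PR's PCA helper (assumption): plain SVD over the batch.
    """
    mean = features.mean(axis=0, keepdims=True)
    centered = features - mean
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:k]  # shape (k, d)
    # Project onto the top-k directions, then map back to the original space.
    return centered @ top.T @ top + mean


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def clip_loss(image_features, text_features, logit_scale=100.0):
    """Symmetric InfoNCE loss; row i of each input is a matched pair."""
    logits = logit_scale * l2_normalize(image_features) @ l2_normalize(text_features).T

    def cross_entropy_diag(z):
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


def longclip_losses(image_features, long_text_features, short_text_features, pca_dim=32):
    # Full image features are matched against the full (long) captions.
    # PCA-reduced image features are matched against the short captions.
    coarse = pca_reconstruct(image_features, pca_dim)
    return {
        "contrastive_loss": clip_loss(image_features, long_text_features),
        "short_loss": clip_loss(coarse, short_text_features),
    }
```

With a batch of `n` samples, at most `n - 1` principal components are non-trivial, so `pca_dim` only has an effect when it is smaller than that.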

koukandre · 2024-04-24T15:11:32Z

src/open_clip/loss.py

@@ -16,6 +17,7 @@
 except ImportError:
    hvd = None

+from utils import PCA


from .utils import PCA

koukandre · 2024-04-24T15:18:59Z

src/training/train.py

@@ -200,6 +205,9 @@ def train_one_epoch(

                    losses['embedding_loss'] = args.emb_loss_weight * embedding_loss

+                if args.longclip:
+                    modelout_short = model(images_short, texts_short)
+                    loss_short = loss(**modelout_short, output_dict=True, pca_dim=32)


losses['short_loss'] = loss(**modelout_short, output_dict=True, pca_dim=32)
This one too, if we use only one loss per image-text pair.
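One reading of this suggestion, sketched under the assumption that the training loop sums every entry of `losses` into the total loss (the helper `total_from` and the numbers are hypothetical): a value kept in a local variable such as `loss_short` never reaches the total, while a value stored under `losses['short_loss']` does.

```python
def total_from(losses):
    """Sum every named loss term into one scalar (assumed training-loop behaviour)."""
    return sum(losses.values())


# Illustrative values only.
losses = {"contrastive_loss": 2.0, "embedding_loss": 0.5}

loss_short = 1.25  # computed but kept in a local variable
total_without = total_from(losses)  # short loss is not part of the total

losses["short_loss"] = loss_short  # stored alongside the other terms
total_with = total_from(losses)  # short loss now contributes
```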

bwanglzu added 2 commits April 24, 2024 17:06

feat: add longclip

01ab719

feat: add args

1c82bbc

koukandre reviewed Apr 24, 2024


feat: add longclip loss

6103e88
