Hi,
I am currently using MMDetection to train a model that operates over large-scale images containing small-area classes. To aid the detection of these classes, I train the model on slices of the larger images (e.g. 250x250 subsections with some overlap) which I have generated offline.
When performing validation during training, or testing afterwards, my val/test data is also sliced offline. Once I have a trained model, I use SAHI's get_sliced_prediction function to load the MMDetection model, process a whole image into slices, detect objects, post-process the boxes using non-max merging, and then rebuild the image.
I suspect my model's performance metrics are artificially lowered by evaluating on a per-patch basis rather than a per-image basis, so I'd like to perform this slice-and-merge process directly at validation/test time. For example, during training I'd like to keep training on the offline-sliced images but, at validation time, operate over whole images. This would let me better understand how the model performs on whole images after non-max merging.
I understand from the documentation that the validation/test processes are controlled using Loops and Hooks. However, I'm unsure how to go about creating a Hook or Loop for this, or even which of the two (or both?) I should be using. I'm struggling to find documented examples of custom Hooks/Loops online, and the source code is modular enough that I'm struggling to connect the dots. The closest thing I can find is the documented EMAHook, though I'm unsure how relevant it is to my question.
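For what it's worth, below is a rough sketch of the shape I imagine a custom validation loop might take, based on skimming the ValLoop source. The class name and the idea of doing the slicing/merging inside run() are just my guesses, not tested code:
# Rough, untested sketch -- my guess at how a custom loop might be registered.
# The slicing/merging inside the for-loop is the part I don't know how to wire up.
from mmengine.registry import LOOPS
from mmengine.runner import ValLoop


@LOOPS.register_module()
class SlicedValLoop(ValLoop):
    """Validate on whole images via sliced inference + non-max merging (sketch)."""

    def run(self):
        self.runner.call_hook('before_val')
        self.runner.call_hook('before_val_epoch')
        self.runner.model.eval()
        for idx, data_batch in enumerate(self.dataloader):
            # Here the dataloader would yield whole images. Each image would be
            # sliced, each slice run through self.runner.model, and the per-slice
            # boxes merged (e.g. NMM) before being passed to self.evaluator as
            # whole-image predictions -- this is the part I'm unsure about.
            ...
        metrics = self.evaluator.evaluate(len(self.dataloader.dataset))
        self.runner.call_hook('after_val_epoch', metrics=metrics)
        self.runner.call_hook('after_val')
        return metrics
Presumably I would then set val_cfg = dict(type='SlicedValLoop') in the config, but I don't know whether a custom Loop is the right mechanism here or whether a Hook would be more appropriate.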
Any help would be greatly appreciated.
===================================
For reference, my current config setup is below with added comments:
# dataset settings
dataset_type = 'CocoDataset'
classes = [
    'cup_coral', 'stylasterids', 'ophiocantha_vivipara', 'astrochlamys', 'demosponges',
    'asteroidea', 'glass_sponge', 'anthomastus', 'alcyonium', 'ophiuroid_5_arms',
    'actiniarian', 'gorgonian', 'pycnogonid', 'cucumber', 'pencil_urchin', 'worm_tubes',
    'crinoid', 'bryozoan', 'benthic_fish', 'crustaceans', 'ascidian_pyura_bouvetensis',
    'clump', 'echinoid', 'hydroid_solitary', 'buried_ophiuroid', 'buried_actinarian',
    'ascidian_cnemidocarpa_verrucosa', 'ascidian_distaplia'
]
data_root = 'data/patches/250_250_overlap_05/'
h, w = (250, 250)
data = dict(train=dict(classes=classes), val=dict(classes=classes), test=dict(classes=classes))
backend_args = None
albu_transforms = [
    dict(type='HorizontalFlip', p=0.5),
    dict(type='VerticalFlip', p=0.5),
]
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='LoadAnnotations',
        with_bbox=True),
    # Redundant resize, but necessary to add scale_factor to img_meta (https://github.com/open-mmlab/mmdetection/issues/11164)
    dict(
        type='Resize',
        scale=(h, w),
        keep_ratio=True),
    dict(
        type='Albu',
        transforms=albu_transforms,
        bbox_params=dict(
            type='BboxParams',
            format='pascal_voc',  # fixes error https://github.com/albumentations-team/albumentations/issues/459#issuecomment-919113188
            label_fields=['gt_bboxes_labels'],
            min_visibility=0.1,
            min_area=0.1,
            filter_lost_elements=True),
        keymap={'img': 'image', 'gt_bboxes': 'bboxes'},
        skip_img_without_anno=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'scale_factor'))
]
val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='LoadAnnotations',
        with_bbox=True),
    # Redundant resize, but necessary to add scale_factor to img_meta (https://github.com/open-mmlab/mmdetection/issues/11164)
    dict(
        type='Resize',
        scale=(h, w),
        keep_ratio=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'scale_factor'))
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='LoadAnnotations',
        with_bbox=True),
    # Redundant resize, but necessary to add scale_factor to img_meta (https://github.com/open-mmlab/mmdetection/issues/11164)
    dict(
        type='Resize',
        scale=(h, w),
        keep_ratio=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'scale_factor'))
]
metainfo = dict(
    classes=classes,  # same class list as defined above
    palette=[(255, 0, 0), (255, 55, 0), (255, 111, 0), (255, 167, 0), (255, 223, 0),
             (231, 255, 0), (175, 255, 0), (119, 255, 0), (63, 255, 0), (7, 255, 0),
             (0, 255, 47), (0, 255, 103), (0, 255, 159), (0, 255, 215), (0, 239, 255),
             (0, 183, 255), (0, 127, 255), (0, 71, 255), (0, 15, 255), (39, 0, 255),
             (95, 0, 255), (151, 0, 255), (207, 0, 255), (252, 0, 244), (255, 0, 191),
             (255, 0, 135), (255, 0, 79), (255, 0, 23)])
train_dataloader = dict(
    batch_size=64,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        # Training on the offline-sliced data
        ann_file='data/patches/250_250_overlap_05/annotations/dataset_train_patches.json',
        data_prefix=dict(img=data_root + 'images/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=train_pipeline,
        backend_args=backend_args))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        # Could I make this a json of non-sliced images and then use SAHI's get_sliced_prediction function or similar at val time?
        ann_file='data/patches/250_250_overlap_05/annotations/dataset_val_patches.json',
        data_prefix=dict(img=data_root + 'images/'),
        test_mode=True,
        pipeline=val_pipeline,
        backend_args=backend_args))
val_evaluator = dict(
    type='CocoMetric',
    # Could I make this a json of non-sliced images and then use SAHI's get_sliced_prediction function or similar at val time?
    ann_file='data/patches/250_250_overlap_05/annotations/dataset_val_patches.json',
    metric='bbox',
    format_only=False,
    backend_args=backend_args)
# inference on the test dataset, formatting the output results for submission
test_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        # Could I make this a json of non-sliced images and then use SAHI's get_sliced_prediction function or similar at test time?
        ann_file='data/patches/250_250_overlap_05/annotations/dataset_test_patches.json',
        data_prefix=dict(img=data_root + 'images/'),
        test_mode=True,
        pipeline=test_pipeline))
test_evaluator = dict(
    type='CocoMetric',
    metric='bbox',
    format_only=False,
    # Could I make this a json of non-sliced images and then use SAHI's get_sliced_prediction function or similar at test time?
    ann_file='data/patches/250_250_overlap_05/annotations/dataset_test_patches.json',
    outfile_prefix='./work_dirs/save_dir')
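To make the question in those comments concrete, this is roughly how I imagine the validation section eventually looking: whole-image annotations, with the slicing and merging happening inside some custom loop or hook. The 'data/full_images/...' paths below are hypothetical placeholders rather than files I currently have:
# Hypothetical target config -- whole-image validation set instead of patches.
# 'data/full_images/...' paths are placeholders I would still need to create.
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        metainfo=metainfo,
        ann_file='data/full_images/annotations/dataset_val_full.json',
        data_prefix=dict(img='data/full_images/images/'),
        test_mode=True,
        pipeline=val_pipeline))
val_evaluator = dict(
    type='CocoMetric',
    ann_file='data/full_images/annotations/dataset_val_full.json',
    metric='bbox',
    format_only=False)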
When performing SAHI after model training, I use:
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# I could use this at test time, since I would have the best checkpoint and config,
# but how do I do this at val time during the training process?
model = AutoDetectionModel.from_pretrained(
    model_type='mmdet',
    model_path=best_checkpoint,
    config_path=config_src,
    confidence_threshold=bbox_conf_threshold,
    device='cuda')

slice_height, slice_width = patch_size  # e.g. (250, 250)
results_sahi = get_sliced_prediction(
    full_path,
    model,
    slice_height=slice_height,
    slice_width=slice_width,
    overlap_height_ratio=overlap,  # e.g. 0.5
    overlap_width_ratio=overlap,
    postprocess_type='NMM')
results_sahi_coco = results_sahi.to_coco_annotations()  # sent for visualisation
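At test time I could presumably score these merged, whole-image predictions against a whole-image COCO json using pycocotools, along the lines of the rough sketch below (whole_image_ann_file and image_id are placeholders for my own evaluation script, and the attribute names come from the objects get_sliced_prediction returns):
# Rough sketch of offline, per-image scoring with pycocotools.
# 'whole_image_ann_file' is a COCO json of non-sliced images (placeholder name),
# and 'image_id' is the id of the image passed to get_sliced_prediction.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO(whole_image_ann_file)

detections = []
for pred in results_sahi.object_prediction_list:
    detections.append({
        'image_id': image_id,
        # note: category ids must match the ids used in the whole-image gt json
        'category_id': pred.category.id,
        'bbox': [pred.bbox.minx, pred.bbox.miny,
                 pred.bbox.maxx - pred.bbox.minx,   # COCO bboxes are xywh
                 pred.bbox.maxy - pred.bbox.miny],
        'score': pred.score.value,
    })

coco_dt = coco_gt.loadRes(detections)
coco_eval = COCOeval(coco_gt, coco_dt, iouType='bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
But that would still be a manual, offline step per checkpoint; my question is really how to get this kind of per-image evaluation to run automatically at validation time during training.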