Evaluation, Reproducibility, Benchmarks Meeting 18

AReinke edited this page Jul 27, 2022 · 2 revisions

Minutes of meeting 18

Date: 27th July 2022

Present: Carole, Annika


TOP 1: Metric implementation

  • Carole has implemented all metrics from the Metrics Reloaded framework (as of June 2022; we will send her the final list of metrics once it is finalized)
  • Open questions:
    • Consider an object detection problem in which the algorithm perfectly predicted the location of the reference object but assigned it to the wrong class. Such a prediction is partially correct; if validation is performed per class, it may be penalized too heavily.
      • We should separate these cases and define them as a new biomedical question that distinguishes only foreground (all classes merged) from background, and run a new metric selection (traversal) for it
      • For the original question (localization + categorization), it is correct to count these cases as errors
    • How should the different NaN cases be handled in the aggregation? This should be decided case by case, including:
      • Empty image, empty prediction => correct
      • Empty image, non-empty prediction => incorrect
      • Non-empty image, empty prediction => incorrect
      • Missing submission of image (e.g. in challenges) => incorrect
    • Carole will send the code to Annika and others (probably group leads)
    • The code will probably be ready by the time the paper is submitted
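The NaN cases listed above can be sketched as a small per-image helper that decides whether a score can be assigned directly or the actual metric must be computed. This is a minimal illustration only: the function name, arguments, and the 0/1 scoring convention are assumptions, not part of the Metrics Reloaded implementation.

```python
def resolve_nan_case(reference_empty, prediction_empty, submitted=True):
    """Resolve the special cases before metric aggregation.

    Returns 1.0 (counted as correct), 0.0 (counted as incorrect),
    or None when both reference and prediction are non-empty and
    the regular metric should be computed instead.
    """
    if not submitted:
        return 0.0   # missing submission of image (e.g. in challenges) => incorrect
    if reference_empty and prediction_empty:
        return 1.0   # empty image, empty prediction => correct
    if reference_empty or prediction_empty:
        return 0.0   # exactly one side empty => incorrect
    return None      # regular case: compute the actual metric
```

In an aggregation loop, the `None` return would signal that the chosen metric (e.g. Dice) should be evaluated for that image, while the 0/1 values are averaged in directly.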

TOP 2: MS lesion segmentation

  • MS Lesion Segmentation:
    • This task is typically phrased as Semantic Segmentation
    • Clinically, the lesion count is important
    • Lesion sizes vary greatly (some span only a few pixels, others several thousand)
    • It is important to not miss the tiny lesions
    • The task should rather be phrased as object detection or (better) instance segmentation
    • We will add this example as “recommendations beyond common practice” in a later version