-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add preprocessors distance_metrics
and histogram
#2299
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2299 +/- ##
==========================================
+ Coverage 94.44% 94.51% +0.07%
==========================================
Files 246 246
Lines 13745 14020 +275
==========================================
+ Hits 12981 13251 +270
- Misses 764 769 +5 ☔ View full report in Codecov by Sentry. |
I took a first look. The example using two CMIP6 models works nicely but when I try to use ERA-Interim or ERA5 as a reference dataset I get an error message about insufficient coordinate metadata:
I have no idea what might cause this problem. This is the recipe I tried:
|
Thanks for testing Axel! I think the reason for this error are different time coordinates in the cubes. Could you try to add the preprocessor |
Thanks for the idea. When using
|
Could you try to delete the following lines and run again? ESMValCore/esmvalcore/preprocessor/_time.py Lines 1061 to 1066 in b72da76
|
Commenting out those lines in
|
Then I suggest to fix |
Sounds like a plan! |
I found another little problem with this preprocessor: I would like to apply e.g. RMSE calculation to geographical distributions of multi-year annual means. For this, it would make sense to first use
Any ideas what this could work? Is the |
I am not entirely sure if I understand you correctly. What are your options to If you want annual means for multiple years instead (1 value per year for each grid cell) , I think |
Yes that's exactly what I mean:
This would be very useful, however, to compare annual means (e.g. for a benchmarking map plot)... |
Ah, so you want to get an RMSE for each grid cell that has been calculated from one time step? In that case RMSE=absolute bias so it's probably better to use the I think in general it's not possible to allow calculations of distance metrics over scalar dimensions. For example, the Pearson correlation is (AFAIK) undefined for one single value (it boils down to 0/0). In addition, it's not as easy as removing the line |
I guess I have to rethink this a bit. That happens when I try to be clever... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approval from me on a technical basis (Manu, pls fix merge conflicts), lemme know when Axel approves from testing/sci side, and will mergy-merge 🍺
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. My tests with histogram
and distance_metric
were successfuly and look fine. A few small wording suggestions for the documentation (please see additional comments).
Co-authored-by: Axel Lauer <[email protected]>
Co-authored-by: Axel Lauer <[email protected]>
2bffd03
to
cbb6680
Compare
Description
This PR adds a
distance_metrics
preprocessor that is able to calculate distance metrics between datasets and a reference dataset. In addition, a preprocessorhistogram
is added (which is necessary to calculate one of the metrics). These preprocessors have the following call signatures:If used within a recipe, exactly one dataset in the products which enter
distance_metrics
needs thereference_for_metric: true
key. Example:Currently supported metrics:
rmse
: root mean square errorweighted_rmse
: weighted root mean square errorpearsonr
: Pearson correlation coefficientweighted_pearsonr
: weighted Pearson correlation coefficientemd
: earth mover's distance/first Wasserstein metricweighted_emd
: weighted earth mover's distance/first Wasserstein metricCloses #2266
Link to documentation:
Before you get started
Checklist
It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.
All checks below this pull request were successfulCodacy issue about too many arguments cannot be fixed for preprocessor function.To help with the number pull requests: