Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New metric request: NSE #446

Open
atsyplenkov opened this issue Oct 2, 2023 · 0 comments
Open

New metric request: NSE #446

atsyplenkov opened this issue Oct 2, 2023 · 0 comments
Labels
metric 📏 a new yardstick metric

Comments

@atsyplenkov
Copy link

Nash-Sutcliffe efficiency (NSE)

Please consider adding the Nash and Sutcliffe Efficiency (NSE) metric. It is highly used in hydrological studies, as it indicates how different the modeled-observed relationship is from the 1:1 line, making it a more robust metric than R².

The NSE values, as described by Nash and Sutcliffe (1970), range from -Inf to 1.

  • An efficiency of 1 (NSE = 1) corresponds to a perfect match between the modeled and observed data.
  • An efficiency of 0 (NSE = 0) indicates that the model predictions are as accurate as the mean of the observed data.
  • An efficiency less than zero (-Inf < NSE < 0) occurs when the observed mean is a better predictor than the model.

Essentially, the closer the model efficiency is to 1, the more accurate the model is.

Reprex

library(tidymodels) # CRAN v1.1.1
library(rlang) # CRAN v1.1.1

# Vector implementation
nse_impl <-
  function(truth,
           estimate,
           case_weights = NULL) {
    1 - (sum((truth - estimate) ^ 2) / sum((truth - mean(truth)) ^ 2))
    
  }

nse_vec <-
  function(truth,
           estimate,
           na_rm = TRUE,
           case_weights = NULL,
           ...) {
    check_numeric_metric(truth, estimate, case_weights)
    
    if (na_rm) {
      result <-
        yardstick_remove_missing(truth, estimate, case_weights)
      
      truth <- result$truth
      estimate <- result$estimate
      case_weights <- result$case_weights
      
    } else if (yardstick_any_missing(truth, estimate, case_weights)) {
      return(NA_real_)
      
    }
    
    nse_impl(truth, estimate, case_weights = case_weights)
    
  }

# Vector Check
data("solubility_test")

nse_vec(truth = solubility_test$solubility,
        estimate = solubility_test$prediction)
#> [1] 0.8789135


# Dataframe implementation
nse <- function(data, ...) {
  UseMethod("nse")
}

nse <-
  new_numeric_metric(nse, direction = "maximize")

nse.data.frame <-
  function(data,
           truth,
           estimate,
           na_rm = TRUE,
           case_weights = NULL,
           ...) {
    numeric_metric_summarizer(
      name = "nse",
      fn = nse_vec,
      data = data,
      truth = !!enquo(truth),
      estimate = !!enquo(estimate),
      na_rm = na_rm,
      case_weights = !!enquo(case_weights)
    )
    
  }

nse(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 nse     standard       0.879

Created on 2023-10-03 with reprex v2.0.2

@EmilHvitfeldt EmilHvitfeldt added the metric 📏 a new yardstick metric label Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metric 📏 a new yardstick metric
Projects
None yet
Development

No branches or pull requests

2 participants