-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Survival *_vec metrics seem to assume the .data.frame version of inputs #484
Comments
Hello @tripartio 👋 welcome! I think the issue you are running into, is happening because of a typo. When using library(yardstick)
lung_surv |>
roc_auc_survival(
truth = surv_obj,
.pred
)
#> # A tibble: 5 × 4
#> .metric .estimator .eval_time .estimate
#> <chr> <chr> <dbl> <dbl>
#> 1 roc_auc_survival standard 100 0.659
#> 2 roc_auc_survival standard 200 0.679
#> 3 roc_auc_survival standard 300 0.688
#> 4 roc_auc_survival standard 400 0.648
#> 5 roc_auc_survival standard 500 0.662
roc_auc_survival_vec(
truth = lung_surv$surv_obj,
estimate = lung_surv$.pred
)
#> # A tibble: 5 × 2
#> .eval_time .estimate
#> <dbl> <dbl>
#> 1 100 0.659
#> 2 200 0.679
#> 3 300 0.688
#> 4 400 0.648
#> 5 500 0.662
lung_surv |>
brier_survival(
truth = surv_obj,
.pred
)
#> # A tibble: 5 × 4
#> .metric .estimator .eval_time .estimate
#> <chr> <chr> <dbl> <dbl>
#> 1 brier_survival standard 100 0.109
#> 2 brier_survival standard 200 0.194
#> 3 brier_survival standard 300 0.219
#> 4 brier_survival standard 400 0.222
#> 5 brier_survival standard 500 0.197
brier_survival_vec(
truth = lung_surv$surv_obj,
estimate = lung_surv$.pred
)
#> # A tibble: 5 × 2
#> .eval_time .estimate
#> <dbl> <dbl>
#> 1 100 0.109
#> 2 200 0.194
#> 3 300 0.219
#> 4 400 0.222
#> 5 500 0.197
lung_surv |>
brier_survival_integrated(
truth = surv_obj,
.pred
)
#> # A tibble: 1 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 brier_survival_integrated standard 0.158
brier_survival_integrated_vec(
truth = lung_surv$surv_obj,
estimate = lung_surv$.pred
)
#> [1] 0.1576877 |
@EmilHvitfeldt, thanks for the response, but what you gave me does not correspond at all to my understanding of what I expect from the Here is the documentation for the
First, the documentation for Second, what I expect for a How do we even obtain that particular kind of prediction? It seems to me that the list structure is the very specific If such total dependency is by design, I don't think that is a wise design decision. For example, I often use {yardstick} to calculate AUC when not using {tidymodels}. Could you please clarify me by providing accurate documentation for the Thanks for your time and consideration. |
Thank you, that is a mistake, and should be fixed. As a general rule in {yardstick}, all metrics that work like so: metric_function(data, truth = col1, estimate = col2) can be rewritten as metric_function_vec(truth = data$col1, estimate = data$col2) with the difference being that There are a couple of caveats to this. This pertains to probability, survival and curve metrics, that uses metric_function(data, truth = col1, col2)
Yes, the input structure is a little more complicated than with regression of classification metrics, but I hope that we have documented the input types clearly enough like here: https://yardstick.tidymodels.org/dev/reference/brier_survival.html#details If you want more information why that specific information is needed, I refer you to the two articles we have written about it on tidymodels.org
As far as we can tell, the input required for these predictions, are required to calculate these metrics correctly. You should be able to generate those results without {tidymodels}, but it may require some wrangling to get the format that {yardstick} expects.
I strongly disagree on this point. I'm not going to comment much on the rest of the post, because I think I have covered most of it earlier.
That is why I'm here! |
Also, i hope this doesn't come off condersending, as it isn't the point 🤞 The way you calculate the ROC curve for censored data is different than how you calculate it otherwise. See references listed here to see how censored data changes the calculations https://yardstick.tidymodels.org/dev/reference/roc_curve_survival.html#references |
@EmilHvitfeldt Thank you for your detailed response. No, it does not come off as condescending; rather, I really appreciate your time. I will need some time to absorb all this and retest what I'm trying to do. I hope to follow up next week. |
Hello. This is my first time here, so I will take the opportunity to first of all thank you for your fantastic package, which is an essential part of my ML workflow.
I'm having some trouble with the
*_vec
versions of some survival metrics in the development version of {yardstick} (version 1.2.0.9003). (I don't think these metrics exist in the 1.2.0 CRAN version.) Here is a reprex:Created on 2024-01-18 with reprex v2.1.0
I initially tried to fix the code and make a PR, but I soon found that the structure of interweaving functions is quite complicated; it probably requires someone internal to the function creation to resolve this. But I will try to share my attempts to isolate the bug.
It seems that the immediate source of the bug might be in the
check_dynamic_survival_metric()
function in the filecheck-metric.R
, which all six metric functions call (that is, the*.data.frame
versions as well as the*_vec
versions):Specifically, these metric functions all call
validate_surv_truth_list_estimate()
. However, that validation function seems to assume a very particular structure of the dataframe or tibble that is passed to the metric functions. Other than the fact that how to generate that particular structure does not seem to be well documented, the bug seems to be in that the*_vec
functions also assume that dataframe structure.Following the code for the
roc_auc_survival()
and theroc_auc_survival_vec()
functions (roc_auc_survival_vec() is the function that I personally need), it seems that in thesurv-roc_curve_survival.R
, anything that enters theroc_curve_survival_vec()
is passed on toroc_curve_survival_impl()
, which assumes that its input truthargument
is a dataframe in the desired format, rather than a vector. I would expect that theroc_curve_survival_vec()
should be called only when the dataframe method has already processed the dataframe into vector form, but that does not seem to be the case. I did not investigatebrier_survival_vec()
andbrier_survival_integrated_vec()
in such detail, but I suspect that the issue might be similar for those functions.Perhaps whoever wrote these metric functions might have written the dataframe versions from beginning to end and then added the vector versions without thoroughly testing them. If so, the bug might be resolved perhaps by rewriting the vector versions first and then writing the dataframe versions afterwards. I don't expect that any truly new code would be needed, but it seems that a lot of existing code will need to be rearranged.
I hope that the information I've given here can help to quickly identify and resolve these bugs.
The text was updated successfully, but these errors were encountered: