last_fit() $ operator is invalid for atomic vectors #716

KJT-Habitat · 2023-09-05T14:18:01Z

The problem

I am creating multiple random forest and support vector machine models for a classification problem. Each model is run four times, each time with a different set of variables, so that I can see how variable selection affects model accuracy. All of my models work fine except one: the error occurs when fitting the finalized workflow with last_fit(), using the best-performing parameters from a tune_bayes() object for an svm_rbf() model with the kernlab engine. All of the random forest models run without errors.

Reproducible example

I am unable to provide a small reproducible example, since every time I change my dataset, the model works. I am happy to share the dataset directly with the developers for testing and troubleshooting.

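# Load packages (tidyverse for read_csv(), doParallel for the cluster functions)
library(tidyverse)
library(tidymodels)
library(doParallel)

# Load data
df<-read_csv("data.csv")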

# Set seed
set.seed(5326)

# Split data
split<-initial_split(df,strata=Species)
train<-training(split)
test<-testing(split)

# Set up validation
fold<-vfold_cv(train,v=5,strata=Species)

# Create recipes
Recipe<-recipe(Species~.,data=train) %>%
     step_string2factor(Species) %>%
     step_rm(c("ID_Segs")) %>%
     step_normalize(all_numeric(),-all_outcomes())
     
# Model specification
spec<-svm_rbf(cost=tune(),
     rbf_sigma=tune()) %>%
     set_engine("kernlab") %>%
     set_mode("classification")
     
# Set up workflow
wflow<-workflow() %>%
     add_recipe(Recipe) %>%
     add_model(spec)
     
# Set parameters
param<-extract_parameter_set_dials(wflow)

# Model tuning
cl<-makePSOCKcluster(8)
registerDoParallel(cl)
tuned<-wflow %>%
     tune_bayes(
          resamples=fold,
          param_info=param,
          initial=5, 
          iter=100,
          metrics=metric_set(accuracy),
          control=control_bayes(
               no_improve=30,verbose=TRUE))
stopCluster(cl)
rm(cl)

# Select best parameters for final model
best<-select_best(tuned,"accuracy")
final<-finalize_workflow(wflow,best)
final_fit<-last_fit(final,split,
     metrics=metric_set(yardstick::accuracy,
          yardstick::f_meas,yardstick::precision,
          yardstick::recall,yardstick::kap,
          yardstick::roc_auc,yardstick::sens,
          yardstick::spec))
    
# error occurs here

→ A | error:   $ operator is invalid for atomic vectors
There were issues with some computations   A: x1
Warning message:
All models failed. Run `show_notes(.Last.tune.result)` for more information.
> show_notes(.Last.tune.result)
unique notes:
────────────────────────────────────────
$ operator is invalid for atomic vectors

The funny thing is that I got the same error on another dataset using an svm_rbf() model, and adding step_string2factor(Species) to my recipe fixed it there, but it does not fix this example. My reading led me to this thread: #150 (comment)

Any advice on what is happening here?
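
For readers unfamiliar with the message: it is a base R error, raised whenever `$` is applied to an atomic vector rather than to a list or data frame. The sketch below reproduces the message in isolation. It is not taken from the failing code, and it says nothing about where in the tidymodels or kernlab call stack the error arises in this issue; the object `x` is purely illustrative.

```r
# A named atomic vector: `$` is not defined for it
x <- c(a = 1, b = 2)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  x$a,
  error = function(e) conditionMessage(e)
)
print(msg)

# The same lookup works on a list, or with [[ ]] on the vector itself
as.list(x)$a
x[["a"]]
```

In practice the message usually means that some step returned a plain vector where later code expected a list-like object.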

Session Info

─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16 ucrt)
 os       Windows 11 x64 (build 22621)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Canada.utf8
 ctype    English_Canada.utf8
 tz       America/Toronto
 date     2023-09-05
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────  
 package      * version    date (UTC) lib source
 backports      1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
 bit            4.0.5      2022-11-15 [1] CRAN (R 4.3.1)
 bit64          4.0.5      2020-08-30 [1] CRAN (R 4.3.1)
 broom        * 1.0.5      2023-06-09 [1] CRAN (R 4.3.1)
 class          7.3-22     2023-05-03 [2] CRAN (R 4.3.1)
 classInt       0.4-9      2023-02-28 [1] CRAN (R 4.3.1)
 cli            3.6.1      2023-03-23 [1] CRAN (R 4.3.1)
 codetools      0.2-19     2023-02-01 [2] CRAN (R 4.3.1)
 colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.1)
 cowplot      * 1.1.1      2020-12-30 [1] CRAN (R 4.3.1)
 crayon         1.5.2      2022-09-29 [1] CRAN (R 4.3.1)
 data.table     1.14.8     2023-02-17 [1] CRAN (R 4.3.1)
 DBI            1.1.3      2022-06-18 [1] CRAN (R 4.3.1)
 dials        * 1.2.0      2023-04-03 [1] CRAN (R 4.3.1)
 DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.3.1)
 digest         0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
 doParallel   * 1.0.17     2022-02-07 [1] CRAN (R 4.3.1)
 dplyr        * 1.1.2      2023-04-20 [1] CRAN (R 4.3.1)
 e1071          1.7-13     2023-02-01 [1] CRAN (R 4.3.1)
 ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.3.1)
 fansi          1.0.4      2023-01-22 [1] CRAN (R 4.3.1)
 foreach      * 1.5.2      2022-02-02 [1] CRAN (R 4.3.1)
 fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.1)
 furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.1)
 future         1.33.0     2023-07-01 [1] CRAN (R 4.3.1)
 future.apply   1.11.0     2023-05-21 [1] CRAN (R 4.3.1)
 generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.1)
 ggplot2      * 3.4.3      2023-08-14 [1] CRAN (R 4.3.1)
 globals        0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
 glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.1)
 gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.0)
 GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.1)
 gridExtra      2.3        2017-09-09 [1] CRAN (R 4.3.1)
 gtable         0.3.3      2023-03-21 [1] CRAN (R 4.3.1)
 hardhat        1.3.0      2023-03-30 [1] CRAN (R 4.3.1)
 hms            1.1.3      2023-03-21 [1] CRAN (R 4.3.1)
 infer        * 1.0.4      2022-12-02 [1] CRAN (R 4.3.1)
 ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.1)
 iterators    * 1.0.14     2022-02-05 [1] CRAN (R 4.3.1)
 jsonlite       1.8.7      2023-06-29 [1] CRAN (R 4.3.1)
 kernlab      * 0.9-32     2023-01-31 [1] CRAN (R 4.3.0)
 KernSmooth     2.23-22    2023-07-10 [1] CRAN (R 4.3.1)
 lattice        0.21-8     2023-04-05 [2] CRAN (R 4.3.1)
 lava           1.7.2.1    2023-02-27 [1] CRAN (R 4.3.1)
 lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.1)
 lifecycle      1.0.3      2022-10-07 [1] CRAN (R 4.3.1)
 listenv        0.9.0      2022-12-16 [1] CRAN (R 4.3.1)
 lubridate      1.9.2      2023-02-10 [1] CRAN (R 4.3.1)
 magrittr     * 2.0.3      2022-03-30 [1] CRAN (R 4.3.1)
 MASS           7.3-60     2023-05-04 [1] CRAN (R 4.3.1)
 Matrix         1.6-1      2023-08-14 [1] CRAN (R 4.3.1)
 modeldata    * 1.2.0      2023-08-09 [1] CRAN (R 4.3.1)
 munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.1)
 nnet           7.3-19     2023-05-03 [2] CRAN (R 4.3.1)
 parallelly     1.36.0     2023-05-26 [1] CRAN (R 4.3.0)
 parsnip      * 1.1.1      2023-08-17 [1] CRAN (R 4.3.1)
 pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.1)
 pins         * 1.2.1      2023-08-16 [1] CRAN (R 4.3.1)
 pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.1)
 prodlim        2023.03.31 2023-04-02 [1] CRAN (R 4.3.1)
 proxy          0.4-27     2022-06-09 [1] CRAN (R 4.3.1)
 purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
 R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.1)
 ranger       * 0.15.1     2023-04-03 [1] CRAN (R 4.3.1)
 rappdirs       0.3.3      2021-01-31 [1] CRAN (R 4.3.1)
 Rcpp           1.0.11     2023-07-06 [1] CRAN (R 4.3.1)
 readr        * 2.1.4      2023-02-10 [1] CRAN (R 4.3.1)
 recipes      * 1.0.7      2023-08-10 [1] CRAN (R 4.3.1)
 rlang          1.1.1      2023-04-28 [1] CRAN (R 4.3.1)
 ROSE           0.0-4      2021-06-14 [1] CRAN (R 4.3.1)
 rpart          4.1.19     2022-10-21 [2] CRAN (R 4.3.1)
 rsample      * 1.1.1      2022-12-07 [1] CRAN (R 4.3.1)
 rstudioapi     0.15.0     2023-07-07 [1] CRAN (R 4.3.1)
 scales       * 1.2.1      2022-08-20 [1] CRAN (R 4.3.1)
 sessioninfo  * 1.2.2      2021-12-06 [1] CRAN (R 4.3.1)
 sf           * 1.0-14     2023-07-11 [1] CRAN (R 4.3.1)
 stringi        1.7.12     2023-01-11 [1] CRAN (R 4.3.0)
 stringr      * 1.5.0      2022-12-02 [1] CRAN (R 4.3.1)
 survival       3.5-7      2023-08-14 [1] CRAN (R 4.3.1)
 terra        * 1.7-39     2023-06-23 [1] CRAN (R 4.3.1)
 themis       * 1.0.2      2023-08-14 [1] CRAN (R 4.3.1)
 tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.3.1)
 tidymodels   * 1.1.0      2023-05-01 [1] CRAN (R 4.3.1)
 tidyr        * 1.3.0      2023-01-24 [1] CRAN (R 4.3.1)
 tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.3.1)
 timechange     0.2.0      2023-01-11 [1] CRAN (R 4.3.1)
 timeDate       4022.108   2023-01-07 [1] CRAN (R 4.3.0)
 tune         * 1.1.1      2023-04-11 [1] CRAN (R 4.3.1)
 tzdb           0.4.0      2023-05-12 [1] CRAN (R 4.3.1)
 units          0.8-3      2023-08-10 [1] CRAN (R 4.3.1)
 utf8           1.2.3      2023-01-31 [1] CRAN (R 4.3.1)
 vctrs          0.6.3      2023-06-14 [1] CRAN (R 4.3.1)
 vetiver      * 0.2.3      2023-08-14 [1] CRAN (R 4.3.1)
 vip          * 0.3.2      2020-12-17 [1] CRAN (R 4.3.0)
 vroom          1.6.3      2023-04-28 [1] CRAN (R 4.3.1)
 withr          2.5.0      2022-03-03 [1] CRAN (R 4.3.1)
 workflows    * 1.1.3      2023-02-22 [1] CRAN (R 4.3.1)
 workflowsets * 1.0.1      2023-04-06 [1] CRAN (R 4.3.1)
 yardstick    * 1.2.0      2023-04-21 [1] CRAN (R 4.3.1)

 [1] C:/Users/Jurie/AppData/Local/R/win-library/4.3
 [2] C:/Program Files/R/R-4.3.1/library

simonpcouch · 2023-09-06T19:22:12Z

Thanks for the issue! There's unfortunately not much we can do here without a reproducible example.

Could you upload data.csv to the internet and supply that URL to read_csv()? The selector all_numeric(),-all_outcomes() does raise an eyebrow—perhaps all_numeric_predictors() instead? Does the issue persist when tuning sequentially instead of in parallel?
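
The two suggestions above could be tried together along the following lines. This is a minimal sketch under stated assumptions: `iris` stands in for the unshared data.csv (it happens to have a `Species` outcome), the `step_rm()` and `step_string2factor()` steps are dropped because `iris` has no ID column and its outcome is already a factor, and the iteration counts are cut down so the example finishes quickly.

```r
library(tidymodels)

# Stand-in data (assumption): iris replaces the reporter's data.csv
set.seed(5326)
split <- initial_split(iris, strata = Species)
train <- training(split)
fold  <- vfold_cv(train, v = 5, strata = Species)

# all_numeric_predictors() selects numeric predictors only, so the
# outcome does not have to be excluded by hand
rec <- recipe(Species ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

svm_spec <- svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

wflow <- workflow() %>%
  add_recipe(rec) %>%
  add_model(svm_spec)

# No cluster is created or registered here, so tuning runs sequentially
tuned <- tune_bayes(
  wflow,
  resamples = fold,
  initial = 5,
  iter = 3,                      # small budget (assumption), for speed only
  metrics = metric_set(accuracy),
  control = control_bayes(no_improve = 2, verbose = FALSE)
)

show_best(tuned, metric = "accuracy")
```

If the error disappears with this setup on the real data, re-introducing the cluster or the original selector one at a time would show which of the two changes made the difference.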

KJT-Habitat · 2023-09-06T19:36:59Z

Thanks for your response, @simonpcouch.

I have emailed data.csv to the Gmail address linked to your GitHub account.
I will now test your two suggestions, first tuning sequentially, then using all_numeric_predictors(). I personally don't think all_numeric_predictors() will change anything, as the code above worked for all the other models. The full run takes a long time, so I am testing with iter=20 and no_improve=5.

Please confirm if you received the data.

simonpcouch · 2023-09-06T19:50:09Z

Sure thing! I did receive the data though I'm unable to reproduce the error you've shown.

KJT-Habitat · 2023-09-06T19:56:28Z

Thanks for confirming. That was fast. Did you run the code on the full dataset? How did you modify the code above? Could it be outdated package versions?

simonpcouch · 2023-09-06T20:17:17Z

Just loaded tidyverse, tidymodels, parallel, and doParallel as needed.

If you're able to put together a reprex with more minimal input data we'll be glad to take a look. Given our inability to reproduce with the provided information, I'm going to go ahead and close.
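
One possible starting point for such a reprex is sketched below. It is an assumption-laden template rather than a reproduction of the bug: `iris` replaces data.csv, the hyperparameters are fixed illustrative values instead of the tuned ones, and only the `last_fit()` step with the metric set from the original report is exercised. The model object is named `svm_spec` so that it does not mask `yardstick::spec`.

```r
library(tidymodels)

# Stand-in data (assumption): iris, not the reporter's data.csv
set.seed(5326)
split <- initial_split(iris, strata = Species)

rec <- recipe(Species ~ ., data = training(split)) %>%
  step_normalize(all_numeric_predictors())

# Fixed, illustrative hyperparameters (assumption) instead of tuned values
svm_spec <- svm_rbf(cost = 1, rbf_sigma = 0.1) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

final <- workflow() %>%
  add_recipe(rec) %>%
  add_model(svm_spec)

# Same metric set as in the original report
final_fit <- last_fit(
  final, split,
  metrics = metric_set(
    yardstick::accuracy, yardstick::f_meas, yardstick::precision,
    yardstick::recall, yardstick::kap, yardstick::roc_auc,
    yardstick::sens, yardstick::spec
  )
)

collect_metrics(final_fit)
```

Swapping `iris` for progressively smaller subsets of the real data, until the error either appears or vanishes, is one way to shrink the input to something shareable. The finished script can then be copied to the clipboard and rendered with `reprex::reprex()`.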

github-actions · 2023-09-21T00:32:26Z

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

simonpcouch closed this as completed Sep 6, 2023

github-actions bot locked and limited conversation to collaborators Sep 21, 2023
