-
Notifications
You must be signed in to change notification settings - Fork 89
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
improve docs and errors re: model formulas
- Loading branch information
1 parent
907d216
commit 5f622eb
Showing
5 changed files
with
230 additions
and
3 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
#' Formulas with special terms in tidymodels | ||
#' | ||
#' @description | ||
#' | ||
#' In R, formulas provide a compact, symbolic notation to specify model terms. | ||
#' Many modeling functions in R make use of ["specials"][stats::terms.formula], | ||
#' or nonstandard notations used in formulas. Specials are defined and handled as | ||
#' a special case by a given modeling package. For example, the mgcv package, | ||
#' which provides support for | ||
#' [generalized additive models][parsnip::gen_additive_mod] in R, defines a | ||
#' function `s()` to be in-lined into formulas. It can be used like so: | ||
#' | ||
#' ``` r | ||
#' mgcv::gam(mpg ~ wt + s(disp, k = 5), data = mtcars) | ||
#' ``` | ||
#' | ||
#' In this example, the `s()` special defines a smoothing term that the mgcv | ||
#' package knows to look for when preprocessing model input. | ||
#' | ||
#' The parsnip package can handle most specials without issue. The analogous | ||
#' code for specifying this generalized additive model | ||
#' [with the parsnip "mgcv" engine][parsnip::details_gen_additive_mod_mgcv] | ||
#' looks like: | ||
#' | ||
#' ``` r | ||
#' gen_additive_mod() %>% | ||
#' set_mode("regression") %>% | ||
#' set_engine("mgcv") %>% | ||
#' fit(mpg ~ wt + s(disp, k = 5), data = mtcars) | ||
#' ``` | ||
#' | ||
#' However, parsnip is often used in conjunction with the greater tidymodels | ||
#' package ecosystem, which defines its own pre-processing infrastructure and | ||
#' functionality via packages like hardhat and recipes. The specials defined | ||
#' in many modeling packages introduce conflicts with that infrastructure. | ||
#' | ||
#' To support specials while also maintaining consistent syntax elsewhere in | ||
#' the ecosystem, **the tidymodels delineates between two types of formulas: | ||
#' preprocessing formulas and model formulas**. Preprocessing formulas determine | ||
#' the model terms, while model formulas determine the model structure. | ||
#' | ||
#' @section Example: | ||
#' | ||
#' To create the preprocessing formula from the model formula, just remove | ||
#' the specials, retaining references to model terms themselves. For example: | ||
#' | ||
#' ``` | ||
#' model_formula <- mpg ~ wt + s(disp, k = 5) | ||
#' preproc_formula <- mpg ~ wt + disp | ||
#' ``` | ||
#' | ||
#' \itemize{ | ||
#' \item **With parsnip,** just use the model formula: | ||
#' | ||
#' ``` r | ||
#' model_spec <- | ||
#' gen_additive_mod() %>% | ||
#' set_mode("regression") %>% | ||
#' set_engine("mgcv") | ||
#' | ||
#' model_spec %>% | ||
#' fit(model_formula, data = mtcars) | ||
#' ``` | ||
#' | ||
#' \item **With workflows,** use the preprocessing formula everywhere, but | ||
#' pass the model formula to the `formula` argument in `add_model()`: | ||
#' | ||
#' ``` r | ||
#' library(workflows) | ||
#' | ||
#' wflow <- | ||
#' workflow() %>% | ||
#' add_formula(preproc_formula) %>% | ||
#' add_model(model_spec, formula = model_formula) | ||
#' | ||
#' fit(wflow, data = mtcars) | ||
#' ``` | ||
#' | ||
#' We would still use the preprocessing formula if we had added | ||
#' a recipe preprocessor using `add_recipe()` instead a formula via | ||
#' `add_formula()`. | ||
#' | ||
#' \item **With recipes**, use the preprocessing formula only: | ||
#' | ||
#' ``` r | ||
#' library(recipes) | ||
#' | ||
#' recipe(preproc_formula, mtcars) | ||
#' ``` | ||
#' | ||
#' The recipes package supplies a large variety of preprocessing techniques | ||
#' that may replace the need for specials altogether, in some cases. | ||
#' | ||
#' \item **With tune**, use a workflow (rather than a model specification | ||
#' alone), implemented as before: | ||
#' | ||
#' ``` r | ||
#' library(tune) | ||
#' library(rsample) | ||
#' | ||
#' fit_resamples(wflow, data = bootstraps(mtcars)) | ||
#' ``` | ||
#' | ||
#' } | ||
#' | ||
#' @name model_formula | ||
NULL |
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters