Fit and estimate variable importance from a workflow using many bootstrap resamples.
Source: R/vi_boots.R
Generate variable importances from a tidymodels workflow using bootstrap resampling. vi_boots() generates n bootstrap resamples, fits a model to each (creating n models), then creates n estimates of variable importance for each variable in the model.
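The resample, fit, and estimate loop described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it fits a plain lm() rather than a workflow, and it assumes the absolute t-statistic of each coefficient as the importance measure. The names boot_importance and vi, and the small n, are hypothetical.

```r
# Base-R sketch of the bootstrap loop; not the package's actual implementation.
set.seed(123)
n <- 25                       # number of bootstrap resamples (kept small here)
training_data <- mtcars

boot_importance <- lapply(seq_len(n), function(i) {
  # draw row indices with replacement to form one bootstrap resample
  idx <- sample(nrow(training_data), replace = TRUE)
  fit <- lm(qsec ~ wt + hp, data = training_data[idx, ])

  # assumed importance measure: absolute t-statistic of each coefficient,
  # dropping the intercept row
  t_stats <- summary(fit)$coefficients[-1, "t value"]
  data.frame(
    resample   = i,
    variable   = names(t_stats),
    importance = abs(t_stats),
    sign       = ifelse(t_stats >= 0, "POS", "NEG"),
    row.names  = NULL
  )
})

# one row per variable per resample, i.e. n estimates for each variable
vi <- do.call(rbind, boot_importance)
table(vi$variable)
```

Each variable ends up with n importance estimates, so the spread of those estimates can be summarised per variable, for example with quantiles.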
Arguments
- workflow
An un-fitted workflow object.
- n
An integer for the number of bootstrap resampled models that will be created.
- training_data
A tibble or dataframe of data to be resampled and used for training.
- verbose
A logical. Defaults to FALSE. If set to TRUE, prints progress of training to console.
- ...
Additional params passed to rsample::bootstraps().
Value
A tibble with a column indicating each variable in the model and a nested list of variable importances for each variable. The shape of the list may vary by model type. For example, linear models return two nested columns: the absolute value of each variable's importance and the sign (POS/NEG), whereas tree-based models return a single nested column of variable importance. Similarly, the number of nested rows may vary by model type as some models may not utilize every possible predictor.
Details
Since vi_boots() fits a new model to each resample, the argument workflow must not yet be fit. Any tuned hyperparameters must be finalized prior to calling vi_boots().
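As a rough illustration of the finalization requirement, the sketch below replaces a tune() placeholder with a concrete value using tune::finalize_workflow(), leaving the workflow un-fitted. The glmnet engine, the extra hp predictor, and the penalty value are assumptions chosen for the example; in practice the parameter values would typically come from a tuning run (for example via select_best()).

```r
library(tidymodels)

# model spec with a tunable penalty; tune() is only a placeholder
spec <-
  linear_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")

wf_tune <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt + hp, data = mtcars)) %>%
  add_model(spec)

# hypothetical tuned value, hard-coded here for illustration
best_params <- tibble(penalty = 0.01)

# substitute the tuned value; the result is still an un-fitted workflow
wf_final <- finalize_workflow(wf_tune, best_params)
```

The finalized, un-fitted wf_final is the kind of object that can then be passed as the workflow argument.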
Examples
if (FALSE) {
library(tidymodels)
# setup a workflow without fitting
wf <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt, data = mtcars)) %>%
  add_model(linear_reg())
# fit and estimate variable importance from 125 bootstrap resampled models
set.seed(123)
wf %>%
  vi_boots(n = 125, training_data = mtcars)
}