Fit and estimate variable importance from a workflow using many bootstrap resamples.
Source: R/vi_boots.R
Generate variable importances from a tidymodels workflow using bootstrap resampling.
vi_boots() generates n bootstrap resamples, fits a model to each (creating
n models), then creates n estimates of variable importance for each variable
in the model.
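The resample, fit, and estimate loop described above can be sketched in base R. This is only an illustration of the idea, not the package's internals: it assumes a plain lm() model and takes the absolute t-statistic of each coefficient as the importance score (a common convention for linear models), and the names boot_importance and vi_long are made up for the example.

```r
# Base-R sketch of the resample -> fit -> importance loop.
# Assumption: importance = |t-statistic| of each coefficient.
set.seed(123)
n <- 25  # number of bootstrap resamples (kept small for illustration)

boot_importance <- lapply(seq_len(n), function(i) {
  # draw row indices with replacement to form one bootstrap resample
  idx <- sample(nrow(mtcars), replace = TRUE)

  # fit a fresh model to this resample
  fit <- lm(qsec ~ wt + hp, data = mtcars[idx, ])

  # drop the intercept and keep one t-statistic per predictor
  t_stat <- coef(summary(fit))[-1, "t value"]

  data.frame(
    resample   = i,
    variable   = names(t_stat),
    importance = abs(t_stat),
    sign       = ifelse(t_stat >= 0, "POS", "NEG"),
    row.names  = NULL
  )
})

# one row per (resample, variable): n estimates for each variable
vi_long <- do.call(rbind, boot_importance)
head(vi_long)
```

Each variable ends up with n importance estimates, which is the distribution that vi_boots() collects for each variable.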
Arguments
- workflow
An un-fitted workflow object.
- n
An integer for the number of bootstrap resampled models that will be created.
- training_data
A tibble or dataframe of data to be resampled and used for training.
- verbose
A logical. Defaults to FALSE. If set to TRUE, prints progress of training to console.
- ...
Additional params passed to rsample::bootstraps().
Value
A tibble with a column indicating each variable in the model and a nested list of variable importances for each variable. The shape of the list may vary by model type. For example, linear models return two nested columns: the absolute value of each variable's importance and the sign (POS/NEG), whereas tree-based models return a single nested column of variable importance. Similarly, the number of nested rows may vary by model type as some models may not utilize every possible predictor.
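To make the nested shape concrete, the sketch below builds a toy stand-in for a linear-model result and summarises it. The column names (variable, .importances, importance, sign) and all values are assumptions invented for illustration. Check the names on the object that vi_boots() actually returns before adapting this.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy stand-in for a vi_boots() result from a linear model.
# Column names and values are hypothetical, not real output.
vi_result <- tibble(
  variable = c("wt", "hp"),
  .importances = list(
    tibble(importance = c(2.1, 1.8, 2.4), sign = c("NEG", "NEG", "NEG")),
    tibble(importance = c(0.9, 1.2, 0.7), sign = c("NEG", "POS", "NEG"))
  )
)

# Unnest to one row per bootstrap estimate, then summarise per variable
vi_summary <- vi_result %>%
  unnest(.importances) %>%
  group_by(variable) %>%
  summarise(
    lower = quantile(importance, 0.025, names = FALSE),
    mid   = median(importance),
    upper = quantile(importance, 0.975, names = FALSE),
    .groups = "drop"
  )

vi_summary
```

For a tree-based model the nested tibbles would carry only the importance column, so the same unnest-and-summarise pattern applies with the sign column absent.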
Details
Since vi_boots() fits a new model to each resample, the
argument workflow must not yet be fit. Any tuned hyperparameters must be
finalized prior to calling vi_boots().
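As a sketch of what finalizing looks like in practice, the snippet below plugs a chosen value into a workflow that still contains a tune() placeholder, using tune::finalize_workflow(). The decision tree model and the value 0.01 are assumptions picked for illustration. In a real analysis the parameters would typically come from a tuning result, for example via tune::select_best().

```r
library(tidymodels)

# A workflow with an unresolved tune() placeholder
wf_tune <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt + hp, data = mtcars)) %>%
  add_model(
    decision_tree(cost_complexity = tune()) %>%
      set_mode("regression")
  )

# Hypothetical "best" value, hard-coded here for illustration
best_params <- tibble(cost_complexity = 0.01)

# Replace the placeholder; the result is still unfitted,
# which is the state vi_boots() expects
wf_final <- finalize_workflow(wf_tune, best_params)
```

The finalized workflow wf_final, not wf_tune, is what would then be passed to vi_boots().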
Examples
if (FALSE) {
library(tidymodels)
# set up a workflow without fitting
wf <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt, data = mtcars)) %>%
  add_model(linear_reg())
# fit and estimate variable importance from 2000 bootstrap resampled models
set.seed(123)
wf %>%
  vi_boots(n = 2000, training_data = mtcars)
}