Fit and estimate variable importance from a workflow using many bootstrap resamples.

Generate variable importances from a tidymodels workflow using bootstrap resampling. vi_boots() generates n bootstrap resamples, fits a model to each resample (creating n models), then produces n estimates of variable importance for each variable in the model.
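The resample, fit, and estimate loop described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: it assumes a linear model and uses the absolute t-statistic of each coefficient as the importance measure (with its sign recorded separately). The names boot_importance and vi_sketch, the extra predictor hp, and the small n are all illustrative choices.

```r
# Base-R sketch of the resample -> fit -> importance loop.
# Assumption: importance for a linear model is |t-statistic| per coefficient.
set.seed(123)
n <- 200  # number of bootstrap resamples (kept small for illustration)

boot_importance <- lapply(seq_len(n), function(i) {
  # draw rows with replacement to form one bootstrap resample
  idx <- sample(nrow(mtcars), replace = TRUE)

  # fit a fresh model to this resample
  fit <- lm(qsec ~ wt + hp, data = mtcars[idx, ])

  # t-statistics for the predictors (intercept row dropped)
  t_stat <- summary(fit)$coefficients[-1, "t value"]

  data.frame(
    resample   = i,
    variable   = names(t_stat),
    importance = abs(t_stat),
    sign       = ifelse(t_stat >= 0, "POS", "NEG"),
    row.names  = NULL
  )
})

# one row per variable per resample
vi_sketch <- do.call(rbind, boot_importance)

# each variable ends up with n importance estimates
table(vi_sketch$variable)
```

Because every resample gets its own model, the spread of the n estimates for a variable reflects how sensitive its importance is to the training data.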

Usage

vi_boots(workflow, n = 2000, training_data, verbose = FALSE, ...)

Arguments

workflow: An un-fitted workflow object.
n: An integer for the number of bootstrap resampled models that will be created. Defaults to 2000.
training_data: A tibble or data frame of data to be resampled and used for training.
verbose: A logical. Defaults to FALSE. If set to TRUE, prints training progress to the console.
...: Additional arguments passed to rsample::bootstraps().

Value

A tibble with a column indicating each variable in the model and a nested list of variable importances for each variable. The shape of the list may vary by model type. For example, linear models return two nested columns: the absolute value of each variable's importance and the sign (POS/NEG), whereas tree-based models return a single nested column of variable importance. Similarly, the number of nested rows may vary by model type as some models may not utilize every possible predictor.

Details

Since vi_boots() fits a new model to each resample, the argument workflow must not yet be fit. Any tuned hyperparameters must be finalized prior to calling vi_boots().
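As a rough illustration of finalizing before calling vi_boots(), the sketch below tunes one hyperparameter and then locks in the best value with tune::finalize_workflow(), which returns a workflow that is still un-fitted. The decision tree model, the predictors, the fold count, and the names tune_wf, tuned, and final_wf are illustrative assumptions, and the code assumes vi_boots() is made available by loading the workboots package.

```r
library(tidymodels)
library(workboots)  # assumed to provide vi_boots()

# a workflow with one hyperparameter marked for tuning (illustrative model choice)
tune_wf <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt + hp + disp, data = mtcars)) %>%
  add_model(decision_tree(cost_complexity = tune()) %>% set_mode("regression"))

# tune over cross-validation folds
set.seed(123)
tuned <- tune_grid(tune_wf, resamples = vfold_cv(mtcars, v = 5), grid = 5)

# lock in the best value; the result is finalized but not yet fit
final_wf <- finalize_workflow(tune_wf, select_best(tuned, metric = "rmse"))

# pass the finalized, un-fitted workflow to vi_boots()
set.seed(123)
final_wf %>%
  vi_boots(training_data = mtcars)
```

Passing tune_wf directly would leave cost_complexity as an unresolved tune() placeholder, so the finalized workflow is the one to resample.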

Examples

if (FALSE) {
library(tidymodels)

# set up a workflow without fitting
wf <-
  workflow() %>%
  add_recipe(recipe(qsec ~ wt, data = mtcars)) %>%
  add_model(linear_reg())

# fit and estimate variable importance from 2000 bootstrap resampled models
set.seed(123)
wf %>%
  vi_boots(n = 2000, training_data = mtcars)
}