Fit the final best model to the training set and evaluate the test set

last_fit() emulates the final step of a modeling process: once the best model has been determined, it is fit on the entire training set and then evaluated on the test set.

Usage

last_fit(object, ...)

# S3 method for class 'model_spec'
last_fit(
  object,
  preprocessor,
  split,
  ...,
  metrics = NULL,
  eval_time = NULL,
  control = control_last_fit(),
  add_validation_set = FALSE
)

# S3 method for class 'workflow'
last_fit(
  object,
  split,
  ...,
  metrics = NULL,
  eval_time = NULL,
  control = control_last_fit(),
  add_validation_set = FALSE
)

Arguments

object: A parsnip model specification or an unfitted workflow(). No tuning parameters are allowed; if arguments have been marked with tune(), their values must be finalized.
...: Currently unused.
preprocessor: A traditional model formula or a recipe created using recipes::recipe().
split: An rsplit object created from rsample::initial_split() or rsample::initial_validation_split().
metrics: A yardstick::metric_set(), or NULL to compute a standard set of metrics.
eval_time: A numeric vector of time points where dynamic event time metrics should be computed (e.g., the time-dependent ROC curve). The values must be non-negative and should probably be no greater than the largest event time in the training set (see Details below).
control: A control_last_fit() object used to fine tune the last fit process.
add_validation_set: For 3-way splits into training, validation, and test sets via rsample::initial_validation_split(), should the validation set be included in the data set used to train the model? If not, only the training set is used.

Value

A single-row tibble that emulates the structure of fit_resamples() output. In addition, a list column called .workflow is attached; it contains the model (and recipe, if any) fitted on the training set. Helper functions for formatting tuning results, such as collect_metrics() and collect_predictions(), can be used with last_fit() output.
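As a minimal sketch of how the returned object might be used, the code below fits a small model and pulls the fitted workflow back out to make predictions. The data set, formula, and object names (car_split, car_res, fitted_wfl) are illustrative assumptions and not part of the original page.

```r
library(tune)
library(rsample)
library(parsnip)

set.seed(6735)
car_split <- initial_split(mtcars)

lin_mod <- linear_reg() %>%
  set_engine("lm")

# A formula is used as the preprocessor here (illustrative choice)
car_res <- last_fit(lin_mod, mpg ~ disp + wt, split = car_split)

# The fitted workflow is stored in the `.workflow` list column;
# extract_workflow() returns it directly
fitted_wfl <- extract_workflow(car_res)

# The extracted workflow can predict on new data
preds <- predict(fitted_wfl, new_data = testing(car_split))
preds
```

Indexing the list column directly, as in car_res$.workflow[[1]], should give the same fitted workflow.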

Details

This function is intended to be used after a variety of models have been fit and the final tuning parameters (if any) have been finalized. The next step is to fit the chosen model using the entire training set and verify performance using the test data.
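The sketch below illustrates one way the finalization step could look when a workflow still contains a tune() placeholder. The value deg_free = 3 is a made-up stand-in for whatever a real tuning run would select (for example via select_best()), and all object names are assumptions for illustration.

```r
library(tune)
library(recipes)
library(rsample)
library(parsnip)
library(workflows)

set.seed(6735)
car_split <- initial_split(mtcars)

# A recipe with one argument marked for tuning
tunable_rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp, deg_free = tune())

lin_mod <- linear_reg() %>%
  set_engine("lm")

tunable_wfl <- workflow() %>%
  add_recipe(tunable_rec) %>%
  add_model(lin_mod)

# Hypothetical "best" value; in practice this would come from a tuning run
best_param <- tibble::tibble(deg_free = 3)

# Replace the tune() placeholder so that no tuning parameters remain
final_wfl <- finalize_workflow(tunable_wfl, best_param)

final_res <- last_fit(final_wfl, split = car_split)
collect_metrics(final_res)
```

Passing tunable_wfl to last_fit() without finalizing it first is expected to fail, since no tuning parameters are allowed.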

Case Weights

Some models can utilize case weights during training. tidymodels currently supports two types of case weights: importance weights (doubles) and frequency weights (integers). Frequency weights are used during model fitting and evaluation, whereas importance weights are only used during fitting.

To know if your model is capable of using case weights, create a model spec and test it using parsnip::case_weights_allowed().

To use them, you will need a numeric column in your data set that has been passed through either hardhat::importance_weights() or hardhat::frequency_weights().

For functions such as fit_resamples() and the tune_*() functions, the model must be contained inside of a workflows::workflow(). To declare that case weights are used, invoke workflows::add_case_weights() with the corresponding (unquoted) column name.

From there, the packages will appropriately handle the weights during model fitting and (if appropriate) performance estimation.
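The steps above can be sketched as follows. The weighting scheme (manual-transmission cars counted twice) is purely hypothetical, as are the column and object names; the sketch only shows where each function fits.

```r
library(tune)
library(rsample)
library(parsnip)
library(workflows)
library(hardhat)

set.seed(6735)

# Hypothetical weights: give manual-transmission cars twice the importance
cars_wts <- mtcars
cars_wts$case_wts <- importance_weights(ifelse(cars_wts$am == 1, 2, 1))

wts_split <- initial_split(cars_wts)

lin_mod <- linear_reg() %>%
  set_engine("lm")

# Check that the model can use case weights
case_weights_allowed(lin_mod)

wts_wfl <- workflow() %>%
  add_formula(mpg ~ disp + wt) %>%
  add_case_weights(case_wts) %>%  # unquoted column name
  add_model(lin_mod)

wts_res <- last_fit(wts_wfl, split = wts_split)
collect_metrics(wts_res)
```

Because these are importance weights, they should influence only the model fit and not the test set metrics.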

Censored Regression Models

Three types of metrics can be used to assess the quality of censored regression models:

static: the prediction is independent of time.
dynamic: the prediction is a time-specific probability (e.g., survival probability) and is measured at one or more particular times.
integrated: same as the dynamic metric but returns the integral of the different metrics from each time point.

Which metrics are chosen by the user affects how many evaluation times should be specified. For example:

# Needs no `eval_time` value
metric_set(concordance_survival)

# Needs at least one `eval_time`
metric_set(brier_survival)
metric_set(brier_survival, concordance_survival)

# Needs at least two `eval_time` values
metric_set(brier_survival_integrated, concordance_survival)
metric_set(brier_survival_integrated, concordance_survival, brier_survival)

Values of eval_time should be less than the largest observed event time in the training data. For many non-parametric models, the results beyond the largest time corresponding to an event are constant (or NA).
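A rough sketch of passing eval_time to last_fit() for a censored regression model is shown below. It assumes that the censored and survival packages are installed and that the installed tidymodels packages are recent enough to support eval_time; the lung data, the chosen predictors, and the evaluation times of 100 and 300 days are illustrative choices only.

```r
library(tune)
library(rsample)
library(parsnip)
library(yardstick)
library(survival)
library(censored)  # censored regression engines for parsnip

set.seed(6735)

# Build a data set with a Surv outcome (illustrative predictors)
lung_surv <- lung[, c("age", "sex")]
lung_surv$surv <- Surv(lung$time, lung$status)

lung_split <- initial_split(lung_surv)

surv_mod <- survival_reg() %>%
  set_engine("survival") %>%
  set_mode("censored regression")

# One dynamic and one static metric: at least one `eval_time` is needed
surv_metrics <- metric_set(brier_survival, concordance_survival)

surv_res <- last_fit(
  surv_mod,
  surv ~ age + sex,
  split = lung_split,
  metrics = surv_metrics,
  eval_time = c(100, 300)
)

collect_metrics(surv_res)
```

The dynamic metric is expected to be reported once per evaluation time, while the static metric has no associated evaluation time.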

Examples

library(tune)
library(recipes)
library(rsample)
library(parsnip)

set.seed(6735)
tr_te_split <- initial_split(mtcars)

spline_rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_ns(disp)

lin_mod <- linear_reg() %>%
  set_engine("lm")

spline_res <- last_fit(lin_mod, spline_rec, split = tr_te_split)
spline_res
#> # Resampling results
#> # Manual resampling 
#> # A tibble: 1 × 6
#>   splits         id           .metrics .notes   .predictions .workflow 
#>   <list>         <chr>        <list>   <list>   <list>       <list>    
#> 1 <split [24/8]> train/test … <tibble> <tibble> <tibble>     <workflow>

# test set metrics
collect_metrics(spline_res)
#> # A tibble: 2 × 4
#>   .metric .estimator .estimate .config             
#>   <chr>   <chr>          <dbl> <chr>               
#> 1 rmse    standard       3.80  Preprocessor1_Model1
#> 2 rsq     standard       0.729 Preprocessor1_Model1

# test set predictions
collect_predictions(spline_res)
#> # A tibble: 8 × 5
#>   .pred id                .row   mpg .config             
#>   <dbl> <chr>            <int> <dbl> <chr>               
#> 1  22.1 train/test split     1  21   Preprocessor1_Model1
#> 2  29.6 train/test split     3  22.8 Preprocessor1_Model1
#> 3  13.5 train/test split     7  14.3 Preprocessor1_Model1
#> 4  18.9 train/test split    10  19.2 Preprocessor1_Model1
#> 5  31.0 train/test split    18  32.4 Preprocessor1_Model1
#> 6  15.4 train/test split    25  19.2 Preprocessor1_Model1
#> 7  31.2 train/test split    26  27.3 Preprocessor1_Model1
#> 8  27.4 train/test split    32  21.4 Preprocessor1_Model1

# or use a workflow

library(workflows)
spline_wfl <-
  workflow() %>%
  add_recipe(spline_rec) %>%
  add_model(lin_mod)

last_fit(spline_wfl, split = tr_te_split)
#> # Resampling results
#> # Manual resampling 
#> # A tibble: 1 × 6
#>   splits         id           .metrics .notes   .predictions .workflow 
#>   <list>         <chr>        <list>   <list>   <list>       <list>    
#> 1 <split [24/8]> train/test … <tibble> <tibble> <tibble>     <workflow>
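
For a 3-way split, the effect of add_validation_set can be sketched as below. The split proportions, formula, and object names are assumptions for illustration, and the sketch assumes versions of rsample and tune that provide initial_validation_split() and the add_validation_set argument.

```r
library(tune)
library(rsample)
library(parsnip)
library(workflows)

set.seed(6735)
# Illustrative proportions: 60% training, 20% validation, remainder test
three_way <- initial_validation_split(mtcars, prop = c(0.6, 0.2))

lin_mod <- linear_reg() %>%
  set_engine("lm")

# Default: only the training set is used for the final fit
train_only <- last_fit(lin_mod, mpg ~ disp + wt, split = three_way)

# Include the validation set in the data used for the final fit
train_val <- last_fit(
  lin_mod,
  mpg ~ disp + wt,
  split = three_way,
  add_validation_set = TRUE
)

# Compare the number of rows used by the underlying lm fits
n_train_only <- nobs(extract_fit_engine(extract_workflow(train_only)))
n_train_val <- nobs(extract_fit_engine(extract_workflow(train_val)))
c(train_only = n_train_only, train_val = n_train_val)
```

In both cases the metrics are computed on the same test set; only the data used for fitting differs.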