fit_best() takes the results of model tuning and fits the model to the
training set using the tuning parameters associated with the best performance.
Usage
fit_best(x, ...)
# S3 method for default
fit_best(x, ...)
# S3 method for tune_results
fit_best(
  x,
  metric = NULL,
  parameters = NULL,
  verbose = FALSE,
  add_validation_set = NULL,
  ...
)
Arguments
- x
The results of class tune_results (coming from functions such as tune_grid(), tune_bayes(), etc.). The control option save_workflow = TRUE should have been used.
- ...
Not currently used.
- metric
A character string (or NULL) for which metric to optimize. If NULL, the first metric is used.
- parameters
An optional 1-row tibble of tuning parameter settings, with a column for each tuning parameter. This tibble should have columns for each tuning parameter identifier (e.g. "my_param" if tune("my_param") was used). If NULL, this argument will be set to select_best(metric).
- verbose
A logical for whether to print logging messages.
- add_validation_set
When the resamples embedded in x are a split into training set and validation set, should the validation set be included in the data set used to train the model? If not, only the training set is used. If NULL, the validation set is not used for resamples originating from rsample::validation_set() while it is used for resamples originating from rsample::validation_split().
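The effect of add_validation_set can be checked by counting the rows that end up in the fitted workflow. The following is a minimal sketch, not part of the original examples: it assumes the tidymodels meta-package and the kknn engine package are installed, uses mtcars purely for illustration, and the object names (car_3way, car_val_rs, val_res, and so on) are hypothetical.

```r
# Assumption: tidymodels (which attaches tune, rsample, parsnip, workflows)
# and the kknn package are installed.
library(tidymodels)

# Three-way split, then a validation-set resampling object built from it.
set.seed(3)
car_3way   <- initial_validation_split(mtcars, prop = c(0.6, 0.2))
car_val_rs <- validation_set(car_3way)

knn_wflow <- workflow(
  mpg ~ .,
  nearest_neighbor(neighbors = tune()) %>% set_mode("regression")
)

set.seed(4)
val_res <- tune_grid(
  knn_wflow,
  resamples = car_val_rs,
  grid = 5,
  control = control_grid(save_workflow = TRUE)  # required by fit_best()
)

# Default (NULL): for validation_set() resamples, only the training set is used.
fit_train_only <- fit_best(val_res)

# Explicitly include the validation set in the data used for the final fit.
fit_train_val <- fit_best(val_res, add_validation_set = TRUE)

# Count the rows each workflow was trained on.
n_train_only <- nrow(extract_mold(fit_train_only)$predictors)
n_train_val  <- nrow(extract_mold(fit_train_val)$predictors)
c(train_only = n_train_only, train_plus_validation = n_train_val)
```

The second count should exceed the first by the size of the validation set.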
Details
This function is a shortcut for the manual steps of:
best_param <- select_best(tune_results, metric) # or other `select_*()`
wflow <- finalize_workflow(wflow, best_param) # or just `finalize_model()`
wflow_fit <- fit(wflow, data_set)
In contrast, last_fit() requires a finalized model, fits the model on the
training set defined by rsample::initial_split(), and computes metrics from
the test set.
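To make the comparison concrete, the sketch below runs the manual steps, the fit_best() shortcut, and last_fit() side by side. It is an illustration under stated assumptions rather than part of the original documentation: it assumes the tidymodels meta-package and the kknn engine package are installed, uses mtcars as stand-in data, and all object names (car_split, knn_res, manual_fit, and so on) are hypothetical.

```r
# Assumption: tidymodels and the kknn package are installed.
library(tidymodels)

set.seed(1)
car_split <- initial_split(mtcars)        # training/test split
car_train <- training(car_split)
car_test  <- testing(car_split)
car_rs    <- vfold_cv(car_train, v = 5)   # resamples of the training set

knn_wflow <- workflow(
  mpg ~ .,
  nearest_neighbor(neighbors = tune()) %>% set_mode("regression")
)

set.seed(2)
knn_res <- tune_grid(
  knn_wflow,
  resamples = car_rs,
  grid = 5,
  control = control_grid(save_workflow = TRUE)  # required by fit_best()
)

# Manual route: select the best settings, finalize, then fit.
best_param <- select_best(knn_res, metric = "rmse")
manual_fit <- knn_wflow %>%
  finalize_workflow(best_param) %>%
  fit(data = car_train)

# Shortcut: the same three steps in a single call.
short_fit <- fit_best(knn_res, metric = "rmse")

# Settings can also be supplied directly as a 1-row tibble.
custom_fit <- fit_best(knn_res, parameters = tibble(neighbors = 3))

# last_fit(): takes a finalized workflow plus the initial split, fits on the
# training portion, and evaluates on the test portion.
final_res <- knn_wflow %>%
  finalize_workflow(best_param) %>%
  last_fit(split = car_split)
collect_metrics(final_res)
```

Here fit_best() returns a fitted workflow that can be passed to predict(), whereas last_fit() returns a results object holding test-set metrics and predictions.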
Examples
library(recipes)
library(rsample)
library(parsnip)
library(dplyr)
library(tune) # provides tune_grid(), control_grid(), and fit_best()
data(meats, package = "modeldata")
meats <- meats %>% select(-water, -fat)
set.seed(1)
meat_split <- initial_split(meats)
meat_train <- training(meat_split)
meat_test <- testing(meat_split)
set.seed(2)
meat_rs <- vfold_cv(meat_train, v = 10)
pca_rec <-
  recipe(protein ~ ., data = meat_train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = tune())
knn_mod <- nearest_neighbor(neighbors = tune()) %>% set_mode("regression")
ctrl <- control_grid(save_workflow = TRUE)
set.seed(128)
knn_pca_res <-
  tune_grid(knn_mod, pca_rec, resamples = meat_rs, grid = 10, control = ctrl)
knn_fit <- fit_best(knn_pca_res, verbose = TRUE)
#> Using rmse as the metric, the optimal parameters were:
#> neighbors: 6
#> num_comp: 4
#>
#> ℹ Fitting using 161 data points...
#> ✔ Done.
predict(knn_fit, meat_test)
#> # A tibble: 54 × 1
#> .pred
#> <dbl>
#> 1 19.7
#> 2 20.1
#> 3 15.0
#> 4 13.2
#> 5 19.6
#> 6 21.1
#> 7 19.9
#> 8 18.5
#> 9 19.6
#> 10 17.9
#> # ℹ 44 more rows