
show_best() displays the top sub-models and their performance estimates.

Usage

show_best(x, ...)

# S3 method for default
show_best(x, ...)

# S3 method for tune_results
show_best(x, metric = NULL, n = 5, ...)

select_best(x, ...)

# S3 method for default
select_best(x, ...)

# S3 method for tune_results
select_best(x, metric = NULL, ...)

select_by_pct_loss(x, ...)

# S3 method for default
select_by_pct_loss(x, ...)

# S3 method for tune_results
select_by_pct_loss(x, ..., metric = NULL, limit = 2)

select_by_one_std_err(x, ...)

# S3 method for default
select_by_one_std_err(x, ...)

# S3 method for tune_results
select_by_one_std_err(x, ..., metric = NULL)

Arguments

x

The results of tune_grid() or tune_bayes().

...

For select_by_one_std_err() and select_by_pct_loss(), this argument is passed directly to dplyr::arrange() so that the user can sort the models from simplest to most complex. See the examples below. At least one term is required for these two functions.

metric

A character value for the metric that will be used to sort the models. (See https://yardstick.tidymodels.org/articles/metric-types.html for more details.) Not required if a single metric exists in x. If there are multiple metrics and none is given, the first in the metric set is used (and a warning is issued).

n

An integer for the number of top results/rows to return.

limit

The limit of loss of performance that is acceptable (in percent units). See details below.

Value

A tibble with columns for the parameters. show_best() also includes columns for performance metrics.

Details

select_best() finds the tuning parameter combination with the best performance values.

select_by_one_std_err() uses the "one-standard-error rule" (Breiman et al., 1984), which selects the simplest model that is within one standard error of the numerically optimal results.
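The rule above can be sketched with toy numbers (hypothetical values, not from the example data; in practice these means and standard errors come from the resampling results in x):

```r
# A toy sketch of the one-standard-error rule for a minimized metric:
means    <- c(0.070, 0.072, 0.080)  # mean RMSE per candidate sub-model
std_errs <- c(0.004, 0.003, 0.003)  # standard error per candidate
K        <- c(5, 21, 33)            # more neighbors = a simpler model here

best  <- which.min(means)              # numerically optimal candidate
bound <- means[best] + std_errs[best]  # one standard error above the best mean

# keep candidates within the bound, then take the simplest (largest K)
within <- means <= bound
max(K[within])
```

Here the bound is 0.074, so the third candidate is excluded and the simplest remaining model (K = 21) is chosen over the numerically best one (K = 5).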

select_by_pct_loss() selects the simplest model whose loss of performance is within some acceptable limit.

For percent loss, suppose the best model has an RMSE of 0.75 and a simpler model has an RMSE of 1. The percent loss would be (1.00 - 0.75)/1.00 * 100, or 25 percent. Note that loss will always be non-negative.
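The arithmetic from the paragraph above, as a minimal sketch (the variable names are illustrative, not part of the package API):

```r
# Percent loss of a simpler model relative to the best model,
# using the RMSE values from the text:
best_rmse    <- 0.75
simpler_rmse <- 1.00
pct_loss <- (simpler_rmse - best_rmse) / simpler_rmse * 100
pct_loss  # 25
```

With limit = 25 or more, select_by_pct_loss() would accept this simpler model; with the default limit = 2 it would not.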

References

Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and Regression Trees. Monterey, CA: Wadsworth.

Examples

# \donttest{
data("example_ames_knn")

show_best(ames_iter_search, metric = "rmse")
#> # A tibble: 5 × 12
#>       K weight_…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err
#>   <int> <chr>       <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl>
#> 1    33 triweight   0.511    10     3 rmse    standa… 0.0728    10 0.00337
#> 2     5 rank        0.411     2     7 rmse    standa… 0.0740    10 0.00328
#> 3    21 triweight   0.909    10     4 rmse    standa… 0.0742    10 0.00313
#> 4    21 cos         0.626     1     4 rmse    standa… 0.0746    10 0.00359
#> 5    19 inv         0.117     1     4 rmse    standa… 0.0758    10 0.00360
#> # … with 2 more variables: .config <chr>, .iter <int>, and abbreviated
#> #   variable names ¹​weight_func, ²​dist_power, ³​.estimator

select_best(ames_iter_search, metric = "rsq")
#> # A tibble: 1 × 6
#>       K weight_func dist_power   lon   lat .config              
#>   <int> <chr>            <dbl> <int> <int> <chr>                
#> 1    33 triweight        0.511    10     3 Preprocessor10_Model1

# To find the least complex model within one std error of the numerically
# optimal model, the number of nearest neighbors is sorted from the largest
# number of neighbors (the least complex class boundary) to the smallest
# (corresponding to the most complex model).

select_by_one_std_err(ames_grid_search, metric = "rmse", desc(K))
#> # A tibble: 1 × 13
#>       K weight_…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err
#>   <int> <chr>       <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl>
#> 1    33 triweight   0.511    10     3 rmse    standa… 0.0728    10 0.00337
#> # … with 3 more variables: .config <chr>, .best <dbl>, .bound <dbl>, and
#> #   abbreviated variable names ¹​weight_func, ²​dist_power, ³​.estimator

# Now find the least complex model that has no more than a 5% loss of RMSE:
select_by_pct_loss(
  ames_grid_search,
  metric = "rmse",
  limit = 5, desc(K)
)
#> # A tibble: 1 × 13
#>       K weight_…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err
#>   <int> <chr>       <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl>
#> 1    33 triweight   0.511    10     3 rmse    standa… 0.0728    10 0.00337
#> # … with 3 more variables: .config <chr>, .best <dbl>, .loss <dbl>, and
#> #   abbreviated variable names ¹​weight_func, ²​dist_power, ³​.estimator
# }