`show_best()` displays the top sub-models and their performance estimates.

    show_best(x, metric, n = 5, ...)

    select_best(x, metric, ...)

    select_by_pct_loss(x, ..., metric, limit = 2)

    select_by_one_std_err(x, ..., metric)

x | The results of |
---|---|
metric | A character value for the metric that will be used to sort the models. (See https://tidymodels.github.io/yardstick/articles/metric-types.html for more details). Not required if a single metric exists in |
n | An integer for the number of top results/rows to return. |
... | For |
limit | The limit of loss of performance that is acceptable (in percent units). See details below. |

A tibble with columns for the parameters. `show_best()` also includes columns for performance metrics.

`select_best()` finds the tuning parameter combination with the best performance values.

`select_by_one_std_err()` uses the "one-standard error rule" (Breiman et al., 1984), which selects the simplest model that is within one standard error of the numerically optimal results.
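As a rough illustration of the one-standard-error rule described above, the following base R sketch applies it by hand to a small summary table. The `results` data frame and its numbers are invented for illustration, and the sketch assumes that RMSE is being minimized and that a larger `K` means a simpler model. It is not the package's implementation, which may differ in details.

```r
# Hypothetical resampling summary: one row per candidate value of K.
# The numbers are invented for illustration only.
results <- data.frame(
  K       = c(5, 9, 12, 21, 33),
  mean    = c(0.0760, 0.0741, 0.0733, 0.0739, 0.0752),
  std_err = c(0.0012, 0.0011, 0.0010, 0.0011, 0.0012)
)

# Numerically optimal row (RMSE is minimized, so take the smallest mean).
best <- results[which.min(results$mean), ]

# Upper bound: the best mean plus one standard error of the best model.
bound <- best$mean + best$std_err

# Keep candidates within the bound, then order from simplest (largest K)
# to most complex (smallest K) and take the first row.
within_bound <- results[results$mean <= bound, ]
within_bound <- within_bound[order(-within_bound$K), ]
chosen <- within_bound[1, ]

chosen$K
```

Here the numerically best model uses `K = 12`, but `K = 21` is simpler and still falls inside the one-standard-error bound, so it is the one selected.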

`select_by_pct_loss()` selects the simplest model whose loss of performance is within some acceptable limit.

For percent loss, suppose the best model has an RMSE of 0.75 and a simpler model has an RMSE of 1. The percent loss would be `(1.00 - 0.75)/1.00 * 100`, or 25 percent. Note that loss will always be non-negative.
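The arithmetic in this example can be checked with a few lines of base R. The helper `pct_loss()` below is hypothetical (it is not a function from the package) and simply mirrors the expression quoted in the text; the package's internal computation may differ. The `limit` value of 2 is the default shown in the usage above.

```r
# Hypothetical helper mirroring the expression in the text above;
# this is not a function exported by the package.
pct_loss <- function(candidate, best) {
  (candidate - best) / candidate * 100
}

# Best model RMSE = 0.75, simpler model RMSE = 1.00 (values from the text).
loss <- pct_loss(candidate = 1.00, best = 0.75)

# Compare against an acceptable limit, in percent units.
limit <- 2
acceptable <- loss <= limit

loss
acceptable
```

With these values the loss is 25 percent, far above a limit of 2, so the simpler model would not be considered acceptable.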

Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984).
*Classification and Regression Trees.* Monterey, CA: Wadsworth.

    #> # A tibble: 5 x 11
    #>       K weight_func dist_power   lon   lat .iter .metric .estimator   mean     n
    #>   <int> <chr>            <dbl> <int> <int> <dbl> <chr>   <chr>       <dbl> <int>
    #> 1    33 triweight        0.325    10     3     0 rmse    standard   0.0733    10
    #> 2    21 cos              0.415     1     4     0 rmse    standard   0.0744    10
    #> 3     5 rank             0.245     2     7     0 rmse    standard   0.0747    10
    #> 4    12 epanechnik…      1.13      4     7     0 rmse    standard   0.0753    10
    #> 5     9 optimal          1.00      2     8     8 rmse    standard   0.0755    10
    #> # … with 1 more variable: std_err <dbl>

    select_best(ames_iter_search, metric = "rsq")
    #> # A tibble: 1 x 5
    #>       K weight_func dist_power   lon   lat
    #>   <int> <chr>            <dbl> <int> <int>
    #> 1    33 triweight        0.325    10     3

    # To find the least complex model within one std error of the numerically
    # optimal model, the number of nearest neighbors are sorted from the largest
    # number of neighbors (the least complex class boundary) to the smallest
    # (corresponding to the most complex model).
    select_by_one_std_err(ames_grid_search, metric = "rmse", desc(K))
    #> # A tibble: 1 x 12
    #>       K weight_func dist_power   lon   lat .metric .estimator   mean     n
    #>   <int> <chr>            <dbl> <int> <int> <chr>   <chr>       <dbl> <int>
    #> 1    33 triweight        0.325    10     3 rmse    standard   0.0733    10
    #> # … with 3 more variables: std_err <dbl>, .best <dbl>, .bound <dbl>

    # Now find the least complex model that has no more than a 5% loss of RMSE:
    select_by_pct_loss(ames_grid_search, metric = "rmse", limit = 5, desc(K))
    #> # A tibble: 1 x 12
    #>       K weight_func dist_power   lon   lat .metric .estimator   mean     n
    #>   <int> <chr>            <dbl> <int> <int> <chr>   <chr>       <dbl> <int>
    #> 1    33 triweight        0.325    10     3 rmse    standard   0.0733    10
    #> # … with 3 more variables: std_err <dbl>, .best <dbl>, .loss <dbl>