`show_best()` displays the top sub-models and their performance estimates.

```r
show_best(x, metric = NULL, n = 5, ...)

select_best(x, metric = NULL, ...)

select_by_pct_loss(x, ..., metric = NULL, limit = 2)

select_by_one_std_err(x, ..., metric = NULL)
```

| Argument | Description |
|---|---|
| `x` | The results of `tune_grid()` or `tune_bayes()`. |
| `metric` | A character value for the metric that will be used to sort the models. (See https://tidymodels.github.io/yardstick/articles/metric-types.html for more details.) Not required if a single metric exists in `x`. |
| `n` | An integer for the number of top results/rows to return. |
| `...` | For `select_by_one_std_err()` and `select_by_pct_loss()`, this argument is passed directly to `dplyr::arrange()` so that the models can be sorted from most simple to most complex (see the examples below). |
| `limit` | The limit of loss of performance that is acceptable (in percent units). See details below. |

A tibble with columns for the parameters. `show_best()` also includes columns for performance metrics.

`select_best()` finds the tuning parameter combination with the best performance values.
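A common follow-up is to splice the selected parameters back into a model specification. A minimal sketch, assuming a parsnip nearest-neighbor specification and using a hand-written one-row tibble as a stand-in for the output of `select_best()`:

```r
library(parsnip)
library(tune)

# Hand-written stand-in for what select_best() returns: a one-row tibble
# holding the winning tuning parameter values.
best <- tibble::tibble(neighbors = 33, weight_func = "triweight")

knn_spec <- nearest_neighbor(neighbors = tune(), weight_func = tune()) %>%
  set_mode("regression") %>%
  set_engine("kknn")

# finalize_model() replaces the tune() placeholders with the selected values.
final_spec <- finalize_model(knn_spec, best)
final_spec
```

The same pattern works for a full workflow via `finalize_workflow()`.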

`select_by_one_std_err()` uses the "one-standard-error rule" (Breiman *et al.*, 1984) that selects the most simple model that is within one standard error of the numerically optimal results.
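The rule can be replicated by hand on a small summary table. A sketch with hypothetical resampling results, where a larger number of neighbors `K` means a simpler (smoother) model:

```r
library(dplyr)

# Hypothetical resampling summaries: one row per candidate sub-model.
results <- tibble(
  K       = c(5, 11, 21, 33, 45),
  mean    = c(0.080, 0.075, 0.073, 0.0728, 0.0755),  # RMSE estimates
  std_err = c(0.003, 0.003, 0.003, 0.003, 0.003)
)

best  <- slice_min(results, mean, n = 1)
bound <- best$mean + best$std_err   # one standard error above the best result

# Keep candidates within the bound, then take the simplest (largest K here).
selected <- results %>%
  filter(mean <= bound) %>%
  slice_max(K, n = 1)
selected$K  # 45
```

Note that `K = 45` is chosen over the numerically best `K = 33` because its RMSE is within one standard error of the optimum.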

`select_by_pct_loss()` selects the most simple model whose loss of performance is within some acceptable limit.

For percent loss, suppose the best model has an RMSE of 0.75 and a simpler model has an RMSE of 1. The percent loss would be `(1.00 - 0.75)/1.00 * 100`, or 25 percent. Note that loss will always be non-negative.
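The arithmetic above can be checked directly:

```r
best_rmse   <- 0.75  # RMSE of the numerically best model
simple_rmse <- 1.00  # RMSE of the simpler candidate

# Percent loss of the simpler model, as computed above.
pct_loss <- (simple_rmse - best_rmse) / simple_rmse * 100
pct_loss  # 25
```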

Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984).
*Classification and Regression Trees.* Monterey, CA: Wadsworth.

```r
#> # A tibble: 5 x 12
#>       K weight_func dist_power   lon   lat .iter .metric .estimator   mean     n
#>   <int> <chr>            <dbl> <int> <int> <dbl> <chr>   <chr>       <dbl> <int>
#> 1    33 triweight        0.511    10     3     0 rmse    standard   0.0728    10
#> 2     5 rank             0.411     2     7     0 rmse    standard   0.0740    10
#> 3    33 triweight        0.317     1     3     4 rmse    standard   0.0740    10
#> 4    19 triweight        0.890     5     1     9 rmse    standard   0.0745    10
#> 5    21 cos              0.626     1     4     0 rmse    standard   0.0746    10
#> # … with 2 more variables: std_err <dbl>, .config <chr>

select_best(ames_iter_search, metric = "rsq")
#> # A tibble: 1 x 6
#>       K weight_func dist_power   lon   lat .config
#>   <int> <chr>            <dbl> <int> <int> <chr>
#> 1    33 triweight        0.511    10     3 Recipe10_Model1

# To find the least complex model within one std error of the numerically
# optimal model, the number of nearest neighbors are sorted from the largest
# number of neighbors (the least complex class boundary) to the smallest
# (corresponding to the most complex model).
select_by_one_std_err(ames_grid_search, metric = "rmse", desc(K))
#> # A tibble: 1 x 13
#>       K weight_func dist_power   lon   lat .metric .estimator   mean     n
#>   <int> <chr>            <dbl> <int> <int> <chr>   <chr>       <dbl> <int>
#> 1    33 triweight        0.511    10     3 rmse    standard   0.0728    10
#> # … with 4 more variables: std_err <dbl>, .config <chr>, .best <dbl>,
#> #   .bound <dbl>

# Now find the least complex model that has no more than a 5% loss of RMSE:
select_by_pct_loss(ames_grid_search, metric = "rmse", limit = 5, desc(K))
#> # A tibble: 1 x 13
#>       K weight_func dist_power   lon   lat .metric .estimator   mean     n
#>   <int> <chr>            <dbl> <int> <int> <chr>   <chr>       <dbl> <int>
#> 1    33 triweight        0.511    10     3 rmse    standard   0.0728    10
#> # … with 4 more variables: std_err <dbl>, .config <chr>, .best <dbl>,
#> #   .loss <dbl>
```