Run a suite of statistical models
run_mod_lm() runs a suite of statistical models, returning a final model fit.
Arguments
- tbl
Input data frame containing the data to model.
- preproc
A list of pre-processing steps.
- folds
An integer. The number of cross-validation folds.
- metrics
A tibble containing the performance metrics to evaluate.
- rank_metric
A metric from metrics to rank results by.
- cross
A logical: should all combinations of the pre-processors and models be used to create the workflows? If FALSE, the length of preproc and models should be equal.
- seed
A single integer.
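The difference between the two settings of cross can be sketched in base R. This is only an illustration of the pairing logic described above, not the function's internals; the pre-processor and model names below are hypothetical placeholders.

```r
# Hypothetical labels, used only to illustrate how `cross` pairs inputs.
preproc_names <- c("basic", "normalized", "imputed")
model_names <- c("lm", "glmnet")

# cross = TRUE: every pre-processor is combined with every model.
crossed <- expand.grid(
  preproc = preproc_names,
  model = model_names,
  stringsAsFactors = FALSE
)
nrow(crossed)  # 3 pre-processors x 2 models = 6 workflows

# cross = FALSE: inputs are paired one-to-one, so lengths must be equal.
paired_preproc <- c("basic", "normalized")
stopifnot(length(paired_preproc) == length(model_names))
paired <- data.frame(preproc = paired_preproc, model = model_names)
nrow(paired)  # 2 workflows
```

With cross = TRUE the number of workflows grows multiplicatively, so the total fitting cost is roughly the number of combinations times the number of folds.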
Examples
if (FALSE) {
  # Build the modelling table and drop the year column if it is present
  tbl <-
    build_tbl(
      "tb",
      estimated = "who_estimates.e_inc_num",
      notified = "who_notifications.c_newinc",
      year = 2019,
      vars = extract_vars("tb")
    ) |>
    dplyr::mutate(is_hbc = forcats::as_factor(is_hbc)) |>
    dplyr::select(-dplyr::any_of(c("year")))
  preproc_list <- get_mod_preproc(
    .tbl = tbl,
    .neighbors = 5,
    .threshold = 0.25,
    .impute_with = c("gdp", "e_inc_num", "pop_total")
  )
  run_mod_lm(
    tbl,
    preproc = preproc_list,
    folds = 10,
    metrics = yardstick::metric_set(yardstick::rmse, yardstick::rsq),
    rank_metric = "rmse"
  )
}