Returns the list of pre-processing steps that run_mod_lm() applies.

Usage

get_mod_preproc(.tbl, .neighbors, .threshold, .impute_with)

Arguments

.tbl

Input data frame containing the data to model.

.neighbors

The number of neighbors to use for imputation (see .impute_with).

.threshold

The threshold for the proportion of missing values in a column. Columns whose proportion of missing values exceeds this threshold are removed.
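The threshold rule can be sketched in base R as follows. This is only an illustration of the rule as described above, not the code that get_mod_preproc() runs internally; the data frame `toy` and its values are hypothetical.

```r
# Toy data frame (hypothetical values) with varying amounts of missingness.
toy <- data.frame(
  a = c(1, 2, 3, 4),     # 0% missing
  b = c(NA, 2, NA, 4),   # 50% missing
  c = c(NA, NA, NA, 4),  # 75% missing
  d = c(1, NA, 3, 4)     # 25% missing
)

threshold <- 0.25

# Proportion of missing values in each column.
prop_missing <- colMeans(is.na(toy))

# Keep only the columns whose proportion does not exceed the threshold.
kept <- toy[, prop_missing <= threshold, drop = FALSE]
names(kept)
```

Column `d` sits exactly at the threshold and is kept, because only columns that exceed the threshold are removed.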

.impute_with

A call to imp_vars() specifying which variables are used to impute missing values. The call can contain specific variable names separated by commas, or selector functions (see selections()). If a column appears both among the variables to be imputed and among the imputation predictors, it is removed from the predictors and is not used to impute itself.
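The wording of this argument matches the `impute_with` argument of recipes::step_impute_knn(). Assuming that step is what .neighbors and .impute_with are forwarded to (an assumption, not confirmed by this page), a minimal standalone sketch with the recipes package might look like this. The data frame `toy` and its values are hypothetical.

```r
library(recipes)

# Toy data (hypothetical values); `gdp` has missing entries to impute.
toy <- data.frame(
  gdp       = c(1.2, NA, 3.4, 2.8, NA, 4.1, 3.0, 2.2),
  e_inc_num = c(100, 150, 300, 250, 120, 400, 280, 190),
  pop_total = c(10, 14, 33, 27, 12, 41, 30, 21)
)

# Impute `gdp` from its 3 nearest neighbors. `gdp` is also listed in
# imp_vars(); recipes drops it from the predictors so that it is not
# used to impute itself.
rec <- recipe(~ ., data = toy) |>
  step_impute_knn(
    gdp,
    neighbors = 3,
    impute_with = imp_vars(gdp, e_inc_num, pop_total)
  )

# Estimate the step on the toy data and return the processed data.
imputed <- bake(prep(rec), new_data = NULL)
```

Here `imputed` has the same rows and columns as `toy`, with the missing `gdp` entries filled in from neighboring rows.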

Value

A list of items of class "recipe".

Examples

if (FALSE) {
tbl <-
  build_tbl(
    "tb",
    estimated = "who_estimates.e_inc_num",
    notified = "who_notifications.c_newinc",
    year = 2019,
    vars = extract_vars("tb")
  ) |>
  dplyr::mutate(is_hbc = forcats::as_factor(is_hbc)) |>
  dplyr::select(-any_of(c("year")))

get_mod_preproc(
  .tbl = tbl,
  .neighbors = 5,
  .threshold = 0.25,
  .impute_with = c("gdp", "e_inc_num", "pop_total")
)
}