AE 11: Cross validation


October 23, 2023


Model statistics function

You will use this function to calculate \(Adj. R^2\), AIC, and BIC in the cross validation.

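# Extract the fitted parsnip model from a workflow and return its
# adjusted R-squared, AIC, and BIC as a one-row tibble
calc_model_stats <- function(x) {
  glance(extract_fit_parsnip(x)) |>
    select(adj.r.squared, AIC, BIC)
}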
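
To make the three statistics concrete, here is a minimal base R sketch that recomputes them by hand for an ordinary lm() fit and compares them with R's built-in values. It uses the built-in mtcars data as a stand-in (an assumption, not the tips data), and the names fit, adj_r2_manual, aic_manual, and bic_manual are illustrative only.

```r
# Stand-in model on a built-in dataset (assumption: not the tips data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

n  <- nobs(fit)                 # number of observations
p  <- length(coef(fit)) - 1     # number of predictors, excluding the intercept
r2 <- summary(fit)$r.squared

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# AIC and BIC are both built from the log-likelihood;
# k counts estimated parameters (coefficients plus the error variance)
ll <- logLik(fit)
k  <- attr(ll, "df")
aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Compare with the built-in values
all.equal(adj_r2_manual, summary(fit)$adj.r.squared)
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))
```

Higher adjusted R-squared is better, while lower AIC and BIC are better. BIC penalizes each extra parameter more heavily than AIC whenever log(n) > 2, that is, for n of 8 or more.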



Load data and relevel factors

tips <- read_csv("data/tip-data.csv")

tips <- tips |>
  mutate(Age = factor(Age, levels = c("Yadult", "Middle", "SenCit")),
         Meal = factor(Meal, levels = c("Lunch", "Dinner", "Late Night")))

Split data into training and testing

Split your data into testing and training sets.

tips_split <- initial_split(tips)
tips_train <- training(tips_split)
tips_test <- testing(tips_split)

Specify model

Specify a linear regression model. Call it tips_spec.

tips_spec <- linear_reg() |>
  set_engine("lm")

Linear Regression Model Specification (regression)

Computational engine: lm 

Model 1

Create recipe

Create a recipe to use Party, Age, and Meal to predict Tip. Call it tips_rec1.

tips_rec1 <- recipe(Tip ~ Party + Age + Meal,
                    data = tips_train) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 3
── Operations 
• Dummy variables from: all_nominal_predictors()
• Zero variance filter on: all_predictors()

Preview recipe

prep(tips_rec1) |>
  bake(tips_train) |>
  glimpse()
Rows: 126
Columns: 6
$ Party           <dbl> 3, 2, 2, 4, 2, 7, 4, 3, 2, 4, 1, 2, 2, 1, 2, 1, 2, 3, …
$ Tip             <dbl> 4.00, 4.92, 5.09, 8.84, 3.09, 15.00, 8.00, 4.00, 5.00,…
$ Age_Middle      <dbl> 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, …
$ Age_SenCit      <dbl> 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, …
$ Meal_Dinner     <dbl> 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, …
$ Meal_Late.Night <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, …

Create workflow

Create the workflow that brings together the model specification and recipe. Call it tips_wflow1.

tips_wflow1 <- workflow() |>
  add_model(tips_spec) |>
  add_recipe(tips_rec1)

══ Workflow ════════════════════════════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ────────────────────────────────────────────────────────────────
2 Recipe Steps

• step_dummy()
• step_zv()

── Model ───────────────────────────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm 

Cross validation

Create folds

Create 5 folds.

# make 5 folds
folds <- vfold_cv(tips_train, v = 5)

Conduct cross validation

Conduct cross validation on the 5 folds.

# Fit model and performance statistics for each iteration
tips_fit_rs1 <- tips_wflow1 |>
  fit_resamples(resamples = folds, 
                control = control_resamples(extract = calc_model_stats))

Take a look at tips_fit_rs1.

# Resampling results
# 5-fold cross-validation 
# A tibble: 5 × 5
  splits           id    .metrics         .notes           .extracts       
  <list>           <chr> <list>           <list>           <list>          
1 <split [100/26]> Fold1 <tibble [2 × 4]> <tibble [0 × 3]> <tibble [1 × 2]>
2 <split [101/25]> Fold2 <tibble [2 × 4]> <tibble [0 × 3]> <tibble [1 × 2]>
3 <split [101/25]> Fold3 <tibble [2 × 4]> <tibble [0 × 3]> <tibble [1 × 2]>
4 <split [101/25]> Fold4 <tibble [2 × 4]> <tibble [0 × 3]> <tibble [1 × 2]>
5 <split [101/25]> Fold5 <tibble [2 × 4]> <tibble [0 × 3]> <tibble [1 × 2]>
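
The analysis/assessment sizes in the splits column follow directly from dividing the 126 training rows into 5 folds. The base R sketch below reproduces those sizes. It is a simplified illustration, not the vfold_cv() implementation, and the seed and the names fold_id, assessment_sizes, and analysis_sizes are illustrative assumptions.

```r
set.seed(210)   # arbitrary seed, only so the random assignment is reproducible

n <- 126        # rows in the training set, as shown in the output above
v <- 5          # number of folds

# Assign each row to one of v folds, as evenly as possible, in random order
fold_id <- sample(rep(seq_len(v), length.out = n))

# Each fold serves once as the assessment set; the other rows are the analysis set
assessment_sizes <- as.integer(table(fold_id))
analysis_sizes   <- n - assessment_sizes

assessment_sizes  # 26 25 25 25 25
analysis_sizes    # 100 101 101 101 101
```

Because 126 is not divisible by 5, one fold gets an extra row, which is why the first split is 100/26 and the rest are 101/25.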

Summarize assessment CV metrics

Summarize assessment metrics from your CV iterations. These statistics are calculated using the assessment set.

collect_metrics(tips_fit_rs1, summarize = TRUE)
# A tibble: 2 × 6
  .metric .estimator  mean     n std_err .config             
  <chr>   <chr>      <dbl> <int>   <dbl> <chr>               
1 rmse    standard   2.09      5  0.265  Preprocessor1_Model1
2 rsq     standard   0.673     5  0.0519 Preprocessor1_Model1

Set summarize = FALSE to see the individual \(R^2\) and RMSE for each iteration.
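
To show how the per-iteration values relate to the summarized row, here is a minimal base R sketch of 5-fold cross validation done by hand. It uses the built-in mtcars data as a stand-in (an assumption, not the tips data) and assumes the reported std_err is the standard deviation of the per-fold values divided by the square root of the number of folds. The names rmse_by_fold and cv_summary are illustrative.

```r
set.seed(11)    # arbitrary seed for a reproducible fold assignment

v <- 5
fold_id <- sample(rep(seq_len(v), length.out = nrow(mtcars)))

# For each fold: fit on the analysis set, compute RMSE on the assessment set
rmse_by_fold <- vapply(seq_len(v), function(k) {
  analysis   <- mtcars[fold_id != k, ]
  assessment <- mtcars[fold_id == k, ]
  fit  <- lm(mpg ~ wt + hp, data = analysis)
  pred <- predict(fit, newdata = assessment)
  sqrt(mean((assessment$mpg - pred)^2))
}, numeric(1))

# Per-iteration values (what summarize = FALSE corresponds to)
rmse_by_fold

# Summarized values (what summarize = TRUE corresponds to)
cv_summary <- c(mean    = mean(rmse_by_fold),
                n       = v,
                std_err = sd(rmse_by_fold) / sqrt(v))
cv_summary
```

Looking at the individual values is useful for spotting a single fold that drives the mean up or down.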

Summarize model fit CV metrics

Summarize model fit statistics from your CV iterations. These statistics are calculated using the analysis set.

map_df(tips_fit_rs1$.extracts, ~ .x[[1]][[1]]) |>
  summarise(mean_adj_rsq = mean(adj.r.squared), 
            mean_aic = mean(AIC), 
            mean_bic = mean(BIC))
# A tibble: 1 × 3
  mean_adj_rsq mean_aic mean_bic
         <dbl>    <dbl>    <dbl>
1        0.670     434.     453.

Run the first line of code map_df(tips_fit_rs1$.extracts, ~ .x[[1]][[1]]) to see the individual \(Adj. R^2\), AIC, and BIC for each iteration.

Another model - Model 2

Create the recipe for a new model that includes Party, Age, Meal, and Alcohol (an indicator for whether the party ordered alcohol with the meal). Conduct 5-fold cross validation and summarize the metrics.

Model 2: Recipe

# add code here

Model 2: Model building workflow

# add code here

Model 2: Conduct CV


We will use the same folds as the ones used for Model 1. Why should we use the same folds to evaluate and compare both models?

# add code here

Model 2: Summarize assessment CV metrics

# add code here

Model 2: Summarize model fit CV metrics

# add code here

Compare and choose a model

  • Describe how the two models compare to each other based on cross validation metrics.

  • Which model do you choose for the final model? Why?

Fit the selected model

Fit the selected model using the entire training set.

# add code here

See notes for example code.

Evaluate the performance of the selected model on the testing data

Calculate predicted values

# add code here

Calculate \(RMSE\)

# add code here

See notes for example code.

  • How does the model performance on the testing data compare to its performance on the training data?

  • Is this what you expected? Why or why not?



