Ex 3.2: Fit, predict, evaluate

teach_ds :: Teaching modern modeling with tidymodels

library(tidyverse)
library(tidymodels)
library(openintro)

Introduction

Some of the code below has been pre-populated. In these cases, there is a code chunk option set as #| eval: false. Make sure to remove this option prior to running the relevant code chunk to avoid any errors when rendering the document.

Model

The goal is to fit a model to predict the interest rate (interest_rate) based on the debt to income ratio (debt_to_income), type of application (application_type), and whether there are any bankruptcies listed in the public record for the individual (bankrupt). The model should allow the effect of debt to income ratio to differ based on application type.

\[ \begin{align}\widehat{interest\_rate} = b_0 &+ b_1 \times debt\_to\_income \\ &+ b_2 \times application\_type \\ &+ b_3 \times bankrupt \\ &+ b_4 \times debt\_to\_income:application\_type\end{align} \]

Recipe

This exercise builds on Exercise 1. To get you started, the recipe created in Exercise 1 is below.

loans_rec <- recipe(interest_rate ~ debt_to_income + application_type + 
                      public_record_bankrupt, data = loans_full_schema) |>
  step_impute_mean(debt_to_income) |>
  step_mutate(bankrupt = as_factor(if_else(public_record_bankrupt == 0, 
                                           "no", "yes"))) |>
  step_rm(public_record_bankrupt) |>
  step_dummy(all_nominal_predictors()) |>
  step_interact(terms = ~ starts_with("application_type"):debt_to_income)

Build workflow + fit the model

Complete the code below to build a workflow and fit the model to loans_full_schema.

## specify the model 
loans_spec <- linear_reg() |>
  set_engine("lm")
## build workflow 
loans_workflow <- workflow() |>
  add_model(_____) |>
  add_recipe(______)

## fit the model 
loans_fit <- ______ |>
  fit(data = _____)

Predict + evaluate model

Complete the code below to calculate predictions and evaluate the model using \(R^2\) and \(RMSE\).

loans_pred <- predict(______, ______) |>
  bind_cols(loans_full_schema |> select(interest_rate))
rsq(_____, truth = interest_rate, estimate = .pred)
rmse(_____)