library(tidyverse)
library(tidymodels)
library(openintro)
Ex 3.2: Fit, predict, evaluate
teach_ds :: Teaching modern modeling with tidymodels
Introduction
Some of the code below has been pre-populated. In these cases, the code chunk option #| eval: false is set. Remove this option before running the relevant code chunk to avoid errors when rendering the document.
Model
The goal is to fit a model to predict the interest rate (interest_rate) based on the debt-to-income ratio (debt_to_income), type of application (application_type), and whether there are any bankruptcies listed in the public record for the individual (bankrupt). The model should allow the effect of the debt-to-income ratio to differ based on application type.
\[ \begin{aligned}\widehat{interest\_rate} = b_0 &+ b_1 \times debt\_to\_income \\ &+ b_2 \times application\_type \\ &+ b_3 \times bankrupt \\ &+ b_4 \times debt\_to\_income:application\_type\end{aligned} \]
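To make the interaction term concrete: when application_type is coded as a 0/1 dummy, b_1 is the slope of debt_to_income for the reference level and b_1 + b_4 is the slope for the other level. The sketch below illustrates this idea on the built-in mtcars data rather than the loans data. Using wt in place of debt_to_income and am in place of application_type is an assumption made purely for illustration.

```r
# Illustration only: mtcars stands in for the loans data.
# wt plays the role of debt_to_income, am (0/1) the role of application_type.
fit_int <- lm(mpg ~ wt + am + wt:am, data = mtcars)
b <- coef(fit_int)

slope_am0 <- unname(b["wt"])               # slope of wt when am == 0
slope_am1 <- unname(b["wt"] + b["wt:am"])  # slope of wt when am == 1

# With only these terms in the model, fitting each group separately
# recovers the same two slopes.
sep_am0 <- unname(coef(lm(mpg ~ wt, data = subset(mtcars, am == 0)))["wt"])
sep_am1 <- unname(coef(lm(mpg ~ wt, data = subset(mtcars, am == 1)))["wt"])

c(interaction_model = slope_am0, separate_fit = sep_am0)
c(interaction_model = slope_am1, separate_fit = sep_am1)
```

In the loans model the extra bankrupt term is shared across application types, so the slopes would not match separate per-group fits exactly, but the reading of b_1 and b_1 + b_4 is the same.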
Recipe
This exercise builds on Exercise 1. To get you started, the recipe created in Exercise 1 is below.
loans_rec <- recipe(interest_rate ~ debt_to_income + application_type +
                      public_record_bankrupt, data = loans_full_schema) |>
  step_impute_mean(debt_to_income) |>
  step_mutate(bankrupt = as_factor(if_else(public_record_bankrupt == 0,
                                           "no", "yes"))) |>
  step_rm(public_record_bankrupt) |>
  step_dummy(all_nominal_predictors()) |>
  step_interact(terms = ~ starts_with("application_type"):debt_to_income)
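One way to see what a recipe like this produces is to prep() it and bake() the training data. The sketch below does so on a small made-up tibble rather than the loans data; the names toy, y, x, and group are hypothetical and chosen only for illustration.

```r
library(tidymodels)

# Hypothetical toy data standing in for the loans data.
toy <- tibble(
  y     = c(1.0, 2.5, 3.1, 4.8, 5.2, 6.9),
  x     = c(0.5, NA, 1.5, 2.0, 2.5, 3.0),   # one missing value to impute
  group = factor(c("a", "b", "a", "b", "a", "b"))
)

toy_rec <- recipe(y ~ x + group, data = toy) |>
  step_impute_mean(x) |>                            # fill NA with the mean of x
  step_dummy(all_nominal_predictors()) |>           # group -> dummy column(s)
  step_interact(terms = ~ starts_with("group"):x)   # dummy-by-x interaction

# prep() estimates the steps; bake(new_data = NULL) returns the processed
# training data.
toy_baked <- toy_rec |>
  prep() |>
  bake(new_data = NULL)

names(toy_baked)
toy_baked
```

Dummy variables are created before the interaction step, which is why the interaction is specified with starts_with() rather than the original factor name.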
Build workflow + fit the model
Complete the code below to build a workflow and fit the model to loans_full_schema.
## specify the model
loans_spec <- linear_reg() |>
  set_engine("lm")
## build workflow
loans_workflow <- workflow() |>
  add_model(_____) |>
  add_recipe(______)
## fit the model
loans_fit <- ______ |>
  fit(data = _____)
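For reference, the same specify, bundle, and fit pattern can be sketched on a different dataset. The example below uses the built-in mtcars data with a deliberately simple recipe; the names car_spec, car_rec, car_workflow, and car_fit are hypothetical and are not part of this exercise.

```r
library(tidymodels)

# Model specification: ordinary least squares via the "lm" engine.
car_spec <- linear_reg() |>
  set_engine("lm")

# A minimal recipe with no preprocessing steps (illustration only).
car_rec <- recipe(mpg ~ wt + hp, data = mtcars)

# A workflow bundles the model specification and the recipe together.
car_workflow <- workflow() |>
  add_model(car_spec) |>
  add_recipe(car_rec)

# Fitting the workflow preps the recipe and fits the model in one call.
car_fit <- car_workflow |>
  fit(data = mtcars)

# Coefficient table for the fitted model.
car_coefs <- tidy(car_fit)
car_coefs
```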
Predict + evaluate model
Complete the code below to calculate predictions and evaluate the model using \(R^2\) and \(RMSE\).
loans_pred <- predict(______, ______) |>
  bind_cols(loans_full_schema |> select(interest_rate))
rsq(_____, truth = interest_rate, estimate = .pred)
rmse(_____)
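As a point of comparison, the sketch below runs the predict and evaluate steps on the built-in mtcars data and checks the yardstick results against hand calculations. The object names (car_fit, car_pred, manual_rsq, manual_rmse) are hypothetical, and the model is not the loans model from this exercise.

```r
library(tidymodels)

# Fit a small workflow on mtcars (illustration only).
car_fit <- workflow() |>
  add_model(linear_reg() |> set_engine("lm")) |>
  add_recipe(recipe(mpg ~ wt + hp, data = mtcars)) |>
  fit(data = mtcars)

# predict() returns a tibble with a .pred column; bind the observed outcome.
car_pred <- predict(car_fit, mtcars) |>
  bind_cols(mtcars |> select(mpg))

car_rsq  <- rsq(car_pred, truth = mpg, estimate = .pred)
car_rmse <- rmse(car_pred, truth = mpg, estimate = .pred)

# The same quantities by hand:
# rsq() is the squared correlation between truth and estimate,
# rmse() is the square root of the mean squared error.
manual_rsq  <- cor(car_pred$mpg, car_pred$.pred)^2
manual_rmse <- sqrt(mean((car_pred$mpg - car_pred$.pred)^2))

car_rsq
car_rmse
```

Because these metrics are computed on the same data used to fit the model, they describe in-sample fit rather than performance on new data.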