📦
Building tidy tools

Day 2 Session 2: Managing Side Effects

Emma Rand and Ian Lyttle

Learning objectives

At the end of this section you will be able to:

  • write effective error messages
  • create validator functions
  • manage the global state (leave no footprints) using {withr}
  • test functions that use or cause side-effects
  • use snapshot testing

Errors

  • The most common type of side-effect is the error condition.
  • Sometimes, error messages can be cryptic:

    seq[10]
    Error in seq[10] : object of type 'closure' is not subsettable
  • You can write error messages that make things clearer for:
    • developers who call your functions
    • end users

Creating an error condition

An effective error condition has:

  • predicate (logical expression used to identify condition)
  • clear message for end user
  • class name for developer
  • more information for developer

Using cli::cli_abort()

{cli} package (Csárdi 2022)

# predicate
if (y > 3) {
  cli::cli_abort(
    # message for end user
    c(
      "{.var y} cannot be greater than 3.", 
      x = "{.var y} is {.val {y}}."
    ),
    # class name
    class = "ussie_error_threshold",  
    # more information
    y = y     
  )
}

Predicate

Prefer:

  • simpler predicates and more error-conditions

Over:

  • complex predicates and fewer error-conditions

Finding the simplest set of predicates is just as challenging as finding the “right” names for functions and arguments.

Message

Content:

  • how did we violate the predicate?

Formatting:

  • {cli} provides powerful formatting tools:

    • use curly-braces and a tag, e.g. {.var y}
    • use more curly-braces to interpolate, e.g. {.val {y}}
  • see cli inline-markup for more details.

Class name

  • A developer, calling your function, can use the class name to handle the error, if they want.

  • Convention:

    • "{package}_error_{description}"

Additional information

This “stuff” is also available to an error handler.

  • Provide the data that went into the predicate.

  • Provide name of variable, e.g. y = y.

  • Avoid reserved names: message, class, call, body, footer, trace, parent, use_cli_format.
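As an illustration of how a developer might use the class name and the additional information, here is a minimal sketch. It assumes a hypothetical wrapper, check_threshold(), around the cli_abort() call shown earlier; the wrapper is not part of {ussie}.

```r
# Hypothetical function, for illustration only.
check_threshold <- function(y) {
  if (y > 3) {
    cli::cli_abort(
      c(
        "{.var y} cannot be greater than 3.",
        x = "{.var y} is {.val {y}}."
      ),
      class = "ussie_error_threshold",
      y = y
    )
  }
  invisible(y)
}

# A caller can handle this specific error by naming its class in tryCatch().
result <- tryCatch(
  check_threshold(5),
  ussie_error_threshold = function(cnd) {
    # Extra named arguments to cli_abort() are stored as fields on the condition.
    cnd$y
  }
)
```

Here the handler returns the offending value; a real handler might instead retry with a default, or rethrow with more context.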

Validation

If there will be an error, surface it quickly.

Validating the arguments to a function is one way to do this.

For example:

  • is this a data frame?
  • does this data frame have these columns?

Questions like these can be generalized into functions:

  • throw an error if you need to.
  • otherwise, return data argument invisibly.
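A validator following this pattern could look something like the sketch below. The function name, class name, and the class_data field are assumptions made for illustration; the workshop's own validate_data_frame() may differ.

```r
# Hypothetical validator, for illustration only.
# `call` lets the error be reported as coming from the user-facing function.
validate_data_frame_sketch <- function(data, call = rlang::caller_env()) {
  # predicate
  if (!is.data.frame(data)) {
    cli::cli_abort(
      c(
        "{.arg data} must be a data frame.",
        x = "You supplied an object of class {.cls {class(data)}}."
      ),
      class = "ussie_error_data",
      class_data = class(data),
      call = call
    )
  }
  # no error: return the input invisibly, so the validator can be used in a pipe
  invisible(data)
}

ok <- validate_data_frame_sketch(mtcars)
bad <- tryCatch(
  validate_data_frame_sketch("not a data frame"),
  ussie_error_data = function(cnd) cnd
)
```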

Validate string-values

You may be familiar with match.arg():

  • uses “magic” to compare value (if any) to default
  • argument default is vector of strings
    • if value among defaults, return it - otherwise error
    • if no value, return first value in default

rlang::arg_match() does the same thing, but:

  • partial match triggers error
  • error messages conform to tidyverse standards
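The difference can be seen in a small comparison. The two functions below are hypothetical and exist only to contrast the two matchers.

```r
# Hypothetical functions, for illustration only.
pick_base <- function(country = c("italy", "england", "germany")) {
  match.arg(country)
}

pick_strict <- function(country = c("italy", "england", "germany")) {
  rlang::arg_match(country)
}

no_value     <- pick_base()        # no value supplied: first default is used
base_partial <- pick_base("ita")   # match.arg() accepts the partial match
strict_exact <- pick_strict("italy")

# arg_match() treats the partial match as an error
strict_partial <- tryCatch(
  pick_strict("ita"),
  error = function(cnd) "error"
)
```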

Snapshot tests

Designed for capturing side-effects.

Be careful about accepting changes (don’t just accept).

Can be temperamental; by default, snapshot tests are not run on CRAN.

We will go through, using examples.

Your turn "2.2.1"

Implement validator-functions:

  • is the country valid?

Only if needed, btt22::btt_reset_hard("2.2.1")

Get new files, btt22::btt_get("2.2.1"):

  • validate.R
  • test-validate.R

usethis::use_package("cli")

Use rlang::arg_match()

👉 devtools::load_all(), then:

  • 👉 uss_get_matches("italy") 🎉
  • 👉 uss_get_matches("tatooine") 🎉
  • 👉 uss_get_matches("ita") 🤔

get-matches.R:

test-get-matches.R:

  • uncomment test for 2.2.1, then repeat 👉

Build error constructor

  • usethis::use_package("cli")

  • review validate_data_frame() (call argument)

  • build constructor for validate_cols():

    • something like {.field {cols}} may be useful
    • activate 2.2.1 tests in test-validate.R, as you go
  • hold off on snapshot test (we’ll do together)

  • add validator-functions to uss_make_matches():
    • matches.R
    • cols_engsoc() may be useful

Managing global state

Leave no footprints.

Leave the global state how you found it, avoid surprises later:

  • packages loaded
  • also: options, environment variables

Loading {devtools} in .Rprofile changes the global state.

When we hit the “Knit” button, or the Quarto “Render” button:

  • it runs in a new R session
  • does not execute user’s .Rprofile

Using “self-removing footprints”

The {withr} package (Hester et al. 2022) gives us tools to:

  • modify global state
  • specify when to reverse the modification

Useful family of functions:

  • withr::local_*()
  • changes the state back when a scope is exited
  • if called within a function, normally when the function exits
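As a minimal sketch of the idea, the hypothetical function below changes the digits option only for as long as it is running.

```r
# Hypothetical function, for illustration only.
format_short <- function(x) {
  # the option is restored automatically when format_short() exits
  withr::local_options(list(digits = 3))
  format(x)
}

before <- getOption("digits")
out    <- format_short(pi)
after  <- getOption("digits")
```

After the call, getOption("digits") is the same as it was before the call; the caller's session is left as it was found.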

When could this be useful?

CRAN is (rightly) particular about “leave no footprints”.

You may need:

Useful in testthat code and in R code.

Interactive example

write_read_vanish <- function(x) {
  
  tempfile <- withr::local_tempfile(fileext = ".rds")
  
  saveRDS(x, file = tempfile)
  xnew <- readRDS(tempfile)
  
  print(
    glue::glue("'{tempfile}' contained {xnew}.")
  )
  
  invisible(xnew)
}
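To check that the temporary file really does vanish, a variation on this function can return the path instead of the data. This is a sketch for illustration; path_after_exit() is a hypothetical name.

```r
# Hypothetical function, for illustration only.
path_after_exit <- function() {
  path <- withr::local_tempfile(fileext = ".rds")
  saveRDS(1, file = path)
  # the file exists while the function is running
  stopifnot(file.exists(path))
  path
}

leftover <- path_after_exit()
# withr deleted the file when path_after_exit() exited
still_there <- file.exists(leftover)
```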

Another interactive example

not_runif <- function(n, min = 0, max = 1) {

  withr::local_seed(314)
  
  runif(n, min = min, max = max)
}
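Two things can be checked here: the function returns the same values on every call, and the caller's random-number stream is not disturbed. The sketch below repeats the definition so that it can be run on its own.

```r
not_runif <- function(n, min = 0, max = 1) {
  withr::local_seed(314)
  runif(n, min = min, max = max)
}

# same seed on every call, so same values on every call
same_each_time <- identical(not_runif(3), not_runif(3))

# the global seed is restored when not_runif() exits
set.seed(1)
expected <- runif(1)

set.seed(1)
invisible(not_runif(3))
observed <- runif(1)

stream_untouched <- identical(expected, observed)
```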

Example

  • usethis::use_test("validate")

  • change:

    expect_identical(error_condition$cols_req, "foo")
  • to:

    expect_identical(error_condition$cols_re, "foo")
  • rerun tests 🤔

  • By default, $ accepts partial matching.

  • We want stricter testing, so we need to set some options.
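The behavior is easy to reproduce with a plain list standing in for the error condition (an assumption made to keep the example small):

```r
# A plain list standing in for an error condition, for illustration only.
cnd <- list(cols_req = "foo")

partial <- cnd$cols_re       # `$` silently matches the prefix to cols_req
exact   <- cnd[["cols_re"]]  # `[[` matches exactly, so nothing is found
```

This is why the test with the misspelled name still passes: `$` quietly finds cols_req.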

Our turn "2.2.2"

Turn off $ partial matching when testing.


Only if needed, btt22::btt_reset_hard("2.2.2")

Get new files, btt22::btt_get("2.2.2"):

  • utils-testing.R

usethis::use_package("withr")

local_warn_partial_match()

In essence, this temporarily sets the following options to TRUE:

  • options("warnPartialMatchDollar")
  • options("warnPartialMatchArgs")
  • options("warnPartialMatchAttr")

With a couple of wrinkles:

  • we need to treat NULL as FALSE.
  • we set things back when the calling scope exits.
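A simplified sketch of such a helper, built on withr::local_options(), might look like this. It is not the code in utils-testing.R; the name local_warn_partial_match_sketch() and the probe() function are assumptions made for illustration.

```r
# Hypothetical helper, for illustration only.
# `.local_envir` is the scope whose exit triggers the reset; by default,
# the scope that called the helper.
local_warn_partial_match_sketch <- function(.local_envir = parent.frame()) {
  withr::local_options(
    list(
      warnPartialMatchDollar = TRUE,
      warnPartialMatchArgs = TRUE,
      warnPartialMatchAttr = TRUE
    ),
    .local_envir = .local_envir
  )
}

probe <- function() {
  local_warn_partial_match_sketch()
  x <- list(cols_req = "foo")
  # with the option set, partial matching with `$` raises a warning
  tryCatch(x$cols_re, warning = function(w) "warned")
}

before <- getOption("warnPartialMatchDollar")
inside <- probe()
after  <- getOption("warnPartialMatchDollar")
```

Passing .local_envir through is what ties the reset to the caller's scope (for example, a test block) rather than to the helper itself.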

Our turn "2.2.2", continued

  • usethis::use_test("validate")

  • add local_warn_partial_match() to top of blocks.

  • getOption("warnPartialMatchDollar")

  • change to:

    expect_identical(error_condition$cols_re, "foo")
  • devtools::test() 🎉

  • getOption("warnPartialMatchDollar")

  • change back to $cols_req

Your turn "2.2.3"

Implement testing with side effects


Only if needed, btt22::btt_reset_hard("2.2.3")

No new files.

  • add local_warn_partial_match() to all test-blocks.

  • test-matches.R:

    • add expect_error() calls for data-frame and columns (use class argument).
    • at end, add:
    expect_snapshot(dplyr::glimpse(italy))
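For the expect_error() calls, the class argument lets a test target one specific error rather than any error. The sketch below uses a hypothetical validator and class name; only the testing pattern is the point.

```r
library(testthat)

# Hypothetical validator, for illustration only.
validate_positive <- function(x) {
  if (x <= 0) {
    rlang::abort(
      "`x` must be positive.",
      class = "demo_error_positive",
      x = x
    )
  }
  invisible(x)
}

# passes only if an error of this class is thrown;
# the captured condition is returned, so its fields can be tested too
error_condition <- expect_error(
  validate_positive(-1),
  class = "demo_error_positive"
)
expect_identical(error_condition$x, -1)
```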

Summary

The most-common side-effect is an error. Good design:

  • simple (as possible) predicate
  • clear message
  • add a class using naming convention, and additional information

Use snapshot tests to capture side-effects.

  • be very careful when accepting changes to snapshots.

Use withr::local_*() functions to “leave no footprints”.

Additional material

Brian Beckman on monads (Ian watched at least ten times, learned something each time):

  • tidyverse is a monoidal system

  • maybe monads aren’t the point; the functional programming learned along the way is

References

Csárdi, Gábor. 2022. “Cli: Helpers for Developing Command Line Interfaces.” https://CRAN.R-project.org/package=cli.
Hester, Jim, Lionel Henry, Kirill Müller, Kevin Ushey, Hadley Wickham, and Winston Chang. 2022. “Withr: Run Code ’with’ Temporarily Modified Global State.” https://CRAN.R-project.org/package=withr.