I'm building an R package and using the targets package to run tests systematically. I expect errors and want the targets output to save the errors (and other objects created along the way) rather than prevent the pipeline from proceeding past the first error. I thought tryCatch would work but the pipeline stops at the first error and returns nothing. Example:

A simple pipeline with error and tryCatch:

get_data <- function() {
  iris %>%
    filter(Species == "versicolor")
}

# fit a working model
fit_model <- function(data) {
  mod <- lm(Sepal.Width ~ Sepal.Length, data)
  return(summary(mod))
}

# fit a model that throws an error using tryCatch to save the error message
fit_model_try <- function(data) {

  lm_mod <- tryCatch(
    {
      lm(Sepal.Width ~ ERRRRRRR, data)
    },
    error=function(cond) {
      message("Here's the original error message:")
      message(cond)
      return(cond)
    }
  )
  return(lm_mod)
}

The targets run file:

# Load packages required to define the pipeline:
library(targets)
# library(tarchetypes) # Load other packages as needed. # nolint

# Set target options:
tar_option_set(
  packages = c("dplyr"), # packages that your targets need to run
  format = "rds" # default storage format
  # Set other options as needed.
)

# tar_make_clustermq() configuration (okay to leave alone):
options(clustermq.scheduler = "multiprocess")

# tar_make_future() configuration (okay to leave alone):
# Install packages {future}, {future.callr}, and {future.batchtools} to allow use_targets() to configure tar_make_future() options.

# Run the R scripts in the R/ folder with your custom functions:
tar_source()
# source("other_functions.R") # Source other scripts as needed. # nolint

# Replace the target list below with your own:
list(
  tar_target(data, get_data()),
  tar_target(model, fit_model(data)),
  tar_target(model_try, fit_model_try(data))
)

In the example above, a successful model (fit_model) is followed by a failing model (fit_model_try). targets hits the error but does not return the output of tryCatch. The error stops the pipeline, so the output of the successful model is not returned either. I want to run through all the models in the pipeline, saving both the results of successful runs and the error messages of unsuccessful ones. TIA!

QAsena

1 Answer

Thanks for the useful reprex. Your example works if you use conditionMessage() to extract the text of the error object and pass that to message(), as opposed to passing the condition object itself:

fit_model_try <- function(data) {
  lm_mod <- tryCatch(
    lm(Sepal.Width ~ ERRRRRRR, data),
    error = function(cond) {
      message("Here's the original error message:")
      message(conditionMessage(cond)) # Use conditionMessage() here.
      return(cond)
    }
  )
  return(lm_mod)
}
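
A likely reason for the difference, sketched below without targets: when message() receives a condition object as its only argument, it signals that object instead of building a new message from text. An error condition passed this way is therefore signalled as an error again, and any enclosing error handler picks it up. The outer tryCatch() here is only a stand-in for whatever handlers the pipeline runner installs around each target; that targets wraps commands this way is an assumption, and the object names are made up for illustration.

```r
# Catch an error and keep the condition object, as fit_model_try() does.
cond <- tryCatch(stop("boom"), error = function(e) e)

# Passing the condition object to message() signals it again, so the
# enclosing error handler (a stand-in for the pipeline runner) fires.
outer_with_object <- tryCatch(
  {
    message(cond)
    "reached the end"
  },
  error = function(e) "outer handler caught the error"
)

# Passing only the text signals an ordinary message, which an error
# handler ignores, so the block runs to completion.
outer_with_text <- tryCatch(
  {
    message(conditionMessage(cond))
    "reached the end"
  },
  error = function(e) "outer handler caught the error"
)
```

At the top level of an interactive session there is no enclosing error handler, so message(cond) just prints the text, which would explain why the original code appears to work when run line by line.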

landau
  • Thanks for the quick reply and solve! I've noticed a few things that work if I run a script manually line-by-line but don't work when wrapped in a function and fed into a `targets` pipeline. Not sure where the difference comes in... – QAsena Mar 08 '23 at 16:57