I'm trying to parse JSON contained in a dataframe column, some of which is corrupted. As a first step I want to identify the corrupted rows, and use that to subset the dataframe.

I'm using the trick from this post using c() to populate the list (even though I know it's slow):

myRows <- c()
for (i in 1:nrow(myDataframe)) {
  tryCatch({myDataframe$myJSONstring[i] %>%
    fromJSON() %>%
    length()},
    error = function(e) {print(i); myRows <- c(myRows, i)})
}

However, this doesn't work. print(i) works fine, but after running the loop myRows is still empty. Is there some restriction on what code can run in the error handler of a tryCatch?

Tom Wagstaff
    This is a scoping issue (you do the assignment in the local scope of the function). You probably could do `myRows <<- c(myRows, i)` but I'm not sure without testing it. Scoping within `tryCatch` is a bit complicated. Preferably you should redesign the whole approach. – Roland Nov 05 '18 at 13:34

2 Answers

Though there is already an accepted answer, here is another way that does not require creating an environment.
If the result of tryCatch is assigned to a variable, it can be tested later. The trick is to return the error object itself from the error handler.
The example is based on the accepted answer and raises the same errors.

vec <- rep(1:0, each = 5)

ans <- lapply(seq_along(vec), function(i) {
  tryCatch({ if(vec[i]) stop("error message") else "success" },
           error = function(e) e)
})

bad <- sapply(ans, inherits, "error")
bad
#[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
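
Applied to the question's data, the same idea might look like the sketch below. It assumes that `fromJSON()` comes from the jsonlite package (the question does not say which package is used) and it builds a small made-up `myDataframe` for illustration. Here the handler returns `TRUE` directly, so `vapply()` yields a logical vector that can be used for subsetting.

```r
library(jsonlite)  # assumption: the question's fromJSON() is jsonlite's

# Made-up data for illustration: rows 2 and 4 hold broken JSON
myDataframe <- data.frame(
  myJSONstring = c('{"a": 1}', '{"a": 1', '[1, 2, 3]', '{"a": }'),
  stringsAsFactors = FALSE
)

# TRUE where parsing fails, FALSE where it succeeds
bad <- vapply(myDataframe$myJSONstring, function(s) {
  tryCatch({
    fromJSON(s)
    FALSE
  }, error = function(e) TRUE)
}, logical(1), USE.NAMES = FALSE)

myRows   <- which(bad)                         # indices of the corrupted rows
goodRows <- myDataframe[!bad, , drop = FALSE]  # rows that parsed cleanly
```

Because nothing is assigned from inside the handler, the scoping problem from the question does not arise.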
Rui Barradas

Here is a small example of how you could solve your issue. (Many people object to `<<-`, because assigning to global variables is generally considered bad practice.)

env = environment()
env$ans <- rep("works",10)

vec <- rep(1:0,each = 5)

for (i in seq_along(vec)) {
    tryCatch({ if(vec[i]) stop("error message") else {"success"} },
            error = function(e) {print(i); env$ans[i] <- "error"})
}

#> env$ans
# [1] "error" "error" "error" "error" "error" "works" "works" "works" "works" "works"

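This works because environments are modified in place: the handler changes the same env object that the loop sees rather than a local copy, so the assignment to env$ans made inside tryCatch persists afterwards.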
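
For comparison, here is a minimal sketch of why plain `<-` in the handler has no visible effect, while `<<-`, as suggested in the comment on the question, does. The names `local_rows` and `super_rows` are made up for illustration. The handler is an ordinary function, so `<-` creates a variable in the handler's own frame, which is discarded as soon as the handler returns.

```r
vec <- rep(1:0, each = 5)

# Plain `<-`: each call to the handler builds a local copy and throws it away
local_rows <- c()
for (i in seq_along(vec)) {
  tryCatch({ if (vec[i]) stop("error message") else "success" },
           error = function(e) { local_rows <- c(local_rows, i) })
}

# `<<-`: assigns to the existing variable found in the enclosing environment
super_rows <- c()
for (i in seq_along(vec)) {
  tryCatch({ if (vec[i]) stop("error message") else "success" },
           error = function(e) { super_rows <<- c(super_rows, i) })
}

is.null(local_rows)  # TRUE, the outer variable was never touched
super_rows           # 1 2 3 4 5
```

Returning a value from tryCatch and collecting the results, as in the other answer, avoids this kind of side effect altogether.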

Andre Elrico