I'm using the rcrossref package to collect abstracts for multiple DOIs stored in a column of a data frame, and I want the outputs (the abstracts) written to another column of the same data frame. I'm doing this with a for loop, but:

  1. the loop seems to get hung up on the error message that appears if there is no abstract available.
  2. A second kind of error also occurs when there is no DOI value in the input column.

How might I go about skipping these errors and moving on to the next row when they are encountered?

Here is my R code:

library(bib2df)
library(rcrossref)

url <- "https://gist.githubusercontent.com/zackbatist/46c14011fd5dd4e2763842cd98627927/raw/e8678589cbb9f73ada52e7944bf617e588e1a5fe/GS01ax.bib"

df <- bib2df(url)
df
str(df)
df$DOI
df$ABSTRACT <- NA
df$ABSTRACT

for (i in 1:nrow(df)) {
  n <- cr_abstract(doi = df[i,28])
  df[i,31] <- n
}

df$ABSTRACT

FYI, df$DOI corresponds with the 28th column, and df$ABSTRACT corresponds with the 31st column.

EDIT pertaining to my comment below:

for (i in 1:nrow(df)) {
  try(n <- cr_abstract(doi = df[i,28]))
  try(df[i,31] <- n)
}

EDIT including tracebacks (never done these before so pardon if I'm doing this wrong)

for error 1:

 Error: no abstract found for 10.11141/IA.44.15 
3.
stop("no abstract found for ", doi, call. = FALSE) 
2.
cr_abstract(doi = df[i, 28]) 
1.
.traceback(for (i in 1:nrow(df)) {
    n <- cr_abstract(doi = df[i, 28])
    df[i, 31] <- n
}) 

and for error 2:

Error: Not Found (HTTP 404) 
3.
stop(sprintf("%s (HTTP %s)", x$message, x$status_code), call. = FALSE) 
2.
res$raise_for_status() 
1.
cr_abstract(doi = df[i, 28]) 
mtl_zack
  • This is called **exception handling** and R has the wrapper functions [`try`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/try.html) and [`tryCatch`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/conditions.html) – smci Sep 13 '18 at 22:16
  • Yes, but I've tried `try` and it does not work, as per my comments under the response provided by djchapman below. – mtl_zack Sep 13 '18 at 22:22
  • Can you post us both actual errors (with tracebacks) that occur when 1) no abstract available 2) no DOI value in the input column? *" loop seems to get hung up"* is unclear. – smci Sep 13 '18 at 22:29
  • Related, possibly near-duplicate: [Use tryCatch skip to next value of loop upon error?](https://stackoverflow.com/questions/8093914/skip-to-next-value-of-loop-upon-error-in-r-trycatch) – smci Sep 13 '18 at 22:48
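
As a minimal sketch of the `tryCatch` approach mentioned in the comments above: the handler's return value becomes the value of the whole `tryCatch()` call, so a failed lookup can be turned into `NA` and the loop simply continues. Note that `fetch_abstract` is a hypothetical stand-in for `cr_abstract` (so the example runs without network access), and the toy data frame and the DOI `10.1000/example` are made up for illustration.

```r
# Hypothetical stand-in for rcrossref::cr_abstract() so the sketch runs offline.
# It mimics the two failures from the question: a missing abstract and a missing DOI.
fetch_abstract <- function(doi) {
  if (is.na(doi)) stop("Not Found (HTTP 404)", call. = FALSE)
  if (doi == "10.11141/IA.44.15") stop("no abstract found for ", doi, call. = FALSE)
  paste("Abstract text for", doi)
}

# Toy data; the second DOI is invented for this example.
df <- data.frame(
  DOI = c("10.11141/IA.44.15", "10.1000/example", NA),
  stringsAsFactors = FALSE
)
df$ABSTRACT <- NA_character_

for (i in seq_len(nrow(df))) {
  # On error, the handler returns NA, which becomes the value of tryCatch().
  # Each row therefore gets a fresh result; nothing carries over from earlier rows.
  df$ABSTRACT[i] <- tryCatch(
    fetch_abstract(df$DOI[i]),
    error = function(e) {
      message("Row ", i, " skipped: ", conditionMessage(e))
      NA_character_
    }
  )
}

df$ABSTRACT
```

With the real package, the call inside `tryCatch()` would presumably be `cr_abstract(doi = df$DOI[i])`. Referring to columns by name rather than by position (28, 31) should also make the loop less fragile if the column order changes.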

2 Answers

1

You dove straight into asking about the for loop so we have been focused on that, but are you just trying to make a new column? For data frame manipulations in R, loops are rarely the most efficient option. Does this do what you want, make a new column called ABSTRACT but with the values of DOI?

df[, "ABSTRACT"] <- df[, "DOI"]
djchapman
  • Tracebacks have been posted, but I'm new to debugging so sorry for my ignorance. As per this response, I am not seeking to copy over the DOI field to the ABSTRACT field, I'm trying to automate the process of querying the crossref database to acquire the abstracts based upon the DOI key. – mtl_zack Sep 13 '18 at 23:12
  • Ah ok, sorry I misunderstood, hope this answer wasn't too demeaning. I haven't thought of how to get past your actual bug, but another set of tools to have on your radar as you spin up on exception handling is safely() if you are using the map tools, described in section 21.6 here: http://r4ds.had.co.nz/iteration.html – djchapman Sep 14 '18 at 16:58
  • It's alright, I took the risk of asking a newb question on stackoverflow xD. That seems like a good alternative, I'll try it out. I also posted this issue at the RCrossRef github repo and am awaiting a response from their core devs https://github.com/ropensci/rcrossref/issues/174 – mtl_zack Sep 15 '18 at 20:49
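
A rough sketch of the `safely()` idea from the comment above, assuming the purrr package is installed. `safely()` wraps a function so that it returns a list with `result` and `error` elements instead of stopping. Here `fetch_abstract` is a hypothetical stand-in for `cr_abstract`, and the DOI `10.1000/example` is made up.

```r
library(purrr)

# Hypothetical stand-in for rcrossref::cr_abstract() so the sketch runs offline.
fetch_abstract <- function(doi) {
  if (is.na(doi)) stop("Not Found (HTTP 404)", call. = FALSE)
  if (doi == "10.11141/IA.44.15") stop("no abstract found for ", doi, call. = FALSE)
  paste("Abstract text for", doi)
}

# The wrapped function never throws; it returns list(result = ..., error = ...).
safe_abstract <- safely(fetch_abstract)

dois <- c("10.11141/IA.44.15", "10.1000/example", NA)
results <- map(dois, safe_abstract)

# Keep the result where the call succeeded, NA where an error was captured.
abstracts <- map_chr(results, function(r) {
  if (is.null(r$error)) r$result else NA_character_
})

abstracts
```

If the error objects themselves are not needed, `possibly(fetch_abstract, otherwise = NA_character_)` is a shorter alternative from the same family of wrappers.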
0

Did you look into try()?

for (i in 1:nrow(df)) {
  try(n <- cr_abstract(doi = df[i,28]))
  df[i,31] <- n
}
djchapman
  • Nope, the same error still appears `Error : no abstract found for 10.11141/IA.44.15`, and an additional error pops up immediately after it: `Error in n : object 'n' not found`. – mtl_zack Sep 13 '18 at 21:52
  • the following modification takes the abstract pertaining to the first DOI and applies it to every row (prettier code in edited original post): `for (i in 1:nrow(df)) { try(n <- cr_abstract(doi = df[i,28])) try(df[i,31] <- n) }` – mtl_zack Sep 13 '18 at 22:08
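
The behaviour described in the two comments above appears consistent with how `try()` works: when the wrapped call fails, the assignment inside `try()` never happens, so `n` is either undefined (on the first row) or still holds the abstract from an earlier row. One way around this, sketched below, is to capture what `try()` returns and check whether it is a `"try-error"` object. `fetch_abstract` is a hypothetical stand-in for `cr_abstract`, and the DOI `10.1000/example` is made up.

```r
# Hypothetical stand-in for rcrossref::cr_abstract() so the sketch runs offline.
fetch_abstract <- function(doi) {
  if (is.na(doi)) stop("Not Found (HTTP 404)", call. = FALSE)
  if (doi == "10.11141/IA.44.15") stop("no abstract found for ", doi, call. = FALSE)
  paste("Abstract text for", doi)
}

# A successful lookup comes first, so a stale value would be visible in later rows.
dois <- c("10.1000/example", "10.11141/IA.44.15", NA)
abstracts <- rep(NA_character_, length(dois))

for (i in seq_along(dois)) {
  # Capture the return value of try() rather than assigning inside it.
  res <- try(fetch_abstract(dois[i]), silent = TRUE)
  # A failed call yields an object of class "try-error"; leave that row as NA.
  if (inherits(res, "try-error")) next
  abstracts[i] <- res
}

abstracts
```

Because `res` is reassigned on every iteration and failed rows are skipped with `next`, the first row's abstract is not copied into the rows that failed.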