
This question builds on questions I asked here and here, but it's finally coming together and I think I know what the problem is; I just need help kicking it over the goal line. TL;DR at the bottom.

The overall goal as simply put as possible:

  • I have a dataframe that comes from an API pull of a REDCap database. It has a few columns of information about various studies.
  • I'd like to go through that dataframe row by row and push each row into a different website, OnCore, through its API.
  • In the first question linked above (here again), I took a much simpler dataframe and used one column from it (the number) to do an API pull from OnCore: download the record, copy one of the downloaded variables over to a different field, and push the record back in. It did this over and over, once per row, and then returned a simple dataframe of the row number and the API status code returned.
  • Now I want to get a bit more complicated: instead of just pulling a number from one column, I want to swap over a bunch of variables from my original dataframe and upload them.
  • The idea is for sample studies entered into REDCap to be pushed into OnCore.

What I've tried:

I have this dataframe from the REDCap API pull:

testprotocols<-structure(list(protocol_no = c("LS-P-Joe's API", "JoeTest3"), 
    nct_number = c(654321, 543210), library = structure(c(2L,
    2L), levels = c("General Research", "Oncology"), class = "factor"),
    organizational_unit = structure(c(1L, 1L), levels = c("Lifespan Cancer Institute", 
    "General Research"), class = "factor"), title = c("Testing to see if basic stuff came through",
    "Testing Oncology Projects for API"), department = structure(c(2L,
    2L), levels = c("Diagnostic Imaging", "Lifespan Cancer Institute"
    ), class = "factor"), protocol_type = structure(2:1, levels = c("Basic Science",
    "Other"), class = "factor"), protocolid = 1:2), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))

I have used this code to try to push the data into OnCore:

##This chunk gets a random one we're going to change later
base <- "https://website.forteresearchapps.com"
endpoint <- "/website/rest/protocols/"
protocol <- "2501"

## 'results' will get changed later to plug back in


## store
protocolid <- protocolnb <- library_names <- get_codes <- put_codes <- list()

UpdateAccountNumbers <- function(protocol){
  call2<-paste(base,endpoint, protocol, sep="")   
httpResponse <- GET(call2, add_headers(authorization = token))
results = fromJSON(content(httpResponse, "text"))


results$protocolId<- "8887"  ## doesn't seem to matter
results$protocolNo<- testprotocols$protocol_no
results$library<- as.character(testprotocols$library)
results$title<- testprotocols$title
results$nctNo<-testprotocols$nct_number
results$objectives<-"To see if the API works, specifically if you can write over a previous number"
results$shortTitle<- "Short joseph Title"
results$nctNo<-testprotocols$nct_number
results$department <- as.character(testprotocols$department)
results$organizationalUnit<- as.character(testprotocols$organizational_unit)
results$protocolType<- as.character(testprotocols$protocol_type)

  call2 <- paste(base,endpoint, protocol, sep="") 

  httpResponse_put <- PUT(
    call2, 
    add_headers(authorization = token), 
    body=results, encode = "json", 
    verbose()
  )

  # save stats 
  protocolid <- append(protocolid, protocol)
  protocolnb <- append(protocolnb, testprotocols$PROTOCOL_NO[match(protocol, testprotocols$PROTOCOL_ID)])
  library_names <- append(library_names, testprotocols$LIBRARY[match(protocol, testprotocols$PROTOCOL_ID)])
  get_codes <- append(get_codes, status_code(httpResponse_get))
  put_codes <- append(put_codes, status_code(httpResponse_put))
}
## Oncology will have to change to whatever the df name is, above and below this
purrr::walk(testprotocols$protocol_no, UpdateAccountNumbers)

allresults <- tibble('protocolNo'=unlist(protocol_no),'protocolnb'=unlist(protocolnb),'library_names'=unlist(library_names), 'get_codes'=unlist(get_codes), 'put_codes'=unlist(put_codes) )

When I get to the line:

purrr::walk(testprotocols$protocol_no, UpdateAccountNumbers)

I get this error:

[Screenshot of the error message; image not preserved]

When I do traceback() I get this:

[Screenshot of the traceback() output; image not preserved]

When I stepped through the loop line by line, I realized that the problem is in this chunk of code:

call2<-paste(base,endpoint, protocol, sep="")   
httpResponse <- GET(call2, add_headers(authorization = token))
results = fromJSON(content(httpResponse, "text"))


results$protocolId<- "8887"  ## doesn't seem to matter
results$protocolNo<- testprotocols$protocol_no
results$library<- as.character(testprotocols$library)
results$title<- testprotocols$title
results$nctNo<-testprotocols$nct_number
results$objectives<-"To see if the API works, specifically if you can write over a previous number"
results$shortTitle<- "Short joseph Title"
results$nctNo<-testprotocols$nct_number
results$department <- as.character(testprotocols$department)
results$organizationalUnit<- as.character(testprotocols$organizational_unit)
results$protocolType<- as.character(testprotocols$protocol_type)

I had envisioned it downloading ONE sample study and replacing parts of it with values from ONE row of my starting dataframe, but it's actually trying to paste the entire column in there. I.e., results$nctNo is "654321 543210" instead of just "654321" from the first row.
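The same behaviour can be reproduced without touching the API. This is only a minimal sketch: the toy data frame `df` and the empty list `results` are stand-ins I made up for illustration, not the real REDCap pull or the real OnCore response.

```r
# Toy stand-in for the REDCap data frame (assumption: one column is enough here)
df <- data.frame(nct_number = c(654321, 543210))

# Stand-in for the list parsed from the OnCore GET response
results <- list()

# Assigning the column copies every row into the field
results$nctNo <- df$nct_number
whole_column <- results$nctNo
length(whole_column)   # 2

# Indexing a single row gives the one value that was intended
results$nctNo <- df$nct_number[1]
first_row <- results$nctNo
length(first_row)      # 1
```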

TL;DR version:

I need my purrr loop to take one row at a time instead of my entire column, and I think if I do that, it'll all magically work.

Joe Crozier
  • Within `UpdateAccountNumbers()`, you are referring to entire columns of the `testprotocols` frame when you do things like `results$nctNo<-testprotocols$nct_number`. Instead, perhaps at the top of the `UpdateAccountNumbers()` function, you can do something like `tp = testprotocols[testprotocols$protocol_no == protocol,]`, and then when you are trying to assign values to `results` you can refer to `tp` instead of `testprotocols`. – langtang Aug 25 '22 at 15:52
  • This definitely helped. I ran into a bunch of other problems, but they fall outside the scope of the original question. I don't know how to proceed with this question, I guess... do you want to put that comment as an answer so I can accept it? – Joe Crozier Aug 25 '22 at 17:23
  • Sure, feel free to post another question if you have other issues arising. In the meantime, I've added the above as an answer. – langtang Aug 25 '22 at 18:39

1 Answer

Within UpdateAccountNumbers(), you are referring to entire columns of the testprotocols frame when you do things like results$nctNo<-testprotocols$nct_number.

Instead, perhaps at the top of the UpdateAccountNumbers() function, you can do something like tp = testprotocols[testprotocols$protocol_no == protocol,], and then when you are assigning values to results you can refer to tp instead of testprotocols.
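A minimal sketch of that idea, with the GET and PUT calls left out so it runs offline. The helper name `build_body()` is hypothetical, the data frame is a cut-down copy of the one in the question, and the empty `results` list stands in for whatever `fromJSON(content(httpResponse, "text"))` returns.

```r
# Cut-down copy of the question's data frame (assumption: a subset of columns)
testprotocols <- data.frame(
  protocol_no = c("LS-P-Joe's API", "JoeTest3"),
  nct_number  = c(654321, 543210),
  library     = factor(c("Oncology", "Oncology")),
  title       = c("Testing to see if basic stuff came through",
                  "Testing Oncology Projects for API"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fills in the request body for ONE protocol.
# `results` stands in for the list parsed from the GET response.
build_body <- function(protocol, results = list()) {
  # Keep only the row whose protocol_no matches the value passed in
  tp <- testprotocols[testprotocols$protocol_no == protocol, ]

  # Every assignment now reads from the one-row frame `tp`
  results$protocolNo <- tp$protocol_no
  results$library    <- as.character(tp$library)
  results$title      <- tp$title
  results$nctNo      <- tp$nct_number
  results
}

# One body per protocol number, each built from a single row
bodies <- lapply(testprotocols$protocol_no, build_body)
bodies[[1]]$nctNo   # a single value, 654321
```

In the real function the same `tp` line would sit directly above the `results$... <-` assignments, between the GET and the PUT.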

Note that your purrr::walk() command is already passing just one value of protocol at a time to the UpdateAccountNumbers() function.
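A quick way to check this for yourself, as a sketch only: `record_one()` and `seen` are hypothetical names used for illustration, and nothing here talks to the API.

```r
library(purrr)

protocol_nos <- c("LS-P-Joe's API", "JoeTest3")

# Collects whatever each call receives
seen <- list()

record_one <- function(protocol) {
  # `protocol` is a single element here, not the whole vector
  seen <<- append(seen, list(protocol))
}

walk(protocol_nos, record_one)

length(seen)    # 2: the function was called once per element
lengths(seen)   # 1 1: each call received exactly one value
```

Note the `<<-`: a plain `<-` inside the function would only change a local copy of `seen`, and the list outside the function would stay empty.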

langtang