I have a web-scraping function that gets data from an API. I pass a column of my data frame (df) to one of the function's arguments. The issue is that the URL accepts at most 500 numbers in one of its parameters, and my df has 2000 rows.
How can I split the rows into blocks of 500 so that I can pass each block of values to the function?
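To make the target concrete, here is a minimal sketch of the kind of split I mean. It assumes the IDs are a plain vector; `ids`, `chunk_size`, `grp` and `id_chunks` are illustrative names, and `1:2000` is a stand-in for the real column. It uses base R's `split()` with a group index built from `seq_along()`, which is only one possible approach.

```r
# Hypothetical stand-in for the real df column: 2000 IDs
ids <- 1:2000
chunk_size <- 500

# Group index: 1 for positions 1-500, 2 for positions 501-1000, and so on
grp <- ceiling(seq_along(ids) / chunk_size)

# split() returns a list with one vector of IDs per group
id_chunks <- split(ids, grp)

length(id_chunks)   # number of blocks
lengths(id_chunks)  # size of each block
```

With 2000 IDs and a block size of 500 this gives four blocks of 500 IDs each. If the number of rows is not a multiple of 500, the last block is simply shorter.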
I've created a very basic reprex that shows the workflow I'm looking for. I want to pass the split df column to the parse function. I'm guessing I would need to wrap the JSON parse function with map_dfr.
library(tidyverse)

sample_df <- tibble(id = 1:20,
                    col_2 = rnorm(20))

# parse function
parse_people <- function(ids = c("1", "10"), argument_2 = NULL) {
  # Fake base URL
  base_url <- "https://www.thisisafakeurl.com/api/people?Ids="
  # Collapse the IDs into one comma-separated string for the query parameter
  ids <- stringr::str_c(ids, collapse = ",")
  url <- glue::glue("{base_url}{ids}")
  # Send the GET request
  resp <- httr::GET(url)
  # Extract the response body as JSON text
  out <- httr::content(resp, as = "text", encoding = "UTF-8")
  # Parse the JSON text into a data frame
  jsonlite::fromJSON(out, simplifyDataFrame = TRUE, flatten = TRUE)
}

sample_parse <- parse_people(sample_df$id)
I think I probably need two functions: one that parses the data, and one that uses map_dfr to apply it over the splits.
Something like:
# Split IDs from the df here. I want blocks of 500 rows to pass below
# Map the split IDs over parse_people
ids %>%
  map_dfr(parse_people)
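
The two-function idea could be sketched roughly as follows. This is a sketch under stated assumptions, not a tested solution against the real API: `parse_in_chunks` is a hypothetical wrapper name, and `fake_parse_people` is a hypothetical stub that stands in for `parse_people` so the example runs without a network call. It also assumes the parser returns a data frame for each block, since `map_dfr` row-binds the results.

```r
library(purrr)
library(tibble)

# Hypothetical stub standing in for parse_people(), so the sketch runs
# without calling the (fake) API; it returns one row per ID it receives
fake_parse_people <- function(ids) {
  tibble(id = ids, n_in_request = length(ids))
}

# Hypothetical wrapper: split ids into blocks of at most chunk_size,
# call parser once per block, and row-bind the results
parse_in_chunks <- function(ids, parser, chunk_size = 500) {
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  map_dfr(chunks, parser)
}

result <- parse_in_chunks(1:2000, fake_parse_people)
nrow(result)
```

In the real workflow the call would presumably look like `parse_in_chunks(sample_df$id, parse_people)`. Note that `map_dfr` is marked as superseded in purrr 1.0.0 in favour of `map()` followed by `list_rbind()`, so that combination may be preferable in new code.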