I have written a function that retrieves data from an API. The response format is JSON: https://jsoneditoronline.org/?id=ac0ec7ececae49ca92599ff912458a84
Each query should use a different value for the `path` parameter. These values are stored in the column `product_folder` of the data frame `product_folders_summarised`.
library(tidyverse)
library(httr)
library(jsonlite)
library(data.table)
func_visibility <- function(product_folder) {
  api_url <- "https://api.where-the-data-comes-from.com/example"
  api_key <- "_API_KEY_"
  format <- "json"
  # Build the request URL and parse the JSON response
  request <-
    fromJSON(
      paste0(api_url, "?api_key=", api_key, "&format=", format, "&path=", product_folder),
      simplifyVector = TRUE,
      simplifyDataFrame = TRUE,
      flatten = TRUE
    )
  # Replace NULL entries with NA and flatten each top-level element
  request <- lapply(request, function(x) {
    x[sapply(x, is.null)] <- NA
    unlist(x)
  })
  # Turn the flattened answer into a one-row data frame and drop path and date
  request <- as.data.frame(t(request$answer))
  request <- select(request, -sichtbarkeitsindex.path, -sichtbarkeitsindex.date)
  return(request)
}
product_folders_summarised <- product_folders_summarised %>%
  dplyr::mutate(visibility_value = func_visibility(product_folder))
The data frame is structured as follows:
| product_folder | value_1 | value_2 |
|---|---|---|
| https://www.example.de/folder/ | this | that |
| https://www.example.de/anotherfolder/ | ... | ... |
I expect each value in the column `product_folder` of `product_folders_summarised` to be passed to the function, and the result to be added as a new column `visibility_value`.
Instead I get this error message:
Error: lexical error: invalid char in json text.
https://api.https://api.where-the-data-comes-from.com/example.
(right here) ------^
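One plausible cause, offered as a sketch rather than a confirmed diagnosis: `mutate()` passes the whole `product_folder` column to the function in one call, and `paste0()` is vectorised, so `fromJSON()` receives several URLs at once instead of a single string. The example below uses a made-up two-row data frame and only base R; no request is sent.

```r
# Hypothetical stand-in for the real data frame from the question.
product_folders_summarised <- data.frame(
  product_folder = c("https://www.example.de/folder/",
                     "https://www.example.de/anotherfolder/"),
  stringsAsFactors = FALSE
)

# paste0() is vectorised: it returns one URL per element of the column.
urls <- paste0("https://api.where-the-data-comes-from.com/example",
               "?api_key=", "_API_KEY_",
               "&format=", "json",
               "&path=", product_folders_summarised$product_folder)

length(urls)  # 2: one URL per row, not a single string
```

As far as I can tell, `fromJSON()` only treats its input as a URL when it is a single string; a longer character vector is parsed as JSON text, which would be consistent with the lexical error above.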
I have now adjusted my function as suggested by r2evans.
func_visibility <- function(path) {
  api_url <- "https://api.where-the-data-comes-from.com/example"
  api_key <- "_API_KEY_"
  format <- "json"
  # One request URL per element of path
  request <- paste0(api_url, "?api_key=", api_key, "&format=", format, "&path=", path)
  # Fetch and parse each URL separately
  request <- lapply(request, jsonlite::fromJSON)
  request <- lapply(request, function(x) {
    x[sapply(x, is.null)] <- NA
    as.data.frame(t(x))  # result is not assigned, so it has no effect
    unlist(x)            # this named character vector is what gets returned
  })
  return(request)
}
product_folders_summarised_short <- product_folders_summarised_short %>%
  dplyr::mutate(sichtbarkeitsindex_value = func_visibility(product_folder))
The data is now retrieved from the API and written into the new last column of the data frame. Each entry in that column looks like this:
c(method = "domain.sichtbarkeitsindex", answer.sichtbarkeitsindex.path = "https://www.example.de/folder/", answer.sichtbarkeitsindex.date = "2019-09-02T00:00:00+02:00", answer.sichtbarkeitsindex.value = "0", credits.used = "1")
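If only the visibility value is needed, a single element can be pulled out of each of these named vectors. This is a minimal sketch: the first entry of `api_results` copies the vector shown above, while the second entry and the name `api_results` itself are made up for illustration.

```r
# Hypothetical list column: the first entry is the vector from the question,
# the second is an invented example.
api_results <- list(
  c(method = "domain.sichtbarkeitsindex",
    answer.sichtbarkeitsindex.path = "https://www.example.de/folder/",
    answer.sichtbarkeitsindex.date = "2019-09-02T00:00:00+02:00",
    answer.sichtbarkeitsindex.value = "0",
    credits.used = "1"),
  c(method = "domain.sichtbarkeitsindex",
    answer.sichtbarkeitsindex.path = "https://www.example.de/anotherfolder/",
    answer.sichtbarkeitsindex.date = "2019-09-02T00:00:00+02:00",
    answer.sichtbarkeitsindex.value = "1.5",
    credits.used = "1")
)

# Pull one named element out of each vector and convert it to numeric.
# vapply() guarantees exactly one number per entry.
visibility_value <- vapply(
  api_results,
  function(v) as.numeric(v[["answer.sichtbarkeitsindex.value"]]),
  numeric(1)
)
```

The result is a plain numeric vector with one value per row, which fits into an ordinary column rather than a list column.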
In my first attempt (see the first code block), I converted the data into a data frame with
`request <- as.data.frame(t(request$answer))` and
`request <- select(request, -sichtbarkeitsindex.path, -sichtbarkeitsindex.date)`.
Applied to a single URL, this worked. I have now added
`as.data.frame(t(x))`
to the new function, but the data from the API is still stored as a character vector.
Would it be easier to keep the character vectors in the last column of the data frame and then, in a second step, use another function to convert them into a new data frame?
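For reference, such a second step could look roughly like the sketch below, which uses only base R. The list `api_results` and its second entry are hypothetical; the first entry copies the vector shown earlier. It assumes every vector carries the same names.

```r
# Hypothetical list of named character vectors, as returned per URL.
api_results <- list(
  c(method = "domain.sichtbarkeitsindex",
    answer.sichtbarkeitsindex.path = "https://www.example.de/folder/",
    answer.sichtbarkeitsindex.date = "2019-09-02T00:00:00+02:00",
    answer.sichtbarkeitsindex.value = "0",
    credits.used = "1"),
  c(method = "domain.sichtbarkeitsindex",
    answer.sichtbarkeitsindex.path = "https://www.example.de/anotherfolder/",
    answer.sichtbarkeitsindex.date = "2019-09-02T00:00:00+02:00",
    answer.sichtbarkeitsindex.value = "1.5",
    credits.used = "1")
)

# Turn each named vector into a one-row data frame, then stack the rows.
rows <- lapply(api_results, function(v) {
  as.data.frame(t(v), stringsAsFactors = FALSE)
})
visibility_df <- do.call(rbind, rows)

# All columns arrive as character, so convert the value column explicitly.
visibility_df$answer.sichtbarkeitsindex.value <-
  as.numeric(visibility_df$answer.sichtbarkeitsindex.value)
```

Here `visibility_df` has one row per URL and one column per name in the vectors, so it could be joined back to the original data frame on the path column.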