I'm using a REST API to retrieve data from an Azure Table using the code below:
library(httr)
library(RCurl)
library(bitops)
library(xml2)
library(jsonlite) # needed for fromJSON() below
# Store credentials in variables
Account <- "storageaccount"
Container <- "Usage"
Key <- "key"
# Compose URL
URL <- paste0(
  "https://",
  Account,
  ".table.core.windows.net",
  "/",
  Container
)
# Request time stamp
requestdate <- format(Sys.time(), "%a, %d %b %Y %H:%M:%S %Z", tz = "GMT")
# As per Microsoft's specs, an empty line is needed for content-length
content_length <- 0
# Compose signature string
signature_string <- paste0(
  "GET", "\n",       # HTTP Verb
  "\n",              # Content-MD5
  "text/xml", "\n",  # Content-Type
  requestdate, "\n", # Date
  "/", Account, "/", Container # Canonicalized resource
)
# Compose header string
header_string <- add_headers(
  Authorization = paste0(
    "SharedKey ",
    Account,
    ":",
    RCurl::base64(
      digest::hmac(
        key = RCurl::base64Decode(Key, mode = "raw"),
        object = enc2utf8(signature_string),
        algo = "sha256",
        raw = TRUE
      )
    )
  ),
  "x-ms-date" = requestdate,
  "x-ms-version" = "2020-12-06",
  "Content-type" = "text/xml"
)
# Create request
xml_body <- content(
  GET(
    URL,
    config = header_string,
    verbose()
  ),
  "text"
)
Get_data <- xml_body # Gets data as text from the API
From_JSON <- fromJSON(Get_data, flatten = TRUE) # Parses the JSON text
Table_name <- as.data.frame(From_JSON) # Saves data to a table
I can now view the table, but I noticed that I can only see the first 1,000 rows. What's the most efficient way to implement a loop that retrieves all the remaining rows and appends them to the table?
I need to be able to work on the entire dataset.
Also consider that this table grows by roughly 40,000 rows per day, so keeping the visuals current with the data is a concern.
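For context, here is the kind of continuation-token loop I've been sketching, in case it helps frame an answer. I'm assuming that when more entities remain, the Table service returns `x-ms-continuation-NextPartitionKey` / `x-ms-continuation-NextRowKey` response headers, and that echoing them back as `NextPartitionKey` / `NextRowKey` query parameters fetches the next page; I haven't verified this end to end:

```r
# Hedged sketch, reusing the URL and header_string from the code above.
# Assumes continuation markers arrive in x-ms-continuation-* headers and are
# passed back as NextPartitionKey / NextRowKey query parameters.
library(httr)
library(jsonlite)

all_pages <- list()
next_pk <- NULL
next_rk <- NULL
repeat {
  query <- list()
  if (!is.null(next_pk)) query$NextPartitionKey <- next_pk
  if (!is.null(next_rk)) query$NextRowKey <- next_rk

  resp <- GET(URL, config = header_string, query = query)
  page <- fromJSON(content(resp, "text"), flatten = TRUE)
  all_pages[[length(all_pages) + 1]] <- as.data.frame(page)

  # httr lower-cases header names; missing names yield NULL
  next_pk <- headers(resp)[["x-ms-continuation-nextpartitionkey"]]
  next_rk <- headers(resp)[["x-ms-continuation-nextrowkey"]]
  if (is.null(next_pk)) break # no continuation header -> last page
}
# May need column reconciliation if pages flatten to different columns
Table_name <- do.call(rbind, all_pages)
```

One thing I'm unsure about is whether the signed `x-ms-date` header stays valid across all iterations of the loop, or whether I'd need to recompute the signature per request.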
Thanks in advance for your suggestions!
~Alienvolm