I am using a function that uses an API to collect data. I would like to dynamically cycle through different inputs assigned in "Years" to the function. Specifically, I am trying to write a for-loop to cycle through each year and input it into the function.

    Years = c("2019", "2018", "2017", "2016", "2015", "2014", "2013", "2012", "2011", "2010")  

    for (Year in names(Years)){


 YearVar <- Year
 Month <- "08"
 Day <- "01"

 Date <- paste(YearVar, Month, Day, sep = "/")

getHourlyLMP <- function(day = Date, locID = 4004, user = getOption(x       = "ISO_NE_USER"), password = getOption(x = "PASSWORD"), 
                                 out.tz = "America/New_York", ...){



 dd_Year <- format(as.Date(day), "%Y%m%d")
 json_Year <- get_path(path = paste0("/hourlylmp/da/final/day/",    dd_Year, "/location/", locID), user = user, password = password, ...)

 dat_Year <- do.call(what = "rbind", 
             lapply(json_Year$HourlyLmps$HourlyLmp, 
                    FUN = function(x){
                      dd_Year <- as.data.frame(x = x, stringsAsFactors = FALSE)
                      locId <- dd_Year[1,"Location"]
                      dd_Year <- dd_Year[2,]
                      dd_Year$locId <- locId
                      dd_Year
                    } ))

 dat_Year$BeginDate <- lubridate::ymd_hms(dat_Year$BeginDate, tz = out.tz)

 rownames(dat_Year) <- 1:nrow(dat_Year)

 return(dat_Year)
 }
  }

When I run this I receive the following error: "Error: $ operator is invalid for atomic vectors"

Any idea what is causing the error? Thanks!

L-dan
    Where does the `get_path` function come from? Are you sure it returns an object with a `$HourlyLmps` name? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Are you intending to redefine the function each iteration of the loop? Where do you actually call `getHourlyLMP`? Also note `names(Years)` returns NULL because `Years` is not a named vector, so that doesn't seem right. – MrFlick Sep 26 '19 at 18:17
  • 'get_path' is a function that uses the 'httr' package to pull a json file. The json file has '$HourlyLmps' The function works correctly before adding a for-loop. – L-dan Sep 26 '19 at 18:26
  • Years won't have names... Only length – Carl Boneri Sep 26 '19 at 18:37
  • Thanks Carl! How would I input the name? – L-dan Sep 26 '19 at 18:38

1 Answer

Without much information to go on, this is the approach I would take:

First, define the getHourlyLMP function outside of the loop. You can then call it on each iteration and pass the date in as a parameter.

# The function we want to run for each year.

getHourlyLMP <- function(day = Date, locID = 4004, user = getOption(x = "ISO_NE_USER"), password = getOption(x = "PASSWORD"), out.tz = "America/New_York", ...){

  dd_Year <- format(as.Date(day), "%Y%m%d")
  json_Year <- get_path(path = paste0("/hourlylmp/da/final/day/", dd_Year, "/location/", locID), user = user, password = password, ...)

  dat_Year <- do.call(what = "rbind", 
                      lapply(json_Year$HourlyLmps$HourlyLmp, 
                             FUN = function(x){
                               dd_Year <- as.data.frame(x = x, stringsAsFactors = FALSE)
                               locId <- dd_Year[1,"Location"]
                               dd_Year <- dd_Year[2,]
                               dd_Year$locId <- locId
                               dd_Year
                             } ))

  dat_Year$BeginDate <- lubridate::ymd_hms(dat_Year$BeginDate, tz = out.tz)

  rownames(dat_Year) <- 1:nrow(dat_Year)

  return(dat_Year)
}

# Build the Date object in an easier way using ?sprintf
Years = c("2019", "2018", "2017", "2016", "2015", "2014", "2013", "2012", "2011", "2010")  

Date <- sprintf("%s/08/01", Years)

# Date
# [1] "2019/08/01" "2018/08/01" "2017/08/01" "2016/08/01" "2015/08/01" "2014/08/01" "2013/08/01" "2012/08/01" "2011/08/01" "2010/08/01"

# Now just loop through the Date objects
lapply(Date, function(i){
  getHourlyLMP(day = i)
})
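As the comments on the question point out, `names(Years)` is NULL for an unnamed vector, so the original `for` loop never executed its body. The sketch below illustrates that and shows the loop visiting each year once it iterates over the values instead. `fake_fetch` is a hypothetical stand-in for `getHourlyLMP` so the example runs without API credentials, and the shortened `Years` vector is only for illustration.

```r
# Hypothetical stand-in for getHourlyLMP(): no API call, it just echoes its input.
fake_fetch <- function(day) {
  data.frame(day = day, stringsAsFactors = FALSE)
}

Years <- c("2019", "2018", "2017")

# An unnamed vector has no names, so looping over names(Years) runs zero times.
n_iter_names <- 0
for (Year in names(Years)) {
  n_iter_names <- n_iter_names + 1
}

# Looping over the values themselves visits every year.
results <- list()
for (Year in Years) {
  Date <- paste(Year, "08", "01", sep = "/")
  results[[Year]] <- fake_fetch(day = Date)
}
```

Storing each result under its year keeps the output labelled, which is the same effect the `setNames(lapply(...))` call further down achieves.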

After some research and registering for the API, this is the endpoint being used:

"Get the current Final Da Hourly LMPs for a given location."

# Parameters
# name        description                              type   default
# day         The day to retrieve data for (YYYYMMDD)  path   FORMAT IS VITAL
# locationId  The location id                          path

get_paths2 <- function(year = NULL, user = getOption("ISO_NE_USER"), pass = getOption("ISO_NE_PASS"), loc_id = 4004, out.tz = "America/New_York"){
  # jsonlite supplies rbind_pages(); dplyr supplies mutate() and %>%
  library(jsonlite)
  library(dplyr)
  # Pay attention to the path query, which needs to be YYYYMMDD (e.g. 20190927). Since you gave a static date, I hard-coded
  # the 0801, but change that for your needs moving forward
  base <- 'https://webservices.iso-ne.com/api/v1.1/hourlylmp/da/final/day/%s0801/location/%s'

  # Build our call URL
  api_url <- sprintf(base, year, loc_id)

  # Call the API
  req <- httr::GET(api_url, httr::authenticate(user = user, password = pass, type = "basic"))
  # Confirm the request worked by checking for a 200 status code
  if(httr::status_code(req) == 200L){
    data <- httr::content(req)

    # Your data manipulation functions here:
    # I tested on just 2 items in the list, and it works fine...
    # > rbind_pages(lapply(a$HourlyLmps$HourlyLmp[1:2], function(x){
    #     as.data.frame(x)
    # }))
    #
    # Wrap the parsing in ?tryCatch so a year that fails to parse returns NA instead of an error
    tryCatch({
      lapply(data$HourlyLmps$HourlyLmp, function(i){
        dd_Year <- as.data.frame(i, stringsAsFactors = FALSE)
        dd_Year %>% mutate(
          BeginDate = lubridate::ymd_hms(BeginDate, tz = out.tz, quiet = TRUE)
        )
      }) %>% rbind_pages()
    }, error = function(e){
      NA
    })
  } else {
    # Request failed: return NA so the year is flagged the same way as a parsing failure
    NA
  }
}

Note: it turns out the API doesn't return data for 2010.

out_test <- setNames(lapply(Years, function(i){
    get_paths2(i)
}), Years)




> which(is.na(out_test))
2010 
  10 

> Map(slice, out_test[which(!is.na(out_test))], 1)
$`2019`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2019-08-01            4004         LOAD ZONE .Z.CONNECTICUT     23.8           24.46                   0         -0.66

$`2018`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2018-08-01            4004         LOAD ZONE .Z.CONNECTICUT    24.54           24.59                   0         -0.05

$`2017`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2017-08-01            4004         LOAD ZONE .Z.CONNECTICUT    19.63           19.38                   0          0.25

$`2016`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2016-08-01            4004         LOAD ZONE .Z.CONNECTICUT    28.26           28.08                   0          0.18

$`2015`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2015-08-01            4004         LOAD ZONE .Z.CONNECTICUT    19.42           19.21                   0          0.21

$`2014`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2014-08-01            4004         LOAD ZONE .Z.CONNECTICUT    22.27           21.98                   0          0.29

$`2013`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2013-08-01            4004         LOAD ZONE .Z.CONNECTICUT    27.11           26.73                   0          0.38

$`2012`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2012-08-01            4004         LOAD ZONE .Z.CONNECTICUT    26.59           26.45                   0          0.14

$`2011`
   BeginDate Location..LocId Location..LocType     Location.. LmpTotal EnergyComponent CongestionComponent LossComponent
1 2011-08-01            4004         LOAD ZONE .Z.CONNECTICUT    40.49           39.93                   0          0.56

You can then combine everything with rbind_pages(out_test[which(!is.na(out_test))])
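To make the filtering step concrete, here is a minimal sketch on a toy version of `out_test` with one failed year. It assumes each successful element is a data frame and each failed one is a single `NA`, as in the output above. It uses base R's `do.call(rbind, ...)` in place of `rbind_pages` purely so it runs without extra packages, and the added `Year` column is an illustrative extra, not part of the original result.

```r
# Toy stand-in for out_test: two years with data, one failed year (NA).
out_test <- list(
  `2019` = data.frame(BeginDate = "2019-08-01", LmpTotal = 23.8),
  `2018` = data.frame(BeginDate = "2018-08-01", LmpTotal = 24.54),
  `2010` = NA
)

# is.na() on a list flags only the elements that are a single NA value.
ok <- !is.na(out_test)

# Stack the remaining data frames; unname() keeps the row names as plain 1, 2, ...
combined <- do.call(rbind, unname(out_test[ok]))

# Record which year each row came from.
combined$Year <- rep(names(out_test)[ok],
                     times = vapply(out_test[ok], nrow, integer(1)))
```

With real API results, `rbind_pages` may be the better choice when the yearly data frames do not all share the same columns, since base `rbind` requires matching column names.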

Carl Boneri
  • Great suggestion! Unfortunately, I still receive the same error. Within the lapply statement do I need to iteratively name the "json_Year" dataset? – L-dan Sep 26 '19 at 19:08
  • Can you post that function? – Carl Boneri Sep 26 '19 at 19:17
  • Possible bugs: does getOption return what you expect? Do file paths actually pull properly? – Carl Boneri Sep 26 '19 at 19:18
  • get_path <- function(path, user = getOption(x = "ISO_NE_USER"), password = getOption(x = "PASSWORD"), ...){ q <- httr::GET(url = paste0(ISO_NE_PATH(), path), httr::authenticate(user = user, password = password, type = "basic"), httr::accept_json(), ...) json <- RJSONIO::fromJSON(rawToChar(q$content)) return(json) } – L-dan Sep 26 '19 at 19:21
  • I believe that the lapply function passes an atomic vector, which cannot be accessed with '$'. Any ideas how to bypass that issue? – L-dan Sep 26 '19 at 19:39
  • locId <- dd_Year[1,"Location"] dd_Year <- dd_Year[2,] *** there isn't a second row?*** – Carl Boneri Sep 26 '19 at 19:49
  • also colnames are a mess... hold on, im building your fix.. (i found the api you are using...) – Carl Boneri Sep 26 '19 at 19:51
  • Honestly... It might have been nothing more than the API doesn't return data for 2010... Also make sure dplyr is loaded.... – Carl Boneri Sep 26 '19 at 20:06