
I previously posted a similar question, and this one extends it. My data set has become more complex: it is now a list of data frames, each with the columns KEY, CAL, and OAS. KEY is a unique value that differs from one data frame to the next. CAL is the time scale, expressed as week.year, that I have to use to build a ts object. OAS is the number of units recorded for each week. The original data set has about 250 data frames; for convenience I am posting just 2.

   KEY      CAL      OAS     
  444-12235140 01.2019 144.5667  
  444-12235140 02.2019 139.6333  
  444-12235140 03.2019 212.6667  
  444-12235140 04.2019 415.7000  
  444-12235140 05.2019 433.5333  
  444-12235140 06.2019 439.8000 

dataframe 1

     KEY       CAL      OAS 
556-11337513 21.2019   0.00000 
556-11337513 22.2019   0.00000 
556-11337513 23.2019   0.00000 
556-11337513 24.2019  57.00000 
556-11337513 25.2019  17.20909 
556-11337513 26.2019 130.01818 

dataframe 2

The week part of CAL ranges from 1 to 52 (with some weeks missing), and the year ranges from 2019 to 2021. I have tried the approach suggested by a previous generous contributor:

 library(zoo)

 tt <- list()
 for (i in 1:num_of_unique_keys) {
   # CAL is "ww.yyyy": week is meant to be the part before the dot, year the part after
   week <- as.integer(ds2[[i]][[2]])
   year <- as.numeric(sub("...", "", ds2[[i]][[2]]))
   zz <- zoo(ds2[[i]][[3]], year + (week - 1) / 52)
   tt[[i]] <- ts(zz, frequency = 52)
 }

num_of_unique_keys is the number of data frames. However, I get this error:

Error in seq.default(head(tt, 1), tail(tt, 1), deltat) : 'from' must be a finite number

Any ideas as to how it can be resolved?
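
For reference, here is a minimal base R sketch of one alternative that avoids the fractional zoo index and passes an explicit `start = c(year, week)` to `ts()`. It assumes CAL is stored as character in "ww.yyyy" form and that every year has 52 weeks. The helper names `parse_cal` and `make_weekly_ts` are hypothetical, and the sample values are copied from dataframe 2 above. It may not match the real data, so treat it as a starting point only.

```r
# Sample data copied from "dataframe 2" in the post; CAL is kept as character.
df <- data.frame(
  KEY = "556-11337513",
  CAL = c("21.2019", "22.2019", "23.2019", "24.2019", "25.2019", "26.2019"),
  OAS = c(0, 0, 0, 57, 17.20909, 130.01818),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split "ww.yyyy" into integer week and year.
parse_cal <- function(cal) {
  parts <- strsplit(as.character(cal), ".", fixed = TRUE)
  data.frame(
    week = as.integer(vapply(parts, `[`, character(1), 1)),
    year = as.integer(vapply(parts, `[`, character(1), 2))
  )
}

# Hypothetical helper: build a weekly ts, padding missing weeks with NA.
# Assumption: 52 weeks per year, as in the original loop.
make_weekly_ts <- function(df) {
  wy <- parse_cal(df$CAL)
  pos <- wy$year * 52L + (wy$week - 1L)   # running week counter
  stopifnot(all(is.finite(pos)))          # fail early on unparsable CAL values
  values <- rep(NA_real_, max(pos) - min(pos) + 1L)
  values[pos - min(pos) + 1L] <- df$OAS   # place each observation in its week slot
  first <- which.min(pos)
  ts(values, start = c(wy$year[first], wy$week[first]), frequency = 52)
}

x <- make_weekly_ts(df)
start(x)      # year and week of the first observation
frequency(x)
```

Applied to the whole list, this would be something like `tt <- lapply(ds2, make_weekly_ts)`. Note that the NA padding for missing weeks would have to be filled in before a call to `stl()`, which rejects missing values by default.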

charu1313
  • First, your for loop is off. You just give `num_of_unique_keys`, but that has to be a sequence not just a number. Try `i in 1:num_of_unique_keys` – astrofunkswag Mar 05 '20 at 16:35
  • But when posting on this site, please post a portion of your data, perhaps using `dput`, that can just be copied and pasted. A picture of your data isn't much good because somebody would have to manually enter it all, and you're much less likely to get assistance – astrofunkswag Mar 05 '20 at 16:36
  • Thanks for the suggestions. I have made changes to data so that it can be easily copied – charu1313 Mar 05 '20 at 16:50
  • Since `tt` is a list, you need double brackets on the last line of your loop `tt[[i]]`. Your `sub` line I think also should change to `gsub(".*\\.","",ds2[[i]][[2]])`, see [this post](https://stackoverflow.com/questions/12297859/remove-all-text-before-colon). – astrofunkswag Mar 05 '20 at 17:07
  • @astrofunkswag I am still getting the same error. I've looked up in my dataset as well, there are no NaN values. – charu1313 Mar 06 '20 at 10:32
  • Since you don't provide it in your post, I made the assumption that your list `ds2` was defined like `ds2 <- list(df1, df2)`, and `num_of_unique_keys = length(ds2)`. When I run it with the double brackets for `tt` I get no errors. – astrofunkswag Mar 07 '20 at 16:04
  • Try to debug the loop to see exactly where the error is being generated – astrofunkswag Mar 07 '20 at 16:04
  • To create ds2 I used as.data.frame, not list. I also changed the last line of the code to `tt[[i]] <- ts(zz, frequency = 52)`, which makes it work at least for the data frames where week.year starts with 01.xxxx. I have narrowed the problem down: when CAL starts with the first week, e.g. 01.2020, it works fine, but when CAL starts with any other week, like 26.2020 in the second data frame, I get the error "Error in stl(tt[[i]], s.window = "periodic", robust = TRUE) : series is not periodic or has less than two periods" – charu1313 Mar 09 '20 at 08:48
  • how exactly do you define `ds2`? Please include the code. You say "list of dataframes" in your post title, so your last comment doesn't make sense to me. – astrofunkswag Mar 09 '20 at 21:09
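
Following the debugging suggestion in the comments, one way to narrow the problem down might be to check, per data frame, whether the week and year parts of CAL parse to finite numbers before any `ts` object is built. This is a minimal sketch only: the two-element `ds2` below is a hypothetical stand-in for the real list, the value "bad" is a deliberately malformed entry added for illustration, and `check_cal` is a hypothetical helper name.

```r
# Hypothetical stand-in for the list in the question: two small data frames.
# The second one contains a deliberately malformed CAL value ("bad").
ds2 <- list(
  data.frame(CAL = c("01.2019", "02.2019"), OAS = c(144.5667, 139.6333),
             stringsAsFactors = FALSE),
  data.frame(CAL = c("21.2019", "bad"), OAS = c(0, 57),
             stringsAsFactors = FALSE)
)

# Hypothetical helper: TRUE if every CAL value yields a finite week and year.
check_cal <- function(df) {
  cal <- as.character(df$CAL)  # also guards against CAL being a factor
  week <- suppressWarnings(as.integer(sub("\\..*$", "", cal)))  # text before the dot
  year <- suppressWarnings(as.integer(sub("^.*\\.", "", cal)))  # text after the dot
  all(is.finite(week)) && all(is.finite(year))
}

ok <- vapply(ds2, check_cal, logical(1))
which(!ok)  # indices of data frames whose CAL does not parse cleanly
```

Separately, `stl()` needs more than two full periods of data, so with `frequency = 52` a series must have more than 104 observations; it may be worth checking `length(tt[[i]])` for the series that fail.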
