My raw dataset contains multiple product IDs, with monthly sales and the corresponding dates arranged in a matrix format. I want to create an individual data frame for each product_id, holding its sales values and dates. For this, I am using a for loop.

base is the base dataset. x is a data frame that contains the unique product IDs (in x$vars) and the corresponding number of observations.

for (i in 1:nrow(x)) {
  n <- paste("df", x$vars[i], sep = "")
  assign(n, base[base[, 1] == x$vars[i], ])
  print(n)
}

This is a part of the output:

[1] "df25"
[1] "df28"
[1] "df35"
[1] "df37"
[1] "df39"

So each data frame name is stored in n in turn. This, I think, is a character string (n is overwritten on every iteration, so after the loop it only holds the last name).

When I write df25 outside the loop, I get the dataframe I want:

> df25
# A tibble: 49 x 3
       ID date        Sales
    <dbl> <date>     <dbl>
 1     25 2014-01-01     0
 2     25 2014-02-01     0
 3     25 2014-03-01     0
 4     25 2014-04-01     0
 5     25 2014-05-01     0
 6     25 2014-06-01     0
 7     25 2014-07-01     0
 8     25 2014-08-01     0
 9     25 2014-09-01     0
10     25 2014-10-01     0
# ... with 39 more rows

Now I want to use each of these data frames separately to perform a forecast analysis. To do this, I need to access the values in the individual data frames. This is what I have tried:

for(i in 1:4) {print(paste0("df", x$vars[i]))}
[1] "df2"
[1] "df3"
[1] "df5"
[1] "df14"

But this only prints the names; I am unable to refer to the individual data frames themselves. How can I access the data frames and their values for further analysis? Since there are more than 200 products, I am looking for a function that can deal with all the data frames at once.

First, I want to convert each data frame to a time series (ts) object, using the year and month values from the date variable, and then use ets, forecast, etc.
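For illustration, here is a minimal sketch of that conversion for a single product. The helper name `to_monthly_ts` and the toy data are hypothetical, and the sketch assumes columns named `Date` and `Sales` with no missing months in between. `ets()` and `forecast()` come from the forecast package, so that step is guarded in case the package is not installed.

```r
# Hypothetical helper: turn one product's rows into a monthly ts object.
# Assumes a Date column of class Date, a numeric Sales column,
# and no gaps between the first and last month.
to_monthly_ts <- function(d) {
  d <- d[order(d$Date), ]
  first <- as.POSIXlt(d$Date[1])
  # POSIXlt stores years since 1900 and months as 0-11
  ts(d$Sales, start = c(first$year + 1900, first$mon + 1), frequency = 12)
}

# Made-up data for one product, only to make the sketch runnable
one_product <- data.frame(
  Date  = seq(as.Date("2014-01-01"), by = "month", length.out = 24),
  Sales = c(10, 12, 9, 14, 15, 13, 17, 18, 16, 20, 21, 19,
            11, 13, 10, 15, 16, 14, 18, 19, 17, 21, 22, 20)
)
sales_ts <- to_monthly_ts(one_product)

# Fit and forecast only if the forecast package is available
if (requireNamespace("forecast", quietly = TRUE)) {
  fit <- forecast::ets(sales_ts)
  fc  <- forecast::forecast(fit, h = 6)  # 6 months ahead
}
```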

SAMPLE DATASET:

set.seed(354)
df <- data.frame(Product_Id = rep(1:10, each = 50),
                 Date = seq(from = as.Date("2010/1/1"), to = as.Date("2014/2/1"), by = "month"),
                 Sales = rnorm(100, mean = 50, sd = 20))
df <- df[-c(251:256, 301:312), ]

As always, any suggestion would be highly appreciated.

user10579790
  • Side note: it is usually better practice not to assign new variables in a for loop, but to use lists instead. – Wimpel Mar 07 '19 at 14:10
  • I agree with @Wimpel. Consider using lists and the purrr package. That should make this a lot easier from the start. – Benjamin Schwetz Mar 07 '19 at 14:31
  • Hey, can you please help me with an example? Are you suggesting that I abandon the for loop entirely (I am quite new to R)? – user10579790 Mar 07 '19 at 14:38
  • @user10579790 please provide some sample-data, using `dput()`. Further: read this: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Wimpel Mar 07 '19 at 15:00
  • Hey, I am adding a sample dataset as an edit to the question. – user10579790 Mar 08 '19 at 10:41
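
The list-based approach suggested in the comments could look roughly like the sketch below. It uses the sample dataset from the question and base R only (`split()` and `lapply()`) rather than purrr. The names `by_product`, `n_obs` and `ts_list` are made up for illustration, and the ts step assumes each product has no missing months between its first and last date, which holds for the sample data.

```r
set.seed(354)
df <- data.frame(
  Product_Id = rep(1:10, each = 50),
  Date = seq(from = as.Date("2010/1/1"), to = as.Date("2014/2/1"), by = "month"),
  Sales = rnorm(100, mean = 50, sd = 20)
)
df <- df[-c(251:256, 301:312), ]

# One data frame per product, held in a single named list
# instead of one variable per product
by_product <- split(df, df$Product_Id)

# The list names are the product ids as strings,
# so by_product[["6"]] is the data frame for product 6
n_obs <- sapply(by_product, nrow)

# Apply the same step to every product: here, build a monthly ts
ts_list <- lapply(by_product, function(d) {
  d <- d[order(d$Date), ]
  first <- as.POSIXlt(d$Date[1])
  ts(d$Sales, start = c(first$year + 1900, first$mon + 1), frequency = 12)
})
```

Any later step, such as fitting a model, can then be another `lapply()` over `ts_list`, which scales to 200 products without creating 200 variables.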

1 Answer

I think this is one way to access the individual data frames by name, for a given index i. If there is a better method, please let me know:

    (Var <- get(paste0("df",x$vars[i])))
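
If the data frames have already been created with `assign()`, base R's `mget()` can look up all the names in one call and return them as a named list. The sketch below uses small made-up stand-ins for `df25`, `df28` and `x`; only the `paste0()` naming scheme is taken from the question.

```r
# Stand-ins for the data frames that assign() created in the question
df25 <- data.frame(ID = 25, Sales = c(0, 0, 3))
df28 <- data.frame(ID = 28, Sales = c(1, 2, 4))
x <- data.frame(vars = c(25, 28))

# Rebuild the variable names, then fetch them all at once
df_names <- paste0("df", x$vars)
df_list <- mget(df_names)  # named list: df_list$df25, df_list$df28

# Work on every data frame with one call instead of get() in a loop
totals <- sapply(df_list, function(d) sum(d$Sales))
```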
user10579790