My raw dataset contains multiple product IDs, monthly sales, and the corresponding dates, arranged in a matrix format. I want to create an individual data frame for each product_id, containing its sales values and dates. For this, I am using a for loop.
base is the base dataset. x is a data frame that holds the unique product_id values (in x$vars) and the corresponding number of observations.
for (i in 1:nrow(x)) {
  n <- paste("df", x$vars[i], sep = "")
  assign(n, base[base[, 1] == x$vars[i], ])
  print(n)
}
This is a part of the output:
[1] "df25"
[1] "df28"
[1] "df35"
[1] "df37"
[1] "df39"
So each data frame name is stored in n during its iteration. I think n is a character string.
When I write df25 outside the loop, I get the dataframe I want:
> df25
# A tibble: 49 x 3
      ID date       Sales
   <dbl> <date>     <dbl>
 1    25 2014-01-01     0
 2    25 2014-02-01     0
 3    25 2014-03-01     0
 4    25 2014-04-01     0
 5    25 2014-05-01     0
 6    25 2014-06-01     0
 7    25 2014-07-01     0
 8    25 2014-08-01     0
 9    25 2014-09-01     0
10    25 2014-10-01     0
# ... with 39 more rows
Now I want to use each of these data frames separately to perform a forecast analysis. To do this, I need to get at the values in the individual data frames. This is what I have tried:
for(i in 1:4) {print(paste0("df", x$vars[i]))}
[1] "df2"
[1] "df3"
[1] "df5"
[1] "df14"
But this only prints the names, and I am unable to refer to the individual data frames. How can I access the data frames and their values for further analysis? Since there are more than 200 products, I am looking for a function that can handle all the data frames at once.
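To make the access problem concrete, here is a minimal sketch of two ways the data frames might be reached by name. The small `base` and `x` objects below are hypothetical stand-ins for the real data, and `df_list`, `df_list2`, and `total_sales` are illustrative names only. It assumes the first column of `base` is the product ID, as in the loop above.

```r
# Hypothetical stand-ins for `base` and `x` (not the real data).
base <- data.frame(
  ID    = rep(c(25, 28), each = 3),
  date  = rep(seq(as.Date("2014-01-01"), by = "month", length.out = 3), 2),
  Sales = c(0, 1, 2, 3, 4, 5)
)
x <- data.frame(vars = unique(base$ID))

# Option 1: keep the assign() loop, then fetch the objects by name.
for (i in seq_len(nrow(x))) {
  assign(paste0("df", x$vars[i]), base[base[, 1] == x$vars[i], ])
}
df_names <- paste0("df", x$vars)
df_list  <- mget(df_names)  # named list: df_list$df25, df_list$df28

# Option 2: skip assign() and split the base data directly.
df_list2 <- split(base, base$ID)
names(df_list2) <- paste0("df", names(df_list2))

# Either list can then be processed in one pass, for example:
total_sales <- sapply(df_list, function(d) sum(d$Sales))
print(total_sales)
```

A single data frame can also be fetched from its name with `get("df25")`, but keeping everything in one named list tends to scale better to 200+ products than 200+ separate objects.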
First, I wish to convert each data frame to a TS, using the year and month values from the date variable, and then use ets, forecast, etc.
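The conversion step described here might look roughly like the sketch below. The `df25` data frame is a hypothetical example with made-up sales, and `to_monthly_ts` is an illustrative helper name, not an existing function. It assumes the rows cover consecutive months with none missing, since `ts()` does not check dates. The forecasting part runs only if the forecast package happens to be installed.

```r
# Hypothetical example standing in for one product's data frame.
set.seed(1)
df25 <- data.frame(
  ID    = 25,
  date  = seq(as.Date("2014-01-01"), by = "month", length.out = 36),
  Sales = 50 + rnorm(36, sd = 5)
)

# Illustrative helper: build a monthly ts from the date and Sales columns.
# Assumes consecutive months with no gaps.
to_monthly_ts <- function(d) {
  d <- d[order(d$date), ]
  first <- d$date[1]
  ts(d$Sales,
     start = c(as.integer(format(first, "%Y")),
               as.integer(format(first, "%m"))),
     frequency = 12)
}

sales_ts <- to_monthly_ts(df25)

# Fit and forecast only when the forecast package is available.
if (requireNamespace("forecast", quietly = TRUE)) {
  fit <- forecast::ets(sales_ts)
  fc  <- forecast::forecast(fit, h = 12)  # 12 months ahead
  print(fc)
}
```

With the data frames held in a named list, the same helper could be applied to all products via `lapply(df_list, to_monthly_ts)`, where `df_list` is a hypothetical list of per-product data frames.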
SAMPLE DATASET:
set.seed(354)
df <- data.frame(Product_Id = rep(1:10, each = 50),
                 Date = seq(from = as.Date("2010/1/1"), to = as.Date("2014/2/1"), by = "month"),
                 Sales = rnorm(100, mean = 50, sd = 20))
df <- df[-c(251:256, 301:312), ]
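As a quick check of what this sample contains, the sketch below rebuilds it with the recycling written out explicitly via `rep()` (which should be equivalent to the implicit recycling above) and then counts rows and finds the first date per product. The names `rows_per_product` and `first_dates` are illustrative only.

```r
set.seed(354)
dates <- seq(from = as.Date("2010/1/1"), to = as.Date("2014/2/1"), by = "month")
df <- data.frame(Product_Id = rep(1:10, each = 50),
                 Date  = rep(dates, times = 10),  # 50 months, repeated per product
                 Sales = rep(rnorm(100, mean = 50, sd = 20), times = 5))
df <- df[-c(251:256, 301:312), ]

# Number of observations per product.
rows_per_product <- table(df$Product_Id)
print(rows_per_product)

# First available month per product, as character strings.
first_dates <- sapply(split(df$Date, df$Product_Id),
                      function(d) format(min(d)))
print(first_dates)
```

The dropped rows are the leading months of products 6 and 7, so those two series are shorter and start later than the rest. Any conversion to a time series therefore needs a per-product start date.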
As always, any suggestion would be highly appreciated.