I am new to programming and am attempting to create a prediction model for multiple articles. Unfortunately, using Excel or similar software is not possible for this task. Therefore, I have installed Rstudio to solve this problem. My goal is to make a 18-month prediction for each article in my dataset using an ARIMA model.
However, I am currently facing an issue with the format of my data frame. Specifically, I am unsure of how my CSV should be structured to be read by my code.
I have attached an image of my current dataset in CSV format : https://i.stack.imgur.com/AQJx1.png
Here is my dput(sales_data) :
structure(list(X.Article.1.Article.2.Article.3 = c("janv-19;42;49;55", "f\xe9vr-19;56;58;38", "mars-19;55;59;76")), class = "data.frame", row.names = c(NA, -3L))
And also provided the code I have constructed so far with the help of blogs and websites :
library(forecast)
library(reshape2)
sales_data <- read.csv("sales_data.csv", header = TRUE)
sales_data_long <- reshape2::melt(sales_data, id.vars = "Code Article")
for(i in 1:nrow(sales_data_long)) {
sales_data_article <- subset(sales_data_long, sales_data_long$`Code Article` == sales_data_long[i,"Code Article"])
sales_ts <- ts(sales_data_article$value, start = c(2010,6), frequency = 12)
arima_fit <- auto
arima_forecast <- forecast(arima_fit, h = 18)
print(arima_forecast)
print("Article: ", Code article[i])
}
With this code, RStudio gives me the following error : "Error: id variables not found in data: Code Article"
Currently, I am not interested in generating any plots or outputs. My main focus is on identifying the appropriate format for my data.
Do I need to modify my CSV file and separate each column using "," or ";"? Or, can I keep my data in its current format and make adjustments in the code instead?