
We have a multi-column CSV file of the following format:

id1,id2,id3,id4
1,2,3,4
,,3,4,6
2,,3,4

The missing values should be treated as 0 when the CSV is read column by column. This is the script we currently have:

data <- read.csv("data.csv")

dfList <- lapply(seq_along(data), function(i) {
    seasonal_per <- msts(data[, i], seasonal.periods=c(24,168))
    best_model <- tbats(seasonal_per)
    fcst <- forecast.tbats(best_model, h=24, level=90)
    dfForec <- print(fcst)
    result <- cbind(0:23, dfForec[, 1])
    result$id <- names(df)[i]

    return(result[c("id", "V1", "V2")])
})

finaldf <- do.call(rbind, dfList)
write.csv(finaldf, file = "out.csv", row.names = FALSE)

This script breaks when the CSV has missing values, giving the error `Error in tau + 1 + adj.beta + object$p : non-numeric argument to binary operator`. How do we tell R to assume a 0 when it encounters a missing value?

I tried the following:

library("forecast")
D <- read.csv("data.csv",na.strings=".")
D[is.na(D)] <- 0

dfList <- lapply(seq_along(data), function(i) {
  seasonal_per <- msts(data[, i], seasonal.periods=c(24,168))
  best_model <- tbats(seasonal_per)
  fcst <- forecast.tbats(best_model, h=24, level=90)
  dfForec <- print(fcst)
  result <- cbind(0:23, dfForec[, 1])
  result$id <- names(df)[i]

  return(result[c("id", "V1", "V2")])
})

finaldf <- do.call(rbind, dfList)
write.csv(finaldf, file = "out.csv", row.names = FALSE)

but it gives the following error:

Error in data[, i] : object of type 'closure' is not subsettable

learnerX
  • Possible duplicate of [Replace all NA with FALSE in selected columns in R](http://stackoverflow.com/questions/7279089/replace-all-na-with-false-in-selected-columns-in-r) – Weihuang Wong Sep 01 '16 at 02:03
  • Start a new `R` session and try again - your new error is regarding `data` but "what I tried" refers to `D`. Print the object after the first block of code and see if the missing value -> 0 problem is resolved. – Jonathan Carroll Sep 01 '16 at 02:30
  • Possible duplicate of [Time series prediction / forecast with TBATS failing with 'Error in tau + 1 + adj.beta + object$p'](http://stackoverflow.com/questions/38334600/time-series-prediction-forecast-with-tbats-failing-with-error-in-tau-1-ad) – Jonathan Carroll Sep 01 '16 at 02:59

2 Answers

If you're certain that every NA value should be 0, and that's the only issue, then replace the NAs right after reading the file:

data <- read.csv("data.csv")
data[is.na(data)] <- 0
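
A minimal sketch of that replacement, using a small inline CSV in place of `data.csv`. The inline rows and the name `D` are illustrative assumptions; `D` is used rather than `data` so the object cannot be confused with R's built-in `data()` function. Note that, by default, `read.csv` only turns blank fields into `NA` in numeric and logical columns, so blank fields in character columns would stay as empty strings.

```r
# Illustrative stand-in for data.csv (hypothetical rows)
csv_text <- "id1,id2,id3,id4
1,2,3,4
,,3,4
2,,3,4"

# Blank numeric fields are read as NA
D <- read.csv(text = csv_text)
n_missing_before <- sum(is.na(D))

# Replace every NA in the data frame with 0
D[is.na(D)] <- 0
n_missing_after <- sum(is.na(D))

print(D)
```

Printing `D` (or checking `sum(is.na(D))`) before running the forecasting loop is a quick way to confirm that no missing values remain.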
Jonathan Carroll
  • I tried that and got the following error: `Error in data[is.na(data)] <- 0 : object of type 'closure' is not subsettable In addition: Warning message: In is.na(data) : is.na() applied to non-(list or vector) of type 'closure'` – learnerX Sep 01 '16 at 02:17
  • At a guess, you haven't actually loaded `data` into your workspace (or at least correctly), in which case `R` is trying to use the function `data()`. Try the `read.csv` again then print the `data` object. – Jonathan Carroll Sep 01 '16 at 02:20
  • I changed every instance to `D` and restarted the R session. Now I have `D <- read.csv("data.csv"); D[is.na(D)] <- 0`, followed by the rest of the script starting from `dfList....` However, I still get the following error: `Error in tau + 1 + adj.beta + object$p : non-numeric argument to binary operator`. I checked `D`, and all missing values now appear to be 0. Why do I still get this error? – learnerX Sep 01 '16 at 02:56
  • That's a different problem, and a solution is here: http://stackoverflow.com/questions/38334600/time-series-prediction-forecast-with-tbats-failing-with-error-in-tau-1-ad but for now if this solves the question as asked, please consider accepting it and starting a new question if you still have issues. – Jonathan Carroll Sep 01 '16 at 02:59
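
The `closure` error discussed in these comments can be reproduced with a short sketch, assuming a fresh R session in which no object named `data` has been created. The inline CSV and the variable names are illustrative only. When `data` has not been assigned, the name resolves to the function `utils::data()`, and a function cannot be subset with `[`.

```r
# With no user-defined `data`, the name refers to the function utils::data()
data_is_function <- is.function(data)

# Subsetting a function fails; capture the message instead of stopping
msg <- tryCatch(data[, 1], error = function(e) conditionMessage(e))
print(msg)

# Once the data frame is assigned to a name, subsetting that name works
D <- read.csv(text = "id1,id2\n1,2\n,3")
first_col <- D[, 1]
print(first_col)
```

This is why the script must use the same name (`D` here) both when reading the file and inside the `lapply` call.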

If you're working in the tidyverse (or just with dplyr), this option works well:

library(tidyverse)
data <- read_csv("data.csv") %>% replace(is.na(.), 0)
tgraybam