
I'm very new to R and seem to be having an issue with the data that I am trying to analyse.

Firstly, I am following the guide to CausalImpact detailed here: https://google.github.io/CausalImpact/CausalImpact.html#installing-the-package

In the example above, a sample data set is created. I have this data already, so I have imported it into a variable 'mydata' using read.csv.

This worked, and I now have a dataset of 2 columns, labelled 'Control' and 'Test', with numerical data in each column. The 'Test' data is basically a copy of the control until halfway through, where I added 10% to fake an uplift.

I then generated my pre.period and post.period.

To run the analysis, I use:

CausalImpact(mydata, pre.period, post.period)

This returns an error message:

Error in `-.default`(y, y.mu) : non-numeric argument to binary operator
In addition: Warning message:
In mean.default(coredata(x), ...) : argument is not numeric or logical: returning NA

Any help much appreciated.

Example as requested:

pre.period <- c(1, 31)
post.period <- c(32, 70)
impact <- CausalImpact(mydata, pre.period, post.period)

The above code works on the created dataset below:

mydata = matrix(1:140, ncol = 2)

So I assume there is an issue with the CSV data I have used - I just don't know what.

To import the CSV data I used:

mydata = read.csv("/location/myfile.csv")

Additional Info

dput(head(mydata,10))

structure(list(Control = structure(c(4L, 6L, 29L, 44L, 5L, 8L, 7L, 10L, 11L, 36L), .Label = c("1,016", "1,172", "1,232", "421", "428", "432", "433", "450", "460", "463", "465", "466", "474", "476", "477", "483", "487", "491", "495", "510", "515", "517", "522", "526", "535", "536", "539", "543", "549", "555", "567", "569", "570", "572", "577", "585", "590", "592", "593", "595", "596", "600", "602", "603", "608", "610", "623", "626", "637", "639", "640", "690", "701", "703", "717", "735", "816", "831", "890", "909", "924"), class = "factor"), Test = c(421, 432, 549, 603, 428, 450, 433, 463, 465, 585)), .Names = c("Control", "Test" ), row.names = c(NA, 10L), class = "data.frame")

str(mydata)

'data.frame': 66 obs. of 2 variables:

$ Control: Factor w/ 61 levels "1,016","1,172",..: 4 6 29 44 5 8 7 10 11 36 ...

$ Test : num 421 432 549 603 428 450 433 463 465 585 ...

Answer

I had imported the data from a CSV that originally had 3 columns: Text, Num, Num.

I deleted the Text column, but the column type didn't change to num.
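The `str()` output in the question shows labels such as "1,016" in `Control`, so thousands separators are a likely reason the column was not read as numeric. Below is a minimal sketch of how to check which columns are affected. The inline `csv_text` is a made-up stand-in for the real file, and `stringsAsFactors = TRUE` is set only to reproduce the factor column seen in the question (since R 4.0.0 the default is `FALSE`, which would give a character column instead).

```r
# Hypothetical stand-in for the imported CSV: the quoted value "1,016"
# contains a thousands separator, so the column cannot be parsed as numbers.
csv_text <- 'Control,Test
"421",421
"1,016",1117.6
"432",432'
mydata <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Report the class of every column; CausalImpact needs all of them numeric.
col_classes <- sapply(mydata, class)
print(col_classes)

# Names of the columns that still need converting.
non_numeric <- names(mydata)[!sapply(mydata, is.numeric)]
print(non_numeric)
```

Any column listed in `non_numeric` has to be converted before the data is passed to `CausalImpact()`.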

double_dd
  • Please include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610), so we are able to reproduce your problem. Based on the error message, my first guess would be that one of your variables isn't in numeric format. – Jaap Feb 08 '17 at 08:25
  • Could you include the following info as well: (1) the output of `dput(head(mydata,10))` and (2) the output of `str(mydata)` – Jaap Feb 08 '17 at 09:47
  • As I expected: your `Control` variable is a factor and not a numeric variable; you need to convert it to numeric with `as.numeric(as.character(mydata$Control))` – Jaap Feb 08 '17 at 10:46
  • you'll need to remove the comma first : `as.numeric(as.character(gsub(",", "", mydata$Control)))` – Cath Feb 08 '17 at 10:55
  • Thanks @Jaap I now have my imported CSV working correctly. I believe this was a hangover from a column that originally contained text. – double_dd Feb 08 '17 at 11:15
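Putting the two comments together, the conversion might look like the sketch below. The small `mydata` frame is a hypothetical sample that imitates the structure shown in the question, not the real file.

```r
# Hypothetical sample mirroring the structure of the question's data.
mydata <- data.frame(
  Control = factor(c("421", "432", "1,016", "603")),
  Test    = c(421, 432, 1117.6, 603)
)

# Strip the thousands separator, then convert the text to numbers.
# as.character() comes first so the labels are converted, not the
# factor's internal level codes.
mydata$Control <- as.numeric(gsub(",", "", as.character(mydata$Control)))

str(mydata)

# Both columns should now be numeric, which is what CausalImpact expects.
stopifnot(all(sapply(mydata, is.numeric)))
```

Calling `as.numeric()` directly on the factor would return the level codes (4, 6, 29, ... in the question's `dput` output) rather than the values, and leaving the commas in would produce `NA` for entries such as "1,016".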
