
I encountered a weird error message in data.table

I modified a data.table using := and it worked without any error. When I put the same code into a function, the following error message appeared.

Error in `:=`(date, as.Date(as.character(date), "%Y%m%d") - 1) : 
:= and `:=`(...) are defined for use in j, once only and in particular ways. See     help(":="). Check is.data.table(DT) is TRUE.

Here's a reproducible example:

testdat <- data.table(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
# The single line command works fine. 
testdat[, date := as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
# But if I wrote them into a function, it failed. 
# ( In this case, it worked as well.. So I got totally lost. ) 
test2 <- data.frame(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
readdata <- function(fn){
      DT <- data.table(fn)
      DT[, date:= as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
      return(DT)
}

To describe the problem better, here is part of my original code, so you can see where it goes wrong.

readdata <- function(fn){
   DT <- fread(fn, sep=",")
   # DT <- fread("1202.txt")
   setnames(DT, paste0("V",c(1:12)), column_names)
   # Modification on date
   setkey(DT,uid)
   DT[,date := as.Date(as.character(date),"%Y%m%d") - 1][, ignore:= NULL] #ignore is the name of one column
...}

I have a list of txt files, and I want to do the same calculation for each of them. The first step is to read each file with fread and process them one by one. Suppose I want to do the calculation on the "1202.txt" file. If I start from DT <- fread("1202.txt") and then proceed line by line, the error does not appear.

If I call readdata("1202.txt") instead, the error message appears. Strangest of all, I have used readdata before without any errors.

So what's going on here? Any suggestions? Thanks.

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.8.11

loaded via a namespace (and not attached):
[1] tools_3.0.2

EDIT

After some trials, I found that the code works if I modify it as follows:

   readdata <- function(fn){
   DT <- fread(fn, sep=",")
   DT <- data.table(DT) ## Just add this line compared to the original one.
   # DT <- fread("1202.txt")
   setnames(DT, paste0("V",c(1:12)), column_names)
   # Modification on date
   setkey(DT,uid)
   DT[,date := as.Date(as.character(date),"%Y%m%d") - 1][, ignore:= NULL] #ignore is the name of one column
...}

So is the error due to fread? The result of fread should already be a data.table. Why do I need data.table(DT) to convert it?
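The check that the error message suggests can be run directly on the result of fread. The following is only a minimal sketch: the temporary CSV file and its two columns are made up for illustration and are not the original data.

```r
library(data.table)

# Hypothetical sample file, written only for this check.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(ID = 1:3, date = c(20130101, 20130101, 20130101)),
          tmp, row.names = FALSE)

DT <- fread(tmp)

# fread() is documented to return a data.table, so this should be TRUE.
print(is.data.table(DT))
print(class(DT))

# := modifies DT by reference; with a correctly installed data.table this
# is expected to run without the "defined for use in j" error.
DT[, date := as.Date(as.character(date), "%Y%m%d") - 1]
print(DT)

unlink(tmp)
```

If is.data.table(DT) is TRUE here and the := call still fails, the data itself is probably not the cause, and the installed package is worth checking.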

EDIT

Thanks for your attention. Here's an update from Feb 4th, 2014.

I first uninstalled my 1.8.11 and followed Matt's instructions: I installed 1.8.10 from CRAN again and then followed his code step by step. It ran without any error.

Then I uninstalled 1.8.11, and then tried to install 1.8.11 again using the precompiled zip file.

As usual, there was a warning message:

> install.packages("~/Desktop/data.table_1.8.11.zip", repos = NULL)
Warning in install.packages :
package ‘~/Desktop/data.table_1.8.11.zip’ is not available (for R version 3.0.2)
Installing package into ‘C:/Users/James/R/win-library/3.0’ (as ‘lib’ is unspecified)
package ‘data.table’ successfully unpacked and MD5 sums checked

> require(data.table)
Loading required package: data.table
data.table 1.8.11  For help type: help("data.table")

It seems that the warning message is wrong, since the package loads without problems. This time the whole process ran fine. Thanks to Matt, Arun, and everyone else for your patience. I'm a beginner with data.table, and your kindness is really appreciated.

There's one more thing, which I have already reported in this link and which is still unsolved.

> ?melt.data.table
No documentation for ‘melt.data.table’ in specified packages and libraries:
you could try ‘??melt.data.table’

It's really a pity. Any ideas?

I reported my sessionInfo in that link. I'm using Windows 8.1 64-bit.

moodymudskipper
Bigchao

2 Answers


After reinstalling data.table (I tried both v1.8.10 and v1.8.11) and starting a new R session, the problem was solved.

It turns out my problem was caused by a 5 month old development version being installed.

The data.table homepage was slightly misleading:

Last recommended snapshot precompiled for Windows: v1.8.11 rev931 04 Sep 2013

The homepage has been improved and now reads:

install.packages("data.table", repos="http://R-Forge.R-project.org")
Or, if that fails, the last precompiled .zip for Windows copied to this homepage may suffice: v1.8.11 rev1110 04 Feb 2014
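Because the old snapshot and the current build both report v1.8.11, the version number alone may not show which one is installed. A minimal sketch of how to inspect the installed copy, using only base R and utils functions; what counts as "too old" is left to the reader:

```r
library(data.table)

# Version string of the data.table build that R actually loaded.
installed <- packageVersion("data.table")
print(installed)

# "Built" records the R version and date the installed copy was built
# with; an old date here hints at a stale snapshot.
print(packageDescription("data.table", fields = c("Version", "Built")))

# Library path the package was found in, in case several libraries exist.
print(find.package("data.table"))
```

Running this in a fresh R session, before and after reinstalling, should make it clear whether the reinstall actually replaced the old copy.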

Thanks to all of you for your valuable answers and comments.

joran
Bigchao
  • The homepage was misleading you as well, though i.e. the answer is more specific than just reinstalling (you had to install from the right place). Anyway, I'll vote to close this question as it's more likely to mislead (given the title) than help others in future. One of the close reasons is specific to this situation. Thanks for your patience on this one. – Matt Dowle Feb 04 '14 at 12:16
  • Thanks Matt! I posted another answer about ?melt.data.table in more detail. Please close that too. Thanks again :) – Bigchao Feb 04 '14 at 12:21
  • @MattDowle this one got me today, too! – Statwonk Jun 04 '14 at 17:09

(This is too long for a comment, so I'm posting it as an answer.) I can't reproduce your error. (Maybe some data.table experts can give you a better explanation.) This works fine for me:

readdata <- function(fn){
  DT <- fread(fn)   ## no need to set sep here, fread guesses it
  DT[, date:= as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
  return(DT)
}

write.csv(test2,'test2.csv',row.names=FALSE)  ## fread works better without rownames
readdata('test2.csv')
    ID       date
 1:  1 2012-12-31
 2:  2 2012-12-31
 3:  3 2012-12-31
 4:  4 2012-12-31
 5:  5 2012-12-31
 6:  6 2012-12-31
 7:  7 2012-12-31
 8:  8 2012-12-31
 9:  9 2012-12-31
10: 10 2012-12-31

[Edit from Matt] I can't reproduce either. As per comment, here is precisely what I did. How does yours differ?

$ R
R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> require(data.table)
Loading required package: data.table
data.table 1.8.10  For help type: help("data.table")
> test2 <- data.frame(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
> test2
   ID     date      Number
1   1 20130101  0.26937712
2   2 20130101  0.72113244
3   3 20130101 -0.66086356
4   4 20130101  0.47507096
5   5 20130101  0.69400777
6   6 20130101 -1.26948436
7   7 20130101  1.75919781
8   8 20130101 -0.05306206
9   9 20130101  1.59880358
10 10 20130101  0.69531516
> write.csv(test2,'test2.csv',row.names=FALSE)
> readdata <- function(fn){
+   DT <- fread(fn)
+   DT[, date:= as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
+   return(DT)
+ }
> readdata("test2.csv")
    ID       date
 1:  1 2012-12-31
 2:  2 2012-12-31
 3:  3 2012-12-31
 4:  4 2012-12-31
 5:  5 2012-12-31
 6:  6 2012-12-31
 7:  7 2012-12-31
 8:  8 2012-12-31
 9:  9 2012-12-31
10: 10 2012-12-31
> 
Matt Dowle
agstudy
  • Hey, agstudy! Thanks for working on this. Which version do you use, 1.8.11 from R-forge or 1.8.10 from CRAN? The same code worked fine previously and failed today. It's really weird. And I don't know why adding `data.table(DT)` would be the solution. Thanks again! – Bigchao Feb 02 '14 at 12:49
  • I have 1.8.11 from R-forge. – agstudy Feb 02 '14 at 12:51
  • @Bigchao, are you sure your R-installation and data.table installation are done properly? Given that you also had [another issue here](http://stackoverflow.com/q/21219222/559784)... – Arun Feb 02 '14 at 13:28
  • @Arun, thanks for your time. I'm not sure whether my data.table installation was done properly. I tried to reinstall several times, still with the same result. – Bigchao Feb 02 '14 at 13:33
  • @Bigchao Works fine for me using v1.8.10 from CRAN. Try starting from a fresh R session and pasting exactly what you see without cutting anything. I'll edit into agstudy's study my output so you can see what we need. How does yours differ? – Matt Dowle Feb 02 '14 at 17:13
  • @MattDowle Sure, thanks for your comment. I restarted the R session and still got ``Error in `:=`(date, as.Date(as.character(date), "%Y%m%d") - 1) : := and `:=`(...) are defined for use in j, once only and in particular ways.`` As @Arun said, I think my installation of data.table 1.8.11 was not correct, but I don't know what went wrong. I reported an error [here](http://stackoverflow.com/questions/21219222/documentation-about-melt-data-table-in-data-table). Could you please take a look? I tried several times, all with the same result. – Bigchao Feb 03 '14 at 03:50
  • @Bigchao You're going wrong by not being precise enough and not doing as I've asked. Please read my comment above again. I need you to actually do this. Type R at the prompt, paste your commands and send me the output and show it in the way I have in my edit. Have you installed v1.8.10 from CRAN to match my output? I mentioned v1.8.10 in my comment, and it appears in the output, but you're still talking about v1.8.11. – Matt Dowle Feb 03 '14 at 09:34
  • @MattDowle Thanks for your patience, Matt. I have now followed your comment, and the output is the same as you reported. I also edited my original post with the details, including my unsolved problem with the installation of 1.8.11. Could you take a look? Thanks! – Bigchao Feb 04 '14 at 08:56
  • @Bigchao Good. Where is that .zip downloaded from? The usual way to install from R-Forge is `install.packages("data.table", repos="http://R-Forge.R-project.org")`. – Matt Dowle Feb 04 '14 at 10:52
  • @Bigchao The .zip from the data.table homepage labelled "v1.8.11 rev931 04 Sep 2013"? That is quite old now and would explain a lot. I've improved the wording on the homepage now so it's clearer that is just a copy placed (by me) on the homepage for when R-Forge is down. I've put the correct command on the homepage too, to head off that confusion. – Matt Dowle Feb 04 '14 at 11:17
  • @Arun See comments above. Hoping this clears up these two questions! – Matt Dowle Feb 04 '14 at 11:18
  • @MattDowle Matt, I tried immediately after seeing your comments. I first installed the precompiled one for Windows, and then installed with `install.packages("data.table", repos="http://R-Forge.R-project.org")`. It works! Thanks!! – Bigchao Feb 04 '14 at 11:43
  • @Bigchao Great. So `?melt.data.table` works as well now? – Matt Dowle Feb 04 '14 at 11:52
  • @Bigchao, if both of them work, then probably you could answer yourself and accept it? Matt, thanks for the tag. Understood the issue now.. – Arun Feb 04 '14 at 11:59
  • @Arun Glad it's sorted. I'm thinking best to close it. It's unlikely to help anyone in future (especially given the title). – Matt Dowle Feb 04 '14 at 12:19
  • @MattDowle, agreed. Voted to close. – Arun Feb 04 '14 at 12:20
  • @MattDowle I actually just had this exact problem. Let me see if I can put a reproducible example together. Using `data.table_1.9.2` on OS X (Mavericks). Works when I hop inside the function, but same error as OP when I call the function. – Statwonk Jun 20 '14 at 05:58