Can someone please shed some light on this message? I am not sure whether it is an error or just informational.

Sample code and output (I have verbose=TRUE but it does not seem to matter):

Define the data.table:

DT <- data.table(a=1:10, b=letters[1:2])

This code works as expected:

DT[, a, by=b, verbose=TRUE]
Finding groups (bysameorder=FALSE) ... done in 0secs. bysameorder=FALSE and o__ is length 10
Detected that j uses these columns: a 
Optimization is on but j left unchanged as 'a'
Starting dogroups ... done dogroups in 0 secs
    b  a
 1: a  1
 2: a  3
 3: a  5
 4: a  7
 5: a  9
 6: b  2
 7: b  4
 8: b  6
 9: b  8
10: b 10

This code seems to work, but it prints a message at the end that I do not understand:

DT[, cat("\nSome text"), by=b]
Finding groups (bysameorder=FALSE) ... done in 0secs. bysameorder=FALSE and o__ is length 10
Detected that j uses these columns: <none> 
Optimization is on but j left unchanged as 'cat("\nSome text")'
Starting dogroups ... 
Some text
Some textdone dogroups in 0 secs
Empty data.table (0 rows) of 1 col: b

Why do I get the `Empty data.table (0 rows) of 1 col: b` message, and what is it telling me? There is no reference to `b` in `i` or `j`, and every row in the table has a value for `b`. The code does seem to do what was requested, but I would like to understand whether there is a problem before applying it to my real dataset (which has several thousand rows, so the result cannot easily be verified by hand).

Setting a key for DT does not help.

A quick search on Google and on this site turned up several cases with the same message, but they do not seem relevant to this simple case (they all involve operating on empty tables with 0 rows, or something in `i` and !NA's).

Thanks!

I am using the latest development version of data.table 1.8.9 from r-forge (All 985 tests in inst/tests/tests.Rraw completed ok).

R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bit64_0.9-2      bit_1.1-10       xts_0.9-5        zoo_1.7-10       nlme_3.1-110     hexbin_1.26.2    lattice_0.20-15  ggplot2_0.9.3.1  reshape_0.8.4    plyr_1.8         foreign_0.8-54  
[12] data.table_1.8.9

loaded via a namespace (and not attached):
 [1] colorspace_1.2-2   dichromat_2.0-0    digest_0.6.3       gtable_0.1.2       labeling_0.2       MASS_7.3-27        munsell_0.4.2      proto_0.3-10       RColorBrewer_1.0-5
[10] reshape2_1.2.2     scales_0.2.3       stringr_0.6.2      tools_3.0.1     
Peter
  • What would you expect to see as the result? You gave it some string that it cannot do anything with... see `?data.table` and specifically the description for what to pass to `j`. – Justin Jul 16 '13 at 17:06
  • I expect it (in this example) to print out the string once for every value of b, which it does. The actual use case is more complex but this is the simplest code I found to produce the message. – Peter Jul 16 '13 at 17:19
  • The function `cat` doesn't return anything. It just writes to the command line. The return of the function you've called is a data table of zero rows and one column, named `b` since you grouped by it. It is identical to calling `DT[, 'foo', by=b]` except there is "nothing" in the space where `foo` is... If you care to explain your more complex example, in a separate question, I think you'll find you get a much better answer. – Justin Jul 16 '13 at 17:24
  • Yes, writing to the command line is what I expect it to do. I was thinking along the lines of the following snippet from the help on data.table: "The j expression does not have to return data; e.g., `DT[,plot(colB,colC),by=colA]`". Using `require(stats); DT[, plot(cars), by=b]` produces the two plots as expected and also gives the same message as above. The actual use case is not that interesting, and after some experimenting I found that using `DT[, paste0("Some text"), by=b]` does not produce the message. – Peter Jul 16 '13 at 17:41
  • Correct, because `paste(...)` has a return value. – Justin Jul 16 '13 at 17:47
  • (Sigh) Need to go sleep I guess, I am not seeing the obvious. I was thinking the _Empty data.table (0 rows) of 1 col:_ is an (error) message while it simply is the **result** of the operation within DT. This is what you get when `j` is `NULL`. Thank you, Justin. – Peter Jul 16 '13 at 18:22
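
For anyone landing here later, the conclusion of the comments can be checked with a short sketch. It reuses the `DT` from the question; the names `cat_value`, `res_null` and `res_value` are made up for illustration, and the `invisible()` wrapper at the end is only one possible way to keep the empty result from printing, not something suggested in the thread.

```r
library(data.table)

DT <- data.table(a = 1:10, b = letters[1:2])

# cat() writes to the console and returns NULL (invisibly).
cat_value <- cat("\nSome text")
is.null(cat_value)   # TRUE

# j evaluates to NULL for every group, so no result rows are produced;
# what comes back is a 0-row data.table holding only the grouping column b.
res_null <- DT[, cat("\nSome text"), by = b]
nrow(res_null)       # 0
names(res_null)      # "b"

# paste0() returns a value, so each of the two groups contributes one row.
res_value <- DT[, paste0("Some text"), by = b]
nrow(res_value)      # 2

# Assumption (not from the thread): if only the side effect matters,
# invisible() stops the empty result from being auto-printed at the console.
invisible(DT[, cat("\nSome text"), by = b])
```

So the "Empty data.table" line is just the printed return value of the call, which matches what the last comment above concludes.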
