
Language: R

Package: data.table

When I group by several columns with data.table's `by` to compute, say, the number of observations in each group, any combination with zero entries is missing from the output. I would still like to see those empty combinations, reported as zero. Is there a way to do this?

# Example:
library(data.table)
DT = data.table(A = c(1,1,1,1,2,2,2), B = c(FALSE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE))

   A     B
1: 1 FALSE
2: 1 FALSE
3: 1  TRUE
4: 1 FALSE
5: 2  TRUE
6: 2  TRUE
7: 2  TRUE

DT[, j = .(.N), by = .(A,B)]

   A     B N
1: 1 FALSE 3
2: 1  TRUE 1
3: 2  TRUE 3

As you can see above, the value 2 in A has no corresponding FALSE observation in column B, so the grouped output contains no row for that combination.
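To make the gap explicit, here is a minimal sketch that lists the combinations of A and B that never occur in DT. It assumes the example table above; the names `all_combos` and `absent` are only illustrative. It builds every combination with `CJ()` and keeps the ones without a match in DT via an anti-join.

```r
library(data.table)

DT <- data.table(A = c(1, 1, 1, 1, 2, 2, 2),
                 B = c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE))

# All combinations of the distinct values of A and B (illustrative name)
all_combos <- CJ(A = DT$A, B = DT$B, unique = TRUE)

# Anti-join: keep only the combinations that have no matching row in DT
absent <- all_combos[!DT, on = .(A, B)]
print(absent)
```

For the example data this should leave a single row, A = 2 with B = FALSE, which is exactly the group that the `by` output omits.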

Edit:

It turns out a similar question was asked and answered before.

setkey(DT, A, B)
DT[CJ(A, B, unique = TRUE), j = .(.N), by = .EACHI]

   A     B N
1: 1 FALSE 3
2: 1  TRUE 1
3: 2 FALSE 0
4: 2  TRUE 3
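A possible variant, sketched here under the assumption that you would rather not key (and therefore reorder) DT: pass the join columns through `on=` instead of calling `setkey()`. The columns of the `CJ()` result are named explicitly so the join does not depend on how a given data.table version auto-names them; `res` is just an illustrative name.

```r
library(data.table)

DT <- data.table(A = c(1, 1, 1, 1, 2, 2, 2),
                 B = c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE))

# Inside DT[...], A and B refer to DT's own columns, so CJ() builds
# every combination of their distinct values.
# by = .EACHI evaluates .N once per row of that lookup table, which
# gives 0 for combinations that have no match in DT.
res <- DT[CJ(A = A, B = B, unique = TRUE), on = .(A, B), .N, by = .EACHI]
print(res)
```

This should give the same four rows as the keyed version while leaving the row order of DT untouched.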
ilyas
  • Like `melt(dcast(dt, A~B), 1)`? – lukeA Apr 25 '16 at 14:47
  • `DT[CJ(A, B, unique = TRUE), do_some_stuff, on=c("A","B"), by=.EACHI]` probably – Frank Apr 25 '16 at 15:04
  • You can't write `data.table(Person = A)` since A is not an object. If you want the letter, put quotes around it or subset `LETTERS` or `letters`. Also, you might want to show what your desired output looks like. – Frank Apr 25 '16 at 17:05
  • Anyway, having two tables like this seems like a good idea to me. I think you're looking for `DT[CountryKey, on=c(Birth = "Country"), .N, by=.EACHI]`. There might be a dupe for that, too. – Frank Apr 25 '16 at 17:07
  • Here it is, in case someone wants to change the marked dupe: http://stackoverflow.com/q/25869543/ Fyi, @ilyas, you could've just posted a new question instead of changing an old one. – Frank Apr 25 '16 at 17:09
  • OK I will make separate question. Thank you. – ilyas Apr 25 '16 at 17:25
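The `melt(dcast(...))` idea from the first comment can be spelled out roughly as follows. This is only a sketch of that suggestion, assuming the example table from the question; `wide` and `long` are illustrative names, and the aggregation function and value column are stated explicitly rather than left to dcast's defaults.

```r
library(data.table)

DT <- data.table(A = c(1, 1, 1, 1, 2, 2, 2),
                 B = c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE))

# Wide table of counts: one row per A, one column per value of B.
# Cells with no observations are filled by calling length() on an
# empty vector, i.e. with 0.
wide <- dcast(DT, A ~ B, fun.aggregate = length, value.var = "B")

# Back to long format: one row per (A, B) combination with its count N
long <- melt(wide, id.vars = "A", variable.name = "B", value.name = "N")
print(long)
```

One thing to keep in mind with this route is that B comes back as a factor built from the column names ("FALSE", "TRUE") rather than as a logical column, so it may need converting before further use.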
