I want to count the amount of different values in a combination of columns in a data.table within a certain group.
Simple example: the data looks like
room | object
-----------------------
kitchen | dishwasher
kitchen | oven
livingRoom | sofa
Now I want to know: How many different objects are in every room? The answer is simple:
library(data.table)
dt = data.table(room = c("kitchen", "kitchen", "livingRoom"), object = c("dishwasher", "oven", "sofa"))
dt[, .(amount = uniqueN(object)), by=room]
However, the situation is slightly more tricky if the object is described by multiple columns. For example: The objects have a color and the data looks like this:
room | object | color
-------------------------------
kitchen | dishwasher | white
kitchen | oven | white
livingRoom | toy | red
livingRoom | toy | red
livingRoom | toy | green
Now I want to know: How many distinct object-color combinations are there is every room? I.e. the answer I want is:
room | amount
-------------------
kitchen | 2
livingRoom | 2
I tried to do the natural thing: just write more columns in uniqueN
but it did not work:
dt = data.table(room = c("kitchen", "kitchen", "livingRoom", "livingRoom", "livingRoom")
,object = c("dishwasher", "oven", "toy", "toy", "toy")
,color = c("white", "white", "red", "red", "green"))
dt[, .(amount = uniqueN(object, color)), by=room] # error
dt[, .(amount = uniqueN(.(object, color))), by=room] # error
Surely I could just put 'object' and 'color' together into one column and then just use uniqueN
on the single combination column but that is the poor mans choice for something I am sure is there but I can't figure out...
Anyone knows how to do this?
THX