Make multiple column value to one row with group by

Question

I am using the data.table package to speed up some summary statistic collection on a data set.

I'm curious if there's a way to group by a column and collect all of each group's values into a single row. My data looks like this:

Date                      Value  
2016-12-11                 36
2016-12-11                 40
2016-12-12                 17
2016-12-12                 41
2016-12-12                 27
...
2017-2-21                  22
2017-2-21                  53
2017-2-21                  19
2017-2-21                  20
2017-2-21                  32

Can I get the data like this:

Date                              Value
2016-12-11                      c(36, 40)
2016-12-12                      c(17, 27, 41)
2017-2-21                       c(19, 20, 22, 32, 53)

Note:

The number of rows differs from date to date, and that is what is tripping me up.

I don't really see a lot of benefit for this kind of storage. It's certainly possible, but why? — thelatemail, Mar 14 '17 at 03:28
Special requirement. It is just an intermediate result; the final result will not look like that. Thank you. – lojunren, Mar 14 '17 at 03:52
@thelatemail - it's also being used for [`simple features`](https://github.com/edzer/sfr) (the 'new' format in R for spatial data) — SymbolixAU, Mar 14 '17 at 05:28

score 3 · Accepted Answer · answered Mar 14 '17 at 03:17

We can group by 'Date' and either collapse the values of each group into a single comma-separated string:

library(data.table)
setDT(df1)[, .(Value = toString(Value)), by = Date]

or store the values of each group as a list in the 'Value' column:

setDT(df1)[, .(Value = list(Value)), by = Date]
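As a minimal sketch of the list-column approach, the example below builds a small `df1` from the first five rows shown in the question (the data frame itself is an assumption, since the question does not show how the data was created) and adds `sort()` so each vector matches the ordering in the desired output. Because every cell is its own vector, groups of different sizes are not a problem.

```r
library(data.table)

# Assumed sample data, copied from the first rows shown in the question
df1 <- data.frame(
  Date  = c("2016-12-11", "2016-12-11", "2016-12-12", "2016-12-12", "2016-12-12"),
  Value = c(36, 40, 17, 41, 27)
)

# One row per Date; each cell of Value holds a sorted numeric vector
res <- setDT(df1)[, .(Value = list(sort(Value))), by = Date]

# Pull out the vector stored for the second date (2016-12-12)
second_date_values <- res$Value[[2]]

# Number of values collected for each date
values_per_date <- lengths(res$Value)
```

An individual vector is retrieved with `[[`, as in `res$Value[[2]]`, and `lengths()` reports how many values each date contributed.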

akrun

I don't think there's a specific need for `setDT`, as the OP specifies they are already using `data.table` – SymbolixAU Mar 14 '17 at 03:44
@SymbolixAU I don't want to fight with you regarding `setDT`. It is for other users that don't know how a data.frame is converted to data.table – akrun Mar 14 '17 at 03:45
