spread values across undefined columns

Question

I have a data.frame of the following structure:

   value group1 group2
1:     1      A     a1
2:     2      A     a2
3:     3      A     a3
4:     4      B     b1
5:     5      B     b2

I would like to spread it out to this:

   group1 var1 var2 var3 value1 value2 value3
1:      A   a1   a2   a3      1      2      3
2:      B   b1   b2   NA      4      5     NA

So basically there is an unspecified number of varX columns based on the number of unique group2 in each group1, and then an accompanying valueX column as well.

Is there a good way to accomplish this? spread from tidyr doesn't quite do what I want as I understand it. Thanks!

...

Here you can build the first data.frame:

data.frame(value=1:5, group1=c("A","A","A","B","B"), group2=c("a1","a2","a3","b1","b2"))

score 2 · Accepted Answer · answered Oct 11 '16 at 16:47

We need to create a sequence column that numbers the rows within each group1. With the development version of data.table, this can be done with the rowid function. Because dcast from data.table accepts multiple value.var columns, the whole reshape fits in a single line.

library(data.table)  # v1.9.7+
# df1 is the data.frame from the question
dcast(setDT(df1), group1 ~ rowid(group1), value.var = c("value", "group2"), sep = "")
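For reference, here is a self-contained sketch of the same approach, assuming the question's data is stored as df1 and that the installed data.table provides rowid (1.9.7+ as noted above). The renaming and reordering steps are an addition for illustration only: dcast names the cast columns after the value.var entries (value1, group21, ...), so the group2 columns are renamed to var1, var2, ... to match the layout requested in the question.

```r
library(data.table)  # needs a version that provides rowid()

# Rebuild the data from the question; the name df1 is an assumption
df1 <- data.frame(value  = 1:5,
                  group1 = c("A", "A", "A", "B", "B"),
                  group2 = c("a1", "a2", "a3", "b1", "b2"),
                  stringsAsFactors = FALSE)

# rowid(group1) numbers the rows within each group1: 1, 2, 3, 1, 2
wide <- dcast(setDT(df1), group1 ~ rowid(group1),
              value.var = c("value", "group2"), sep = "")

# Rename group21, group22, ... to var1, var2, ... (illustrative step)
old <- grep("^group2", names(wide), value = TRUE)
setnames(wide, old, sub("^group2", "var", old))

# Put the var columns before the value columns, as in the question
setcolorder(wide, c("group1",
                    grep("^var",   names(wide), value = TRUE),
                    grep("^value", names(wide), value = TRUE)))
print(wide)
```

Groups with fewer rows than the largest group are padded with NA, which is how group B ends up with NA in var3 and value3.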

akrun

is that the github version of data.table? – moman822 Oct 11 '16 at 16:48
@moman822 Yes, if you have 1.9.6, create the grouping variable, i.e. `dcast(setDT(df1)[, , rn := 1:.N, by = group1], group1~rn, value.var = c("value", "group2"))` – akrun Oct 11 '16 at 16:49
is there an extra comma in the first bracket perhaps? – moman822 Oct 11 '16 at 16:52
@moman822 Yes, that was a typo. Sorry i.e. `dcast(setDT(df1)[, rn := 1:.N, by = group1], ..` – akrun Oct 11 '16 at 16:54
