Transposing an R data frame maintaining the first two columns as identifiers

Question

I have a bigger dataset, but I made this smaller one for the purpose of this example. My dataset looks like this

 df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
                  APPT_ID = c(11, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14),
                  Variable = c(letters[1:3], letters[1:3], letters[1:3], letters[1:3]),
                  Value = c(41:52))

The first two columns (ID and APPT_ID) are identifiers for each observation, so I would like to maintain those as columns, while transposing the second two columns (variable and value) such that each of the variables is its own column showing its value. I would like there to be only one observation row for every unique combination of ID and APPT_ID.

I'd like my output dataset to look like this:

df2 <- data.frame(ID = c(1, 1, 2, 3), APPT_ID = c(11, 12, 13, 14), 
                  a = c(41, 44, 47, 50), b = c(42, 45, 48, 51),
                  c = c(43, 46, 49, 52) )

Whats the best why to do this?

possible duplicate of [Pivoting a large data set](http://stackoverflow.com/questions/31206735/pivoting-a-large-data-set) — SabDeM, Aug 17 '15 at 16:55
possible duplicate of [Reshape three column data frame to matrix](http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix) — Jaap, Aug 19 '15 at 10:09

score 2 · Accepted Answer · answered Aug 17 '15 at 13:47

2

I think this will do the job

library(tidyr)

df %>% 
 spread(Variable, Value)

answered Aug 17 '15 at 13:47

giac

4,261
5
30
59

1

Thank you! This gives me what I want – Mi109N Aug 17 '15 at 13:58
1

no problem - accept the answer if you are satisfied – giac Aug 17 '15 at 14:01

score 2 · Answer 2 · answered Aug 17 '15 at 14:20

In base R, you can use reshape():

reshape(df,dir='w',idvar=c('ID','APPT_ID'),timevar='Variable');
##    ID APPT_ID Value.a Value.b Value.c
## 1   1      11      41      42      43
## 4   1      12      44      45      46
## 7   2      13      47      48      49
## 10  3      14      50      51      52

You can use the varying argument to control the names of the resulting varying (non-identifier) columns.

score 0 · Answer 3 · answered Aug 17 '15 at 13:48

0

With reshape2

dcast(df, ID+APPT_ID~Variable, value.var="Value")
#   ID APPT_ID  a  b  c
# 1  1      11 41 42 43
# 2  1      12 44 45 46
# 3  2      13 47 48 49
# 4  3      14 50 51 52

answered Aug 17 '15 at 13:48

Pierre L

28,203
6
47
69

Transposing an R data frame maintaining the first two columns as identifiers

3 Answers3