1

This probably has incredibly simple answer, but I've been working on this for hours and I still can't figure it out. All I want to do is take a data frame with two columns (BEFORE table) and alter it so that each unique value in the first column becomes the name of each new column and the second column values fill in below their respective column name (AFTER table).

BEFORE

ID  age
N1   7
N1   8
N2   5
N3   9
N3   4
N3   9

AFTER

N1   N2   N3
7    5    9
8    NA   4
NA   NA   9

I've tried the melt() and cast() functions in the package reshape2, but they don't seem to do what I want. Any suggestions? Thank you in advance!

A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485

3 Answers3

2

You're missing unique IDs, which you can easily create with ave. Once you have those, you can use reshape() from base R or dcast() from "reshape2" to get what you are looking for:

mydf$ID2 <- ave(as.character(mydf$ID), mydf$ID, FUN = seq_along)
reshape(mydf, direction = "wide", idvar="ID2", timevar="ID")
#   ID2 age.N1 age.N2 age.N3
# 1   1      7      5      9
# 2   2      8     NA      4
# 6   3     NA     NA      9

library(reshape2)
dcast(mydf, ID2 ~ ID, value.var="age")
#   ID2 N1 N2 N3
# 1   1  7  5  9
# 2   2  8 NA  4
# 3   3 NA NA  9
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
1

acast will get you partially there:

Original data:

df <- structure(list(ID = structure(c(1L, 1L, 2L, 3L, 3L, 3L), .Label = c("N1", 
    "N2", "N3"), class = "factor"), age = c(7L, 8L, 5L, 9L, 4L, 5L
    )), .Names = c("ID", "age"), class = "data.frame", row.names = c(NA, 
    -6L))
> df
  ID age
1 N1   7
2 N1   8
3 N2   5
4 N3   9
5 N3   4
6 N3   5

df2 <- acast(df, age~ID)

> df2
  N1 N2 N3
4 NA NA  4
5 NA  5  5
7  7 NA NA
8  8 NA NA
9 NA NA  9
harkmug
  • 2,725
  • 22
  • 26
1

Using cbind.fill from here, you can do:

do.call(cbind.fill, split(df$age, df$ID))
#     [,1] [,2] [,3]
#[1,]    7    5    9
#[2,]    8   NA    4
#[3,]   NA   NA    5
Community
  • 1
  • 1
eddi
  • 49,088
  • 6
  • 104
  • 155