
I need to reorganize my dataframe into multiple columns, based on the values of columns "x" and "y" (ignore column "z"):

dataframe <- data.frame(
  x = c("apple", "apple","apple", "orange", "orange","orange","banana", "banana","strawberry"),
  y = c("a", "d", "b", "c","e","f","g","h","i"),
  z = c(9:1))

> dataframe
           x y z
1      apple a 9
2      apple d 8
3      apple b 7
4     orange c 6
5     orange e 5
6     orange f 4
7     banana g 3
8     banana h 2
9 strawberry i 1

So my new dataframe would look like this:

    > new_df
      apple orange banana strawberry
    1     a      c      g          i
    2     d      e      h         NA
    3     b      f     NA         NA
ASF

2 Answers


Something like this...?

> library(reshape2)
> dcast(dataframe, y~x, value.var = "y")
  y apple banana orange strawberry
1 a     a   <NA>   <NA>       <NA>
2 b     b   <NA>   <NA>       <NA>
3 c  <NA>   <NA>      c       <NA>
4 d     d   <NA>   <NA>       <NA>
5 e  <NA>   <NA>      e       <NA>
6 f  <NA>   <NA>      f       <NA>
7 g  <NA>      g   <NA>       <NA>
8 h  <NA>      h   <NA>       <NA>
9 i  <NA>   <NA>   <NA>          i
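
The call above keeps one row per value of `y`, so nothing is stacked. A possible variation, sketched here under the assumption that values should be stacked in their original order within each fruit, casts on a per-group row index instead. The `row_id` column is a hypothetical helper, not part of the original data, and `x` is turned into a factor only to keep the columns in order of first appearance:

```r
library(reshape2)

dataframe <- data.frame(
  x = c("apple", "apple", "apple", "orange", "orange", "orange",
        "banana", "banana", "strawberry"),
  y = c("a", "d", "b", "c", "e", "f", "g", "h", "i"),
  z = 9:1,
  stringsAsFactors = FALSE
)

# Keep the fruits in order of first appearance rather than alphabetical order
dataframe$x <- factor(dataframe$x, levels = unique(dataframe$x))

# Hypothetical helper column: position of each row within its x group (1, 2, 3, ...)
dataframe$row_id <- ave(seq_along(dataframe$x), dataframe$x, FUN = seq_along)

# One row per position, one column per fruit; cells with no value become NA
new_df <- dcast(dataframe, row_id ~ x, value.var = "y")
new_df$row_id <- NULL
new_df
```

Because each (`row_id`, `x`) pair occurs at most once, `dcast` should not need an aggregation function here.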
Jilber Urbina

You can also do this with a simple loop:

max_val = max(table(dataframe$x)) # Max number of rows your final dataframe will have (3 here)
mdf = as.data.frame(matrix(nrow = max_val))[-1] # Create empty df with max_val rows

for (item in unique(dataframe$x)) {
    # as.character() keeps the letters if y is a factor (padding a factor can expose its integer codes)
    vals = as.character(dataframe[dataframe$x == item, ]$y)
    length(vals) = max_val # Pad with NA up to max_val
    n = names(mdf)
    mdf = cbind(mdf, vals)
    names(mdf) <- c(n, item)
}

Output:

  apple orange banana strawberry
1     a      c      g          i
2     d      e      h       <NA>
3     b      f   <NA>       <NA>
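
The same pad-with-NA idea can probably be written without an explicit loop. This is only a base R sketch, assuming the `dataframe` from the question; the names `groups` and `n_rows` are made up for illustration:

```r
dataframe <- data.frame(
  x = c("apple", "apple", "apple", "orange", "orange", "orange",
        "banana", "banana", "strawberry"),
  y = c("a", "d", "b", "c", "e", "f", "g", "h", "i"),
  z = 9:1,
  stringsAsFactors = FALSE
)

# Split y by x, keeping the groups in order of first appearance
groups <- split(dataframe$y, factor(dataframe$x, levels = unique(dataframe$x)))

# The longest group determines the number of rows
n_rows <- max(lengths(groups))

# Pad every group with NA up to n_rows, then bind the groups as columns
new_df <- as.data.frame(
  lapply(groups, function(v) {
    length(v) <- n_rows
    v
  }),
  stringsAsFactors = FALSE
)
new_df
```

Computing `n_rows` from the data avoids hardcoding the row count, so the sketch should still work if a group grows beyond three values.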
rafaelc