I have a data frame that looks like this:

  class id
1   foo  1
2   bar  1
3   baz  1
4   baz  2
5   bar  2
6   foo  2
7   foo  3
8   foo  3
9   foo  3

My goal is to reshape it into a data frame that gathers the classes into a list, in the order that they are given. For example, the output would look like:

> output
  id var1 var2 var3
1  1  foo  bar  baz
2  2  baz  bar  foo
3  3  foo  foo  foo

or, alternatively, a two-column data frame with the first column containing the id and the second column containing a list of that id's classes in order.

I've tried using dcast(test, id ~ class) from the reshape2 package, but that doesn't quite return the output that I need.

Any ideas of how to do this in R? Here is the data:

dput(test)
structure(list(class = c("foo", "bar", "baz", "baz", "bar", "foo", 
"foo", "foo", "foo"), id = c(1, 1, 1, 2, 2, 2, 3, 3, 3)), row.names = c(NA, 
-9L), class = "data.frame")
iskandarblue

2 Answers

We create a sequence column within each 'id' and then use spread to reshape to wide format:

library(tidyverse)
test %>%
    group_by(id) %>%
    mutate(rn = str_c("var", row_number())) %>%
    spread(rn, class)
# A tibble: 3 x 4
# Groups:   id [3]
#     id var1  var2  var3 
#  <dbl> <chr> <chr> <chr>
#1     1 foo   bar   baz  
#2     2 baz   bar   foo  
#3     3 foo   foo   foo  

The same works with base R's paste0 in place of str_c:

test %>%
     group_by(id) %>%
     mutate(rn = paste0("var", row_number())) %>%
     spread(rn, class)

Or

test %>%
    group_by(id) %>%
    mutate(rn = paste("var", row_number(), sep="")) %>%
    spread(rn, class)
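
As a side note, tidyr 1.0.0 and later mark spread() as superseded in favour of pivot_wider(). A minimal sketch of the same reshape, assuming dplyr and tidyr (version 1.0.0 or later) are installed; the object name wide is only illustrative:

```r
library(dplyr)
library(tidyr)

# Same data as in the question
test <- data.frame(
  class = c("foo", "bar", "baz", "baz", "bar", "foo", "foo", "foo", "foo"),
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stringsAsFactors = FALSE
)

wide <- test %>%
  group_by(id) %>%
  mutate(rn = paste0("var", row_number())) %>%  # position of the row within its id
  ungroup() %>%
  pivot_wider(names_from = rn, values_from = class)

wide
```

The ungroup() call is optional here; it just returns a plain tibble without the grouping attribute.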

Or with data.table, create the sequence with rowid and dcast

library(data.table)
dcast(setDT(test), id ~ paste0("var", rowid(id)), value.var = 'class')
#    id var1 var2 var3
#1:  1  foo  bar  baz
#2:  2  baz  bar  foo
#3:  3  foo  foo  foo

If we want to use base R, an option is ave with reshape

reshape(transform(test, rn = paste0("var", ave(seq_along(id), id,
   FUN = seq_along))), idvar = 'id', direction = 'wide', timevar = 'rn')

NOTE: All of these methods also work when the number of replicates differs between ids; the missing cells are filled with NA.
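
To illustrate the unequal-replicates case, here is a small sketch with the base R approach. The data frame test2 is made up for this example (it is not from the question): id 1 has three rows, id 2 has two, and id 3 has one.

```r
# Hypothetical data with a different number of rows per id
test2 <- data.frame(
  class = c("foo", "bar", "baz", "baz", "bar", "foo"),
  id = c(1, 1, 1, 2, 2, 3),
  stringsAsFactors = FALSE
)

# Sequence number within each id, turned into the future column suffix
test2$rn <- paste0("var", ave(seq_along(test2$id), test2$id, FUN = seq_along))

# Wide format: one row per id, one class.* column per sequence number
wide2 <- reshape(test2, idvar = "id", timevar = "rn", direction = "wide")
wide2
```

The ids with fewer rows get NA in the columns they cannot fill, and reshape prefixes the new columns with the name of the value column (class.var1, class.var2, class.var3).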

akrun

You could split the data frame by id and cbind the interesting columns.

data.frame(id=unique(d$id), t(do.call(cbind, split(d$class, d$id))))
#   id  X1  X2  X3
# 1  1 foo bar baz
# 2  2 baz bar foo
# 3  3 foo foo foo

Note: use cbind.data.frame in case you don't want factors.
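
The two-column alternative mentioned in the question (an id column plus a list of that id's classes) can be sketched with the same split call. This is only one possible approach; the names groups and out are illustrative.

```r
d <- data.frame(
  class = c("foo", "bar", "baz", "baz", "bar", "foo", "foo", "foo", "foo"),
  id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  stringsAsFactors = FALSE
)

# Named list with one character vector per id; row order within an id is kept
groups <- split(d$class, d$id)

# Take the ids from the list names so both columns are in the same order
out <- data.frame(id = as.integer(names(groups)))
out$class <- unname(groups)  # list column, one element per row

out
```

Unlike the cbind version, this also copes with ids that have different numbers of rows, since each list element can have its own length.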

Data

d <- structure(list(class = c("foo", "bar", "baz", "baz", "bar", "foo", 
"foo", "foo", "foo"), id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
3L)), row.names = c(NA, -9L), class = "data.frame")
jay.sf