
I've got this data.frame, ff:

ff <- data.frame(dest = c("NY", "NY", "LA", "LA"),
                 st_ti = c("ok", "bad", "ok", "bad"),
                 qty = c(2, 2, 2, 1))

ff:

 dest st_ti qty
1   NY    ok   2
2   NY   bad   2
3   LA    ok   2
4   LA   bad   1

that I would like to expand so that each row is repeated qty times and the qty column is dropped, leaving only categorical data, like this:

gg <- data.frame(dest = c("NY", "NY", "NY", "NY", "LA", "LA", "LA"),
                 st_ti = c("ok", "ok", "bad", "bad", "ok", "ok", "bad"))

gg:

  dest st_ti
1   NY    ok
2   NY    ok
3   NY   bad
4   NY   bad
5   LA    ok
6   LA    ok
7   LA   bad

I'd like to do something like gather from the tidyr package, but I don't believe that option is available here.

lmo
jmb277

2 Answers


You can repeat the row names by the qty column and then pick up the rows with the expanded row names:

ff[rep(rownames(ff), ff$qty), c("dest", "st_ti")]

#    dest st_ti
#1     NY    ok
#1.1   NY    ok
#2     NY   bad
#2.1   NY   bad
#3     LA    ok
#3.1   LA    ok
#4     LA   bad

To reset the rownames:

ff1 <- ff[rep(rownames(ff), ff$qty), c("dest", "st_ti")]
rownames(ff1) <- NULL
ff1

#  dest st_ti
#1   NY    ok
#2   NY    ok
#3   NY   bad
#4   NY   bad
#5   LA    ok
#6   LA    ok
#7   LA   bad
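
The same idea also works with integer row positions instead of row names. A minimal sketch, assuming the ff data frame from the question; the names idx and ff2 are introduced here only for illustration:

```r
# Rebuild the example data from the question
ff <- data.frame(dest = c("NY", "NY", "LA", "LA"),
                 st_ti = c("ok", "bad", "ok", "bad"),
                 qty = c(2, 2, 2, 1))

# Repeat each row position qty times: 1 1 2 2 3 3 4
idx <- rep(seq_len(nrow(ff)), times = ff$qty)

# Index by position and keep only the categorical columns
ff2 <- ff[idx, c("dest", "st_ti")]
rownames(ff2) <- NULL  # replace the 1, 1.1, 2, ... labels with 1..7
ff2
```

Indexing by position avoids relying on whatever row names the data frame happens to carry, which may matter if the rows were previously subset or renamed.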
Psidom
  • One finds nothing with a search on `rep.data.frame`. This behaves the way I would have expected it to look. – IRTFM Mar 03 '17 at 03:17
  • @42- Thanks for the comment! Obviously, R doesn't allow duplicated row names. – Psidom Mar 03 '17 at 03:27
  • Is it easy to add more columns? For example, if there were columns `st_ti1`, `st_ti2`, ..., `st_tin` that should also be repeated as part of this expansion? – jmb277 Mar 03 '17 at 03:35
  • You can do something like this, `ff[rep(rownames(ff), ff$qty), -3]` where 3 is the column position, or `ff[rep(rownames(ff), ff$qty), ]` and then drop the *qty* column later. `ff[rep(rownames(ff), ff$qty), names(ff) != "qty"]` would be one option. – Psidom Mar 03 '17 at 03:38
  • understood - thank you! – jmb277 Mar 03 '17 at 03:43
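
The name-based variant from the comments can be tried on a wider data frame. This is only a sketch: the data frame ff_wide, its extra columns st_ti1 and st_ti2, and their values are made up for illustration.

```r
# Hypothetical wider data frame with several categorical columns
ff_wide <- data.frame(dest = c("NY", "LA"),
                      st_ti1 = c("ok", "bad"),
                      st_ti2 = c("late", "early"),
                      qty = c(3, 1))

# Keep every column except qty, however many there are
out <- ff_wide[rep(rownames(ff_wide), ff_wide$qty), names(ff_wide) != "qty"]
rownames(out) <- NULL
out
```

Selecting by name rather than by position (-3 in the comment above) keeps working if the qty column moves or more columns are added.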

We can do this using expandRows from the splitstackshape package:

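library(splitstackshape)
library(data.table)  # setDT() comes from data.table; load it explicitly in case it is not attached
# expandRows() repeats each row by the value in 'qty' and drops that column by default
setDT(expandRows(ff, 'qty'))[]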
#   dest st_ti
#1:   NY    ok
#2:   NY    ok
#3:   NY   bad
#4:   NY   bad
#5:   LA    ok
#6:   LA    ok
#7:   LA   bad
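
Since the question mentions tidyr: recent versions of tidyr (0.8.0 and later, as far as I know) include uncount(), which appears to do this directly. A minimal sketch, assuming tidyr is installed and using the ff data frame from the question:

```r
library(tidyr)

# Rebuild the example data from the question
ff <- data.frame(dest = c("NY", "NY", "LA", "LA"),
                 st_ti = c("ok", "bad", "ok", "bad"),
                 qty = c(2, 2, 2, 1))

# uncount() repeats each row according to the weights column (qty)
# and drops that column by default (.remove = TRUE)
gg <- uncount(ff, qty)
gg
```

This is the inverse of counting rows per group, so it fits the "expand by quantity" case more closely than gather does.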
akrun