Alternative to expand.grid for data.frames

Question

I have a data.frame df and I want that every row in this df is duplicated lengthTime times and that a new column is added that counts from 1 to lengthTime for each row in df.

I know, it sounds pretty complicated, but what I basically want is to apply expand.grid to df. Here is an ugly workaround and I have the feeling that there most be an easier solution (maybe even a base-R function?):

df <- data.frame(ID   = rep(letters[1:3], each=3),
                 CatA = rep(1:3, times = 3),
                 CatB = letters[1:9])
lengthTime <- 3
nrRow <- nrow(df)
intDF <- df
for (i in 1:(lengthTime - 1)) {
  df <- rbind(df, intDF)
}
df$Time <- rep(1:lengthTime, each=nrRow)

I thought that I could just use expand.grid(df, 1:lengthTime), but that does not work. outer did not bring any luck either. So does anyone know a good solution?

score 65 · Answer 1 · answered Feb 20 '14 at 14:48

It's been a while since this question was posted, but I recently came across it looking for just the thing in the title, namely, an expand.grid that works for data frames. The posted answers address the OP's more specific question, so in case anyone is looking for a more general solution for data frames, here's a slightly more general approach:

expand.grid.df <- function(...) Reduce(function(...) merge(..., by=NULL), list(...))

# For the example in the OP
expand.grid.df(df, data.frame(1:lengthTime))

# More generally
df1 <- data.frame(A=1:3, B=11:13)
df2 <- data.frame(C=51:52, D=c("Y", "N"))
df3 <- data.frame(E=c("+", "-"))
expand.grid.df(df1, df2, df3)

Worked really nice ... I used this method for [this SO post](https://stackoverflow.com/q/42112654/10059841), in my solution here: [SO link](https://stackoverflow.com/a/53808294/10059841) — Marian Minar, Dec 18 '18 at 02:47

Holger Brandl · Answer 2 · 2020-01-10T08:05:09.577

25

You can also just do a simple merge by NULL (which will cause merge to do simple combinatorial data replication):

merge(data.frame(time=1:lengthTime), iris, by=NULL)

edited Jan 10 '20 at 08:05

answered Nov 03 '14 at 14:27

Holger Brandl

10,634
3
64
63

I like this solution the most but for the sake of simplicity (avoiding dependencies) you could have just wrote `merge(data.frame(time=1:lengthTime), iris, by=NULL)` – jakob-r Nov 07 '18 at 13:05

score 19 · Accepted Answer · answered Jul 27 '12 at 18:39

19

Why not just something like df[rep(1:nrow(df),times = 3),] to extend the data frame, and then add the extra column just as you have above, with df$Time <- rep(1:lengthTime, each=nrRow)?

answered Jul 27 '12 at 18:39

joran

169,992
32
429
468

score 13 · Answer 4 · answered Aug 16 '16 at 12:13

Quick update

There is now also the crossing() function in package tidyr which can be used instead of merge, is somewhat faster, and returns a tbl_df / tibble.

data.frame(time=1:10) %>% merge(iris, by=NULL) 

data.frame(time=1:10) %>% tidyr::crossing(iris)

score 2 · Answer 5 · answered Jul 27 '12 at 18:40

2

This works:

REP <- rep(1:nrow(df), 3)
df2 <- data.frame(df[REP, ], Time = rep(1:3, each = 9))
rownames(df2) <- NULL
df2

answered Jul 27 '12 at 18:40

Tyler Rinker

108,132
65
322
519

Mine really isn't any different than joran's who beat me by 40 some seconds but I'll leave it as it's slightly more explicit. – Tyler Rinker Jul 27 '12 at 18:44
First come, first serve, so I accepted his answer ;-) But +1 for both of you. That's a very neat solution! – Christoph_J Jul 27 '12 at 19:00

Stéphane Laurent · Answer 6 · 2015-08-25T18:26:52.340

A data.table solution:

> library(data.table)
>  ( df <- data.frame(ID   = rep(letters[1:3], each=3),
+                  CatA = rep(1:3, times = 3),
+                  CatB = letters[1:9]) )
  ID CatA CatB
1  a    1    a
2  a    2    b
3  a    3    c
4  b    1    d
5  b    2    e
6  b    3    f
7  c    1    g
8  c    2    h
9  c    3    i
> ( DT <- data.table(df)[, lapply(.SD, function(x) rep(x,3))][, Time:=rep(1:3, each=nrow(df0))] )
    ID CatA CatB Time
 1:  a    1    a    1
 2:  a    2    b    1
 3:  a    3    c    1
 4:  b    1    d    1
 5:  b    2    e    1
 6:  b    3    f    1
 7:  c    1    g    1
 8:  c    2    h    1
 9:  c    3    i    1
10:  a    1    a    2
11:  a    2    b    2
12:  a    3    c    2
13:  b    1    d    2
14:  b    2    e    2
15:  b    3    f    2
16:  c    1    g    2
17:  c    2    h    2
18:  c    3    i    2
19:  a    1    a    3
20:  a    2    b    3
21:  a    3    c    3
22:  b    1    d    3
23:  b    2    e    3
24:  b    3    f    3
25:  c    1    g    3
26:  c    2    h    3
27:  c    3    i    3

Another one :

> library(data.table)
>  ( df <- data.frame(ID   = rep(letters[1:3], each=3),
+                  CatA = rep(1:3, times = 3),
+                  CatB = letters[1:9]) )
> DT <- data.table(df)
> rbindlist(lapply(1:3, function(i) cbind(DT, Time=i)))
    ID CatA CatB Time
 1:  a    1    a    1
 2:  a    2    b    1
 3:  a    3    c    1
 4:  b    1    d    1
 5:  b    2    e    1
 6:  b    3    f    1
 7:  c    1    g    1
 8:  c    2    h    1
 9:  c    3    i    1
10:  a    1    a    2
11:  a    2    b    2
12:  a    3    c    2
13:  b    1    d    2
14:  b    2    e    2
15:  b    3    f    2
16:  c    1    g    2
17:  c    2    h    2
18:  c    3    i    2
19:  a    1    a    3
20:  a    2    b    3
21:  a    3    c    3
22:  b    1    d    3
23:  b    2    e    3
24:  b    3    f    3
25:  c    1    g    3
26:  c    2    h    3
27:  c    3    i    3

Alternative to expand.grid for data.frames

6 Answers6

Linked

Related