What I'd like to obtain is a dataset like this:

ID var
1  t0
1  t1
1  t2
2  t0
2  t1
2  t2

where t restarts from 0 for each ID.

If I use:

all <- data.frame(ID=character(),var=numeric())

for (i in 1:2) {
  for (j in 0:3) {
    df <- data.frame(matrix(c(rep(i,3),paste0("t",j)), byrow = F, ncol = 2, nrow = 6))
    all <- rbind(all,df) 
  }
}

the result is clearly wrong. How can I fix it?

jeff
  • You don't need the inner loop or the matrix-to-data-frame coercion. Try this: `j = 0:2; for (i in 1:2) { df <- data.frame(ID = rep(i,3), var = paste0("t",j)); all <- rbind(all,df) }` – Nate Feb 08 '21 at 14:14

3 Answers

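1) Change the line marked ## as shown. (The loop range 0:3 is kept from the question and yields t0 to t3; use 0:2 to stop at t2 as in the desired output.)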

all <- data.frame(ID=character(),var=numeric())

for (i in 1:2) {
  for (j in 0:3) {
    df <- data.frame(ID = i, var = paste0("t", j))  ##
    all <- rbind(all,df) 
  }
}

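2) Or use expand.grid. Its first argument varies fastest, so var is given first and [2:1] then puts ID back in the first column: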

expand.grid(var = paste0("t", 0:3), ID = 1:2)[2:1]

3) Another possibility is crossing from the tidyr package:

library(tidyr)
crossing(ID = 1:2, var = paste0("t", 0:3))

4) There are a number of packages that support Python- and Haskell-like comprehensions (comprehenr, eList, listcompr). Using the first:

library(comprehenr)
to_df(for(i in 1:2) for(j in 0:3) list(ID = i, var = paste0("t", j)))

Also note comments below this answer for an additional option.
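
As a small sketch of option 2 adjusted to the desired output in the question: the range 0:2 and the stringsAsFactors = FALSE argument are my own choices here, not part of the code above. By default expand.grid turns character vectors into factors, so the argument keeps var as plain character. The name res is only for illustration.

```r
# 0:2 rather than 0:3, so that var stops at t2 as in the desired output.
# stringsAsFactors = FALSE keeps var as character instead of factor.
res <- expand.grid(var = paste0("t", 0:2), ID = 1:2,
                   stringsAsFactors = FALSE)[2:1]
res
```

The result has six rows, with t0 to t2 repeated once for each ID.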

G. Grothendieck

I recommend against building a data.frame iteratively like that: it works as expected initially, but it gets much slower as the frame grows, because every time you add a row, rbind makes a complete copy of all existing rows. It is better to form a list of frames and then combine them once. See https://stackoverflow.com/a/24376207/3358227.
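
A minimal sketch of the list-then-combine idea, assuming each ID gets the same three time points. The ID values and the names ids, frames and all_rows are hypothetical and only there for illustration; the IDs are deliberately not 1, 2, ....

```r
# Hypothetical IDs, chosen only for illustration.
ids <- c("a17", "b42")

# Build one small frame per ID and keep the frames in a list.
# The single ID value is recycled to match the three var values.
frames <- lapply(ids, function(id) {
  data.frame(ID = id, var = paste0("t", 0:2))
})

# Combine all frames with a single rbind call instead of one per iteration.
all_rows <- do.call(rbind, frames)
all_rows
```

The result is named all_rows rather than all so that it does not mask the base function all().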

To build your frame, here are a couple of ways.

  1. Build it that way from the start, since all groups have three rows:

    data.frame(ID = rep(seq_len(3), each = 3), var = paste0("t", rep(seq_len(3) - 1, times = 3)))
    #   ID var
    # 1  1  t0
    # 2  1  t1
    # 3  1  t2
    # 4  2  t0
    # 5  2  t1
    # 6  2  t2
    # 7  3  t0
    # 8  3  t1
    # 9  3  t2
    
  2. If your data already exists and you just want to add the var field:

    dat <- data.frame(ID = rep(seq_len(3), each = 3))
    dat$var <- paste0("t", ave(dat$ID, dat$ID, FUN = seq_along) - 1)
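
The second approach should also work when the IDs are not 1, 2, ... and the groups differ in size. Below is a sketch with made-up character IDs. Note one deliberate change from the line above: the first argument to ave is seq_along(dat$ID) rather than dat$ID, because ave returns a vector of the same type as its first argument, and a character result could not have 1 subtracted from it.

```r
# Hypothetical, non-consecutive IDs with unequal group sizes.
dat <- data.frame(ID = c("a17", "a17", "a17", "b42", "b42", "c03"))

# Passing an integer vector as the first argument keeps the result numeric;
# within each ID group, seq_along gives 1, 2, ... and we subtract 1
# so that the counter starts at 0.
idx <- ave(seq_along(dat$ID), dat$ID, FUN = seq_along) - 1
dat$var <- paste0("t", idx)
dat
```

The counter restarts at t0 whenever the ID changes, whatever the ID values are.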
    
r2evans
  • Thanks, it'd be faster, but in my real data the ID variable is not just 1, 2, etc., so I need the for loop at least for the i. – jeff Feb 08 '21 at 14:32
  • In your real data, you build a `data.frame` from scratch with arbitrary numbers? If you're doing it in a `for` loop, then you must know the IDs and each length beforehand, in which case you don't need a `for` loop. – r2evans Feb 08 '21 at 15:02

Using the newest version (0.3.0) of the listcompr package, you can use {} brackets within character strings as placeholders for variables. For instance, "t{j}" evaluates to "t1", "t2", ... for j = 1, 2, ....

Your example data set can be created using gen.data.frame from the listcompr package:

> library(listcompr)
> gen.data.frame(c(ID = i, var = "t{j}"), j = 0:2, i = 1:2)
  ID var
1  1  t0
2  1  t1
3  1  t2
4  2  t0
5  2  t1
6  2  t2
Patrick Roocks