I have many data frames whose names follow a repeating pattern:

df.1 <- data.frame("x"=c(1,2), "y"=2)
df.2 <- data.frame("x"=c(2,4), "y"=4)
df.3 <- data.frame("x"=2, "y"=c(4,5))

All data frames have the same number of rows and columns. I want to bind them, adding a column with the id of the data frame. The id would be the name of the source data frame.

I know I could do this manually:

rbind(data.frame(id = "df.1", df.1),
      data.frame(id = "df.2", df.2),
      data.frame(id = "df.3", df.3))

But there are a lot of them, and their number will change in the future.

I tried writing for loops, but they didn't work. I suppose that's because I'm basing them on a vector of strings containing the data frames' names rather than on a list of the data frames themselves.

df_names <- ls(pattern = "^df\\.\\d+$")

for (i in df_names) {
  i$id <- i
  i
}

...but I also haven't found any automated way of creating a list of data frames with repeatable names. And even if I do, I'm not that sure the for-loop above would work :)

Kuba Krukar

3 Answers


You could use parse and eval to get the data frames from df_names:

do.call(rbind, lapply(df_names, function(x){data.frame(id=x, eval(parse(text=x)))}))


    id x y
1 df.1 1 2
2 df.1 2 2
3 df.2 2 4
4 df.2 4 4
5 df.3 2 4
6 df.3 2 5
user1981275
  • If data frames have more than one row it adds ids as a sequence. I've modified the example to account for that. So you get 2 first rows coming from df.1, but the id column will have values "df.1" in row 1 and "df.2" in row 2. – Kuba Krukar Feb 01 '14 at 20:55
  • It works for me, for example with `df.1 <- data.frame("x"=c(1,2), "y"=c(2,3))`. – user1981275 Feb 01 '14 at 20:58
  • You could use `get` instead of `eval(parse())`: `do.call(rbind, lapply(df_names, function(x){data.frame(id=x, get(x))}))`. – GSee Feb 01 '14 at 21:09
  • Great! Now I can even understand what it all means (and works identically) :) Thank you. – Kuba Krukar Feb 01 '14 at 21:28
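
For reference, here is the `get` variant from the comments written out as a self-contained sketch on the question's example data. The names `nm` and `combined` are only illustrative, and the pattern is written with an escaped dot and anchors, which is an assumption about what the pattern was meant to match.

```r
# Example data from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

# Escape the dot and anchor the pattern so that only names of the
# form df.<digits> are picked up
df_names <- ls(pattern = "^df\\.\\d+$")

# Look up each data frame by its name and prepend the name as an id column
combined <- do.call(rbind, lapply(df_names, function(nm) {
  data.frame(id = nm, get(nm))
}))
combined
```

`ls()` returns the names in alphabetical order, so with ten or more data frames `df.10` would be bound before `df.2`. The id column still labels every row correctly.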

There is also combine from the "gdata" package:

library(gdata)
combine(df.1, df.2, df.3)
#   x y source
# 1 1 2   df.1
# 2 2 2   df.1
# 3 2 4   df.2
# 4 4 4   df.2
# 5 2 4   df.3
# 6 2 5   df.3
A5C1D2H2I1M1N2O1R2T1
  • +1 I wasn't aware of this function. – Sven Hohenstein Feb 02 '14 at 13:21
  • That's what I love and hate R for - there're solutions simple enough out there but not always easy to find and at the cost of loading another package.... I'll stay with the first approach as understanding it teaches me more flexibility for future usage. – Kuba Krukar Feb 02 '14 at 15:54
  • @KubaKrukar, I think the `get` solution posted by GSee and the `AppendMe` function I proposed in the linked duplicate question (which also uses `get`) seem to be more suitable for your needs. – A5C1D2H2I1M1N2O1R2T1 Feb 03 '14 at 12:59

Another approach using mget:

dat <- do.call(rbind, mget(df_names))
dat$id <- sub("\\.\\d+$", "", rownames(dat))

#        x y   id
# df.1.1 1 2 df.1
# df.1.2 2 2 df.1
# df.2.1 2 4 df.2
# df.2.2 4 4 df.2
# df.3.1 2 4 df.3
# df.3.2 2 5 df.3
Sven Hohenstein
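
A possible variation on the `mget` idea, sketched here under the assumption that you would rather not recover the id from the row names: add the id column to each data frame first, then bind. As far as I can tell, the row-name approach depends on every data frame having more than one row, since a single-row data frame keeps its plain name as the row name and the `sub()` call would then strip the `.1` from `df.1`. The names `df_list`, `with_id`, `d` and `nm` are just illustrative.

```r
# Example data from the question
df.1 <- data.frame(x = c(1, 2), y = 2)
df.2 <- data.frame(x = c(2, 4), y = 4)
df.3 <- data.frame(x = 2, y = c(4, 5))

df_names <- ls(pattern = "^df\\.\\d+$")

# mget() returns a named list holding the data frames themselves
df_list <- mget(df_names)

# Attach each list name as an id column before binding
with_id <- Map(function(d, nm) data.frame(id = nm, d), df_list, names(df_list))

dat <- do.call(rbind, with_id)
rownames(dat) <- NULL  # drop the row names that rbind builds from the list names
dat
```

Having the named list `df_list` also answers the original worry about looping over strings: once the data frames are in a list, `lapply` or `Map` can work on the objects directly.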