I have several data.frames in my Global Environment that I need to merge. Many of the data.frames have identical column names. I want to append a suffix to each column that marks its originating data.frame. Because I have many data.frames, I wanted to automate the process as in the following example.

df1 <-  data.frame(id = 1:5,x = LETTERS[1:5])
df2 <-  data.frame(id = 1:5,x = LETTERS[6:10])

obj <- ls()

for(o in obj){
  s <- sub('df','',eval(o))
  names(get(o))[-1] <- paste0(names(get(o))[-1],'.',s)
}

# Error in get(o) <- `*vtmp*` : could not find function "get<-"

But the individual pieces of the assignment work fine:

names(get(o))[-1]
# [1] "x"
paste0(names(get(o))[-1],'.',s)
# [1] "x.1"

I've used get in a similar way to write each object to a file with write.csv.

for(o in obj){
  write.csv(get(o), file = paste0(o, '.csv'), row.names = FALSE)
}

Any ideas why it's not working in the assignment to change the column names?

John Jones
  • The solution is to put your data.frames together into a list when you create them. – Roland Dec 07 '17 at 15:32
  • Seems like this stems from a poor design choice. Having a bunch of data.frames as separate variables in your environment isn't that helpful to work with. You'd make your code much happier if you stored related data.frames in a list. Then you can just `lapply()` transformations over that list. There are almost always better (more R-like) ways of doing things than using `get()` and `assign()`. – MrFlick Dec 07 '17 at 15:33
  • @MrFlick - Thanks for the observation. The link that Gregor posts at the end of his answer gives some good advice for avoiding this problem altogether. – John Jones Dec 07 '17 at 16:22

3 Answers

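The error "could not find function get<-" is R telling you that you can't use <- to assign into the result of get(): a nested assignment such as names(get(o))[-1] <- ... needs a replacement function named get<-, and none exists. You could probably use assign, but this code is already difficult enough to read. The better solution is to use a list.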
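
For completeness, here is a minimal sketch of the assign route mentioned above, assuming the two example data frames from the question. Each object is fetched into a temporary copy, the copy is renamed, and the copy is written back under the original name. The name `tmp` is only illustrative.

```r
df1 <- data.frame(id = 1:5, x = LETTERS[1:5])
df2 <- data.frame(id = 1:5, x = LETTERS[6:10])

for (o in c("df1", "df2")) {
  s <- sub("df", "", o)                             # "1" or "2"
  tmp <- get(o)                                     # fetch a copy by name
  names(tmp)[-1] <- paste0(names(tmp)[-1], ".", s)  # rename every column but the first
  assign(o, tmp)                                    # write the copy back under the same name
}

names(df1)
# [1] "id"  "x.1"
```

This works, but every object makes a round trip through `get()` and `assign()`, which is the readability cost the list approach avoids.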

From your example:

df1 <-  data.frame(id = 1:5,x = LETTERS[1:5])
df2 <-  data.frame(id = 1:5,x = LETTERS[6:10])

# put your data frames in a list
df_names = ls(pattern = "df[0-9]+")
df_names   # make sure this is the objects you want
# [1] "df1" "df2"
df_list = mget(df_names)

# now we can use a simple for loop (or lapply, mapply, etc.)
for(i in seq_along(df_list)) {
    names(df_list[[i]])[-1] =
        paste(names(df_list[[i]])[-1],
              sub('df', '', names(df_list)[i]),
              sep = "."
        )
}

# and the column names of the data frames in the list have been updated
df_list
# $df1
#   id x.1
# 1  1   A
# 2  2   B
# 3  3   C
# 4  4   D
# 5  5   E
# 
# $df2
#   id x.2
# 1  1   F
# 2  2   G
# 3  3   H
# 4  4   I
# 5  5   J
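
The comment in the loop above mentions lapply and mapply as alternatives. A sketch of that variant using `Map()`, which passes each data frame together with its list name to a function, could look like this. `add_suffix` is a hypothetical helper written for this example, not a base R function.

```r
df_list <- list(
  df1 = data.frame(id = 1:5, x = LETTERS[1:5]),
  df2 = data.frame(id = 1:5, x = LETTERS[6:10])
)

# hypothetical helper: suffix every column except the first with the
# numeric part of the data frame's name
add_suffix <- function(df, name) {
  names(df)[-1] <- paste(names(df)[-1], sub("df", "", name), sep = ".")
  df
}

# Map() walks the data frames and their names in parallel and
# returns a new list that keeps the original names
renamed <- Map(add_suffix, df_list, names(df_list))

names(renamed$df2)
# [1] "id"  "x.2"
```

Unlike the for loop, this leaves `df_list` untouched and returns the renamed copies as a new list.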

It's also now easy to merge them:

Reduce(f = merge, x = df_list)
#   id x.1 x.2
# 1  1   A   F
# 2  2   B   G
# 3  3   C   H
# 4  4   D   I
# 5  5   E   J
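
`Reduce()` applies a two-argument function cumulatively across the list, so with two data frames the call above amounts to a single `merge()`, and with three it becomes a merge of the first two followed by a merge of that result with the third. A sketch with a hypothetical third data frame `df3` (not in the question), naming the join column explicitly instead of relying on `merge()` to pick up the shared column name:

```r
df_list <- list(
  df1 = data.frame(id = 1:5, x.1 = LETTERS[1:5]),
  df2 = data.frame(id = 1:5, x.2 = LETTERS[6:10]),
  df3 = data.frame(id = 1:5, x.3 = LETTERS[11:15])  # hypothetical extra data frame
)

# equivalent to merge(merge(df1, df2, by = "id"), df3, by = "id")
merged <- Reduce(function(a, b) merge(a, b, by = "id"), df_list)

names(merged)
# [1] "id"  "x.1" "x.2" "x.3"
```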

For more discussion and examples, see "How do I make a list of data frames?"

Gregor Thomas
  • OK. I was trying to use `get()` sort of like passing by reference; it doesn't do that! Thanks for the example (nice catch on removing the suffix from 'id')! Also, thanks for the `Reduce()` example. I've never really known in what circumstances `Reduce()` could be useful. – John Jones Dec 07 '17 at 16:03

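Using setnames from the data.table package, which renames columns by reference (so nothing has to be assigned back), you can do

library(data.table)

for(o in obj) {
  oldnames = names(get(o))[-1]
  newnames = paste0(oldnames, ".new")
  # setnames() modifies the object returned by get() in place
  setnames(get(o), oldnames, newnames)
}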
dww

You can use eval, which evaluates an R expression in a specified environment.

df1 <- data.frame(id = 1:5,x = LETTERS[1:5])
df2 <- data.frame(id = 1:5,x = LETTERS[6:10])

obj <- ls()

for(o in obj) {
  s <- sub('df', '', o)
  new_name <- paste0(names(get(o))[-1], '.', s)
  # builds and runs text such as "names(df1)[-1] <- new_name"
  eval(parse(text = paste0('names(', o, ')[-1] <- new_name')))
}

This modifies df1 and df2 in place. For example, df1 is now:

  id x.1
1  1   A
2  2   B
3  3   C
4  4   D
5  5   E
myincas