
I would like to use the names function to apply the same column names to multiple dataframes, all of which have the same number of columns. I can of course do this the wrong way by calling names for each dataframe, but I'd like to do it correctly. Here's the setup:

library(tidyverse)

df1 <- tibble(1,2,3,4,5)
df2 <- tibble(6,7,8,9,10)
df3 <- tibble(11,12,13,14,15)
df4 <- tibble(16,17,18,19,20)

column_names <- c("Alpha","Bravo","Charlie","Delta","Echo")
tibbles_list <- c("df1","df2","df3","df4")

The wrong way is of course:

names(df1) <- column_names
names(df2) <- column_names
names(df3) <- column_names
names(df4) <- column_names

I'd like to somehow use the list of dataframes in tibbles_list (through as.name or rlang::syms or similar) to apply column_names to all the dataframes in one line of code, perhaps using some species of purrr's map or one of the apply functions in base R, but I'm completely at a loss as to how.

jbfink
  • How come you're making a list of the names of data frames, and not just the data frames themselves? For example `tibbles_list <- list(df1, df2, df3, df4)` – camille Jul 03 '19 at 16:43
  • Hi camille -- no idea. I'm probably doing many things wrong. – jbfink Jul 03 '19 at 18:19

2 Answers


tibbles_list is just a character vector of object names. With mget, we get the values of those objects in a list, loop through the list with map, and use rename_all to change the names:

lst1 <- map(mget(tibbles_list), ~ .x %>%
                   rename_all(~ column_names))
list2env(lst1, .GlobalEnv)

Or use set_names

map(mget(tibbles_list), ~ .x %>% 
            set_names(column_names))

NOTE: It is better to keep the data frames in a list than to modify the objects in the global environment.
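As a minimal sketch of what keeping everything in a list could look like (the two small tibbles and the list name `dfs` are illustrative, not part of the original setup), the renamed data frames can live in one named list and be accessed by name:

```r
library(purrr)
library(tibble)

# Illustrative data, mirroring the question's setup with two tibbles
df1 <- tibble(1, 2, 3, 4, 5)
df2 <- tibble(6, 7, 8, 9, 10)
column_names <- c("Alpha", "Bravo", "Charlie", "Delta", "Echo")

# Store the data frames in a named list rather than as separate globals
dfs <- list(df1 = df1, df2 = df2)

# Rename the columns of every element; the result is still a named list
dfs <- map(dfs, set_names, column_names)

# Elements are reached by name, so no global objects need to be overwritten
names(dfs$df1)
```

With this layout, later steps can keep using map over `dfs` instead of repeating code for each data frame.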

akrun
  • first example fails -- complains about "Object 'Alpha' of mode 'function' was not found" -- but the second example works! Thanks! – jbfink Jul 03 '19 at 16:17
  • @jbfink Sorry, there should be a `~`. Try now. – akrun Jul 03 '19 at 16:18
  • Actually @akrun there's a remaining problem -- the tibbles aren't actually "saved" when I do the map function -- executing either of these maps, and then calling a tibble directly (e.g. just by typing "df1" in the console) shows the old column names. How can I fix that? – jbfink Jul 03 '19 at 16:49
  • @jbfink I was in a meeting. Forgot that tibbles_list was a character vector. Try with `list2env`; I've updated the answer. – akrun Jul 03 '19 at 16:56

First, you'll be much better off if you're working with a list of data frames, rather than a list of names of data frames that you need to then pull out of your environment. If you have choice over this matter, great; if not, you can copy those data frames into a single list.

The post How do I make a list of data frames? has 7 answers with a variety of ways of doing this and reasons why, including methods for if you don't have the luxury of starting from a list.
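If the data frames already exist as separate objects, one possible way to collect them is base R's mget, which looks up each name and returns a named list. This is only a sketch with two illustrative tibbles; it assumes the objects live in the environment where mget is called:

```r
library(tibble)

# Illustrative data, mirroring the question's setup with two tibbles
df1 <- tibble(1, 2, 3, 4, 5)
df2 <- tibble(6, 7, 8, 9, 10)

# Character vector of object names, as in the question
tibbles_list <- c("df1", "df2")

# mget() fetches each named object and returns them as a named list
dfs <- mget(tibbles_list)

# The list elements carry the original variable names
names(dfs)
```

From here on, `dfs` can be passed to map or lapply like any other list of data frames.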

Once that's taken care of, you can set the names with the base setNames (or the rlang wrapper set_names, whose powers aren't really needed here), which itself is a wrapper around names. Use a purrr mapping function, or lapply for a base version.

library(dplyr)

dfs <- list(df1, df2, df3, df4)
dfs %>%
  purrr::map(~setNames(., column_names))
#> [[1]]
#> # A tibble: 1 x 5
#>   Alpha Bravo Charlie Delta  Echo
#>   <dbl> <dbl>   <dbl> <dbl> <dbl>
#> 1     1     2       3     4     5
##### cutting remaining output

lapply(dfs, function(x) setNames(x, column_names))
# same output as above

Since setNames is a wrapper around names:

lapply(dfs, function(x) {
  names(x) <- column_names
  x
})
# same output again
camille