I am trying to convert the variable names of multiple datasets to all lower case.

names(data1) <- tolower(names(data1)) 

That works for any single dataset, but I have not figured out how to loop over several datasets and apply it to each one. Here is a for loop I attempted (I also tried lapply, without any luck).

data_list <- c('data1', 'data2', 'data3')
for (file in data_list) {
    names(file) <- tolower(names(file))
}

I also tried:

data_list <- list('data1', 'data2', 'data3')
for (file in data_list) {
    names(file) <- tolower(names(file))
}

2 Answers

This can in fact be done with lapply, which iterates over all data frames in a list. Since you mentioned dplyr, its idiomatic way to convert all variable names to lower case is rename_with:

library(dplyr) # provides %>%, rename_with() and everything()

data_list <- list(iris, mtcars)

data_list_lower <- lapply(data_list, function(data) {
  data %>% 
    rename_with(tolower, .cols = everything()) # default is also everything()
})
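
The same idea also works without dplyr. Below is a minimal base-R sketch, assuming the built-in iris and mtcars data frames as stand-ins for your own data; naming the list elements is my own addition so that each result can be looked up by name.

```r
# Built-in example data frames; the element names are an assumption
# added here for convenience, not part of the code above
data_list <- list(iris = iris, mtcars = mtcars)

# setNames() returns a copy of df carrying the lower-cased column names
data_list_lower <- lapply(data_list, function(df) {
  setNames(df, tolower(names(df)))
})

# Inspect the result for one element
names(data_list_lower$iris)
```

Because lapply returns a new list, the original data frames are left unchanged; assign the result back if you want to keep it.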

For harmonizing variable names, also have a look at the snakecase package, which allows for various other kinds of transformation.

mnist
  • Or more succinctly (and using `purrr`), `map(data_list, ~ rename_with(., tolower))`. Or without `purrr`, then `lapply(data_list, rename_with, tolower)`. – r2evans Jan 07 '21 at 17:34
  • Indeed, the pipe is certainly not necessary here. However, maybe your lapply solution is a little bit too concise. – mnist Jan 07 '21 at 17:44
You can put the data frames in a list, loop over its indices (1:length(data_list), or more safely seq_along(data_list)), and assign the lower-cased names() inside the loop:

data_list <- list(data1, data2, data3)

# seq_along() also behaves correctly when the list is empty
for (i in seq_along(data_list)) {
  names(data_list[[i]]) <- tolower(names(data_list[[i]]))
}

Column names after applying the loop:

for(i in 1:length(data_list)){
  print(names(data_list[[i]]))
}

#[1] "a" "b"
#[1] "c" "d"
#[1] "g" "h"

Data:

data1 <- data.frame(A = rep(1:3),
                    B = rep(1:3))

data2 <- data.frame(C = rep(1:3),
                    D = rep(1:3))

data3 <- data.frame(G = rep(1:3),
                    H = rep(1:3))
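
Note that the loop above changes the copies stored in data_list, not the original objects data1, data2 and data3. The loops in the question fail for a related reason: they iterate over the character strings 'data1', 'data2', ..., so names(file) refers to a string rather than a data frame. If the goal is to update the original objects from a vector of their names, one possible sketch (an assumption about what is wanted, using base R's mget and list2env) is:

```r
# Example data frames, shaped like the data above
data1 <- data.frame(A = 1:3, B = 1:3)
data2 <- data.frame(C = 1:3, D = 1:3)

# Look the objects up by name; mget() returns a named list of data frames
data_names <- c("data1", "data2")
data_list <- mget(data_names, envir = environment())

# Lower-case the column names of each copy in the list
data_list <- lapply(data_list, function(df) {
  names(df) <- tolower(names(df))
  df
})

# Write the modified copies back over the originals in the current
# environment (the global environment when run at top level)
invisible(list2env(data_list, envir = environment()))

names(data1)
# [1] "a" "b"
```

Keeping the data frames in a named list and working on the list is usually simpler than writing back to separate objects, so treat the list2env step as optional.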

AlSub
  • `data_list` doesn't have names, so the second line doesn't seem to do anything. Also, the elements of `data_list` are character strings instead of data frames, so it's not clear what the loop is doing. – Charlie Gallagher Jan 07 '21 at 17:17