I am trying to change the variable names in all data frames in a list using a for loop. An example of the data is:

df1 <- data.frame(
  Number = c(45,62,27,34,37,55,40),
  Day = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
df2 <- data.frame(
  Number = c(15,20,32,21,17,18,13),
  Day = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))
df3 <- data.frame(
  Number = c(12,32,22,14,16,21,30),
  Day = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"))

L <- list(df1,df2,df3)

My current attempt is:

for(i in L){
colnames(L) <- c("NewName1", "NewName2")
}

This is not working, and I do not understand why. Please let me know if someone can point me in the right direction.

Jake
  • If your for loop is `for (i in L)`, then you need to use `i` inside the loop. In this case, you're better off using integer indexes to loop over: `for(i in seq_along(L)){colnames(L[[i]]) = c("NewName1", "NewName2")}`. – Gregor Thomas Apr 05 '18 at 15:41
  • Generally avoid for loops, they're slower in R than other solutions, like the apply function in Jilber's answer below. For some info on why, [see here](https://swcarpentry.github.io/r-novice-inflammation/15-supp-loops-in-depth/). – Anonymous coward Apr 05 '18 at 15:47
  • @Anonymouscoward: `apply` is not faster than a `for` loop. The "`apply` function has a for loop in its definition. The `lapply` function buries the loop, but execution times tend to be roughly equal to an explicit `for loop`" (https://www.burns-stat.com/pages/Tutor/R_inferno.pdf). What makes a `for` loop slow and a memory hog is growing an object within the loop. – Tung Apr 05 '18 at 16:51
  • @Tung, we may be getting into the weeds here. `apply` does hide a `for` loop, but `lapply` does not, [at least explicitly](https://stackoverflow.com/questions/28983292/is-the-apply-family-really-not-vectorized). – Anonymous coward Apr 05 '18 at 20:17
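
To illustrate the first comment above: in `for (i in L)`, `i` is only a copy of each list element, and `colnames(L)` targets the list itself, which has no columns, so the data frames stored in `L` are never renamed. A minimal sketch of the index-based loop follows; the two-row data frames are made-up, shortened versions of the question's data.

```r
# Hypothetical miniature versions of the question's data frames
df1 <- data.frame(Number = c(45, 62), Day = c("Mon", "Tues"))
df2 <- data.frame(Number = c(15, 20), Day = c("Mon", "Tues"))
L <- list(df1, df2)

# Loop over positions so the assignment targets the element stored in L
for (i in seq_along(L)) {
  colnames(L[[i]]) <- c("NewName1", "NewName2")
}

# Inspect the column names of every data frame in the list
lapply(L, colnames)
```

Note that this changes the data frames inside `L` only; the separate variables `df1` and `df2` keep their original column names.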

1 Answer

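L <- lapply(L, function(x){
  # rename the columns of this copy of the data frame
  colnames(x) <- c("NewName1", "NewName2")
  # return the modified data frame so lapply() collects it
  x
} )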
Jilber Urbina
  • I think it would be much better if you could provide more explanation about why and how your code works but his doesn't. – ytu Apr 05 '18 at 15:48
  • That makes sense. My only question about the answer is: why have the `x` at the end of the function? Thank you for the answer; it worked great, and I was able to use `do.call("rbind", lapply(L, as.data.frame))` to get the list as one dataset with the changed variables. – Jake Apr 05 '18 at 16:13
  • @Jake you need to have the `x` at the end of the function because the anonymous function in `lapply` needs to return `x`; otherwise the output won't be the right one. – Jilber Urbina Apr 05 '18 at 16:27
  • Alright that makes sense as well. I appreciate the explanation. – Jake Apr 05 '18 at 16:51
  • I've seen this solution in multiple posts, but I am still confused. Shouldn't I expect to see changed colnames() in each of the dataframes after running this function? Of course, the `lapply` returns a list, yet even when I convert the list back to a dataframe, the colnames remain the same. Help? – Ben Jan 30 '20 at 14:51
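
On the last few comments, a small sketch may help. An R function returns the value of its last expression, and the value of an assignment such as `colnames(x) <- ...` is the assigned names rather than the data frame, which is why the trailing `x` matters. `lapply` also works on copies, so variables like `df1` are left untouched and the renamed data frames exist only in the returned list. The shortened data frames and the names given to the list elements below are assumptions made for illustration, not part of the original post.

```r
# Hypothetical miniature versions of the question's data frames
df1 <- data.frame(Number = c(45, 62), Day = c("Mon", "Tues"))
df2 <- data.frame(Number = c(15, 20), Day = c("Mon", "Tues"))
L <- list(df1 = df1, df2 = df2)  # named list, so results can be mapped back

# Without the trailing x, the function returns the value of the assignment
no_x <- lapply(L, function(x) {
  colnames(x) <- c("NewName1", "NewName2")
})
no_x$df1  # a character vector of the new names, not a data frame

# With the trailing x, each element of the result is a renamed data frame
renamed <- lapply(L, function(x) {
  colnames(x) <- c("NewName1", "NewName2")
  x
})

colnames(df1)          # "Number" "Day": the original variable is unchanged
colnames(renamed$df1)  # "NewName1" "NewName2"

# To update the original variable, assign the result back explicitly
df1 <- renamed$df1
```

If every original variable should be overwritten at once, `list2env(renamed, envir = environment())` is one option, provided the list elements are named after the variables as assumed here.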