How to add a column to every dataframe in the workspace based on its name?

Question

Background

First I'll initialize some dummy dataframes (NOTE: in the real example there will be >40 dataframes):

colOne <- c(1,2,3)
colTwo <- c(6,5,4)
df_2004 <- data.frame(colOne,colTwo)
df_2005 <- data.frame(colTwo,colOne)

Problem

Now what I want to do is loop through every data frame in the workspace and add a column called year to them, filled with 2004 if the suffix is _2004 and 2005 if the suffix is _2005.

I can start by getting a list of all of the data frames in the workspace.

dfs <- ls()[sapply(ls(),function(t) is.data.frame(get(t)))]
dfs

[1] "df_2004" "df_2005"

But that's as far as I've managed to get.

Attempted Solution

This is what I tried:

for (d in dfs) {
  d <- lapply(d, function(x){
    t <- get(x)
    if (grepl('2004',x)) {
      t$year <- 2004
    } else {
      t$year <- 2005
    }
    t
  })
}

This does not throw an error, but it doesn't do anything either other than set d to "2005".

If I add a line print(t) right before the line returning t, I get this output in the console:

  colOne colTwo year
1      1      6 2004
2      2      5 2004
3      3      4 2004
  colTwo colOne year
1      6      1 2005
2      5      2 2005
3      4      3 2005

suggesting that the code gets through that part fine, because that's exactly what I want df_2004 and df_2005 to look like respectively. But df_2004 and df_2005 are not actually changed, which is what I want.

Are your data frames actually in a list? What's the name of the list? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Nov 04 '21 at 04:43

score 0 · Answer 1 · answered Nov 04 '21 at 04:44


Suppose you have two lists of data frames, d_2018 and d_2019. Using lapply,

d_2018 <- lapply(d_2018, function(x){
  x$year <- 2018
  x
})

d_2019 <- lapply(d_2019, function(x){
  x$year <- 2019
  x
})

will help.
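For the setup in the question, where the data frames sit directly in the workspace rather than in a list, a minimal base R sketch might look like the following. It assumes every data frame name ends in an underscore followed by a four-digit year (a name that does not match would get NA with a warning). The helper name add_year_from_name is hypothetical, not part of any package.

```r
# Dummy data frames from the question
colOne <- c(1, 2, 3)
colTwo <- c(6, 5, 4)
df_2004 <- data.frame(colOne, colTwo)
df_2005 <- data.frame(colTwo, colOne)

# Current environment; this is the global workspace when run at top level
env <- environment()

# Hypothetical helper: take the trailing four digits of the name as the year
add_year_from_name <- function(df, name) {
  df$year <- as.numeric(sub(".*_([0-9]{4})$", "\\1", name))
  df
}

# Collect every data frame in the environment into a named list
df_list <- Filter(is.data.frame, mget(ls(envir = env), envir = env))

# Add the year column to each data frame, pairing it with its own name
df_list <- Map(add_year_from_name, df_list, names(df_list))

# Write the modified data frames back over the originals
invisible(list2env(df_list, envir = env))
```

Working on a named list and copying it back with list2env() avoids calling assign() inside a loop. With more than 40 data frames, it may be simpler to keep them in df_list and skip the last step.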

answered Nov 04 '21 at 04:44

Park


Ronak Shah · Answer 2 · 2021-11-04T06:07:14.127


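Here is one way, using purrr, to add a new column derived from each data frame's name. This assumes data_2018 and data_2019 are each named lists of data frames, since imap() passes every element's name to the function.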

library(purrr)

year_data <- list(data_2018, data_2019)

res <- map(year_data, function(x) {
  imap(x, function(data, name) {
    transform(data, year = sub('.*?_', '', name))
  })
})
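Note that sub() returns a character string, so the year column created by the code above is character rather than numeric. A small sketch, using the data frame names from the question as assumed input, shows what the pattern extracts and how the result could be converted:

```r
# Names as in the question (assumed here for illustration)
nms <- c("df_2004", "df_2005")

# Strip everything up to and including the first underscore
year_chr <- sub(".*?_", "", nms)

# Convert to numeric if a numeric year column is wanted
year_num <- as.numeric(year_chr)
```

If a numeric column is preferred, the sub() call inside transform() could be wrapped in as.numeric() in the same way.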

edited Nov 04 '21 at 06:07

answered Nov 04 '21 at 06:00

Ronak Shah

