
I have more than 100 dataframes loaded into R. From every dataframe, I want to remove all columns whose names contain a certain pattern, in the example case below "abc".

df1 <- data.frame(`abc_1` = rep(3, 5), `b` = seq(1, 5, 1), `c` = letters[1:5])
df2 <- data.frame(`d` = rep(5, 5), `e_abc` = seq(2, 6, 1), `f` = letters[6:10])
df3 <- data.frame(`g` = rep(5, 5), `h` = seq(2, 6, 1), `i_a_abc` = letters[6:10])

I would thus like to remove the column abc_1 in df1, e_abc in df2 and i_a_abc in df3. How could this be done?

Pontus Hedberg

1 Answer


Do all of your dataframes start with or contain a shared string (e.g., df)? If yes, then it might be easier to put all your dataframes in a list by using that shared string and then apply the function to remove the abc columns in every dataframe in that list.

You can then read your dataframes back into your environment with list2env(), but it probably is in your interest to keep everything in a list for convenience.

library(dplyr)
df1 <- data.frame(`abc_1` = rep(3, 5), `b` = seq(1, 5, 1), `c` = letters[1:5])
df2 <- data.frame(`d` = rep(5, 5), `e_abc` = seq(2, 6, 1), `f` = letters[6:10])
df3 <- data.frame(`g` = rep(5, 5), `h` = seq(2, 6, 1), `i_a_abc` = letters[6:10])

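# Names of objects in the global environment that look like df1, df2, ...
# (anchored so that helper objects such as dflist are not picked up on a re-run)
dfpattern <- grep("^df[0-9]+$", names(.GlobalEnv), value = TRUE)
# mget() already returns a named list, so do.call("list", ...) is not needed
dflist <- mget(dfpattern, envir = .GlobalEnv)

# Drop every column whose name contains "abc"
dflist <- lapply(dflist, function(x) x %>% select(!contains("abc")))
# Optional: write the cleaned dataframes back over the originals
list2env(dflist, envir = .GlobalEnv)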
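
If the dataframes do not share a naming pattern, one possible variant is to pick up every object in the environment that is a dataframe, whatever it is called. The following is only a sketch in base R: `drop_matching_cols`, `demo_env`, and the object names `sales`, `costs`, and `note` are hypothetical and chosen for illustration, and a scratch environment stands in for `.GlobalEnv`.

```r
# Hypothetical helper: drop columns whose names contain `pattern` from every
# dataframe found in `env`. Non-dataframe objects are ignored.
drop_matching_cols <- function(pattern, env = .GlobalEnv) {
  objs <- mget(ls(envir = env), envir = env)   # all objects, as a named list
  dfs  <- Filter(is.data.frame, objs)          # keep only the dataframes
  lapply(dfs, function(x) {
    # fixed = TRUE treats the pattern as a literal string, not a regex
    x[, !grepl(pattern, names(x), fixed = TRUE), drop = FALSE]
  })
}

# Demo in a scratch environment (assumption: stands in for the real workspace)
demo_env <- new.env()
assign("sales", data.frame(abc_1 = rep(3, 5), b = 1:5), envir = demo_env)
assign("costs", data.frame(d = rep(5, 5), e_abc = 2:6), envir = demo_env)
assign("note", "not a dataframe", envir = demo_env)

cleaned <- drop_matching_cols("abc", env = demo_env)

# Optional: overwrite the originals with the cleaned versions
list2env(cleaned, envir = demo_env)
```

Using `drop = FALSE` keeps the result a dataframe even when only one column survives.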
jrcalabrese
  • This is a very good suggestion; however, I realize my example was not the best, since not all of my dataframes contain a shared string. – Pontus Hedberg Feb 12 '23 at 19:21
  • `mget(ls(pattern = 'df'))` should get the dfs in a list – Onyambu Feb 12 '23 at 19:24
  • @PontusHedberg, are the only objects loaded into your environment right now the 100+ dataframes? Also, have you looked into the option of reading all your dataframes as a list directly into R (i.e., not loading them in as dataframes at all)? That way you can bypass the hard part and just use the code starting with `lapply`. – jrcalabrese Feb 12 '23 at 19:26
  • That is actually a great idea. I tried to find out how to read multiple Excel files into a list directly, but could not find an answer. Do you know how to do this? – Pontus Hedberg Feb 12 '23 at 19:31
  • [Have you looked here?](https://stackoverflow.com/a/11433532/14992857) This does assume that all your files are in the same folder/location. Instead of `.csv`, use `.xlsx` (or whatever is appropriate) and instead of `read.delim`, try [`read_excel`](https://readxl.tidyverse.org/reference/read_excel.html) or [`read.xlsx`](https://ycphs.github.io/openxlsx/reference/read.xlsx.html). – jrcalabrese Feb 12 '23 at 20:05
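
The approach from the comments (read every file in one folder straight into a named list, then clean the list) might look roughly like this. It is a sketch under assumptions: the folder `many_files_demo` and the file names are made up, and small CSV files are used so the example runs without extra packages. For `.xlsx` files, the pattern would become `"\\.xlsx$"` and `read.csv` would be swapped for `readxl::read_excel`.

```r
# Scratch folder with two small CSV files standing in for the real data
# (assumption: all files live in the same folder)
data_dir <- file.path(tempdir(), "many_files_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(abc_1 = rep(3, 5), b = 1:5),
          file.path(data_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(d = rep(5, 5), e_abc = 2:6),
          file.path(data_dir, "second.csv"), row.names = FALSE)

# Full paths of every CSV file in the folder
files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file into one list element, named after the file (no extension)
dflist <- lapply(files, read.csv)
names(dflist) <- tools::file_path_sans_ext(basename(files))

# Drop every column whose name contains "abc"
dflist <- lapply(dflist, function(x) {
  x[, !grepl("abc", names(x), fixed = TRUE), drop = FALSE]
})
```

Because the dataframes never exist as separate objects, no name matching in the environment is needed.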