I want to subset a data frame by keeping or dropping columns depending on whether the column name contains one of several strings. The strings to match are stored in a separate character vector.

This is what I have now:

colstrings <- c('A', 'B', 'C')

for (i in colstrings){
   df <- df %>% select(-contains(i))
}

However, it feels like this shouldn't be done with a for loop. Any suggestions on how to make this code shorter?

joran.g

1 Answer

Here's an answer adapted from a previous SO post:

library(dplyr)

df <-
  tibble(
    ash = c(1, 2),
    bet = c(2, 3),
    can = c(3, 4)
  )

df

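substr_list <- c("sh", "an")

# Collapse the substrings into one regex ("sh|an") and keep
# every column whose name matches any of them
df %>%
  select(matches(paste(substr_list, collapse = "|")))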

See more here: select columns based on multiple strings with dplyr contains()
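
Since the loop in the question drops the matching columns rather than keeping them, here is a minimal sketch of the same idea with the selection negated. It reuses the `df` and `substr_list` from above; the names `dropped_regex` and `dropped_literal` are only illustrative. It also relies on `contains()` accepting a character vector and taking the union of the matches, which avoids building a regex at all.

```r
library(dplyr)

df <- tibble(
  ash = c(1, 2),
  bet = c(2, 3),
  can = c(3, 4)
)

substr_list <- c("sh", "an")

# Regex version: drop every column whose name matches "sh|an"
dropped_regex <- df %>%
  select(-matches(paste(substr_list, collapse = "|")))

# Literal version: contains() takes the whole vector, no loop or regex needed
dropped_literal <- df %>%
  select(-contains(substr_list))

names(dropped_regex)
names(dropped_literal)
```

Both calls leave only the `bet` column here. The two are not always interchangeable: `matches()` treats each string as a regular expression, so substrings containing characters such as `.` or `(` would need escaping, while `contains()` matches them literally.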

cardinal40