This seems simple enough, but I could not find a solution on this site. I would like to remove every column from a dataframe in which any value contains the partial string "gs://". The table has hundreds of columns.
- Replace image in your question with its text and use its field separator. – Cyrus Jan 31 '23 at 22:03
- In the future, please post copy-pasteable data using `dput()`, [not an image](https://meta.stackoverflow.com/a/285557/17303805). See [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/17303805) for more details. – zephryl Jan 31 '23 at 22:07
3 Answers
Using this example data:
dat <- data.frame(
x = c("gs://red", "orange"),
y = c("yellow", "gs://green"),
z = c("blue", "indigo")
)
dat
# x y z
# 1 gs://red yellow blue
# 2 orange gs://green indigo
Index the dataframe using `grepl()`:
dat[!sapply(dat, \(x) any(grepl("gs://", x)))]
# z
# 1 blue
# 2 indigo
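
A possible variation on the approach above, sketched as a reusable base R helper. The function name `drop_cols_matching` is hypothetical and not part of the original answer; it uses `fixed = TRUE` so the pattern is treated as a literal string rather than a regular expression, and `vapply()` so the result is always a logical vector.

```r
# Hypothetical helper: drop every column in which any value contains `pattern`
drop_cols_matching <- function(df, pattern) {
  # One TRUE/FALSE per column; fixed = TRUE matches the pattern literally
  has_match <- vapply(
    df,
    function(col) any(grepl(pattern, col, fixed = TRUE)),
    logical(1)
  )
  # drop = FALSE keeps a data frame even if only one column remains
  df[, !has_match, drop = FALSE]
}

dat <- data.frame(
  x = c("gs://red", "orange"),
  y = c("yellow", "gs://green"),
  z = c("blue", "indigo")
)

res <- drop_cols_matching(dat, "gs://")
names(res)
```

Because `grepl()` returns `FALSE` for `NA` inputs, columns with missing values should not cause trouble here.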

zephryl
- @RitchieSacramento I see, thank you! Changed to a safer approach using `sapply()`. – zephryl Jan 31 '23 at 22:43
- I like the other approach! Why wouldn't it find the character "c"? – Moneeb Irshad Bajwa Jan 31 '23 at 22:44
- The problem is that it would find `"c"`s even if not present in your data. This is because `grepl()` coerces the dataframe using `as.character()`, which (to my surprise) yields `"c(\"gs://red\", \"orange\")" "c(\"yellow\", \"gs://green\")" "c(\"blue\", \"indigo\")"`. – zephryl Jan 31 '23 at 22:46
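
The coercion issue described in the comment above can be illustrated with a short sketch. It reuses the answer's example data; the variable names `coerced`, `whole_df`, and `per_column` are chosen here for illustration only.

```r
dat <- data.frame(
  x = c("gs://red", "orange"),
  y = c("yellow", "gs://green"),
  z = c("blue", "indigo")
)

# as.character() on a data frame deparses each column into one string,
# e.g. "c(\"gs://red\", \"orange\")", so every element starts with "c("
coerced <- as.character(dat)

# Passing the whole data frame to grepl() searches those deparsed strings,
# so "c" appears to match in every column
whole_df <- grepl("c", dat)

# Searching column by column looks at the actual values instead
per_column <- sapply(dat, function(col) any(grepl("c", col)))
```

None of the example values contains the letter "c", so the per-column check reports no matches, while the whole-dataframe call matches the `c(` introduced by deparsing.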
Alternatively, with `dplyr::select_if()`:
library(dplyr)
library(stringr) # provides str_detect()
dat %>% select_if(~!any(str_detect(.,'gs://')))
Created on 2023-01-31 with reprex v2.0.2
z
1 blue
2 indigo

jkatam
Using `select` with `where`:
library(dplyr)
library(stringr)
dat %>%
  select(where(~ all(str_detect(.x, fixed("gs://"), negate = TRUE))))
-output
z
1 blue
2 indigo
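
A possible extension, not part of the original answer, for data that contains missing values or non-character columns. `str_detect()` returns `NA` for `NA` input, so a predicate built on `all()` or `any()` may itself evaluate to `NA` instead of `TRUE`/`FALSE`. The sketch below assumes a small made-up data frame, `dat_na`, checks only character columns, and passes `na.rm = TRUE` to `any()`.

```r
library(dplyr)
library(stringr)

# Made-up example data with missing values and a numeric column
dat_na <- data.frame(
  x = c("gs://red", NA),
  y = c("yellow", NA),
  n = c(1, 2)
)

res_na <- dat_na %>%
  # Drop character columns where any non-missing value contains "gs://"
  select(!where(~ is.character(.x) &&
                  any(str_detect(.x, fixed("gs://")), na.rm = TRUE)))

names(res_na)
```

Here `x` is dropped because one of its values matches, while `y` and `n` are kept.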

akrun