I have a data frame with roughly 20,000 rows and 215 columns and need to find out in which columns certain keywords occur (if they occur at all).
There are lots of suggestions for partial matches in a specified column, for example:
"Selecting data frame rows based on partial string match in a column"
Alas, none of these approaches seems to allow searching ALL columns. One option is of course to write several nested loops.
However, I wonder whether there is a more efficient way, ideally an already existing function, to a) search all columns of a data frame (or: all lists within a list), and b) search not just for one phrase but for a list of keywords.
For example
# some data
Species <- c("Acanthurus dussumieri", "Callionymus maculatus", "Eviota prasina", "Gymnogobius urotaenia", "Kyphosus bigibbus")
Column1 <- c(60.1, 106, 78.6, 21.5, 71)
ColumnEgg <- c(11.2, 14.5, 12, 8, NA)
Add_Info <- c("Spawns when water temperatures reach above 15°C.", NA, "females deposit eggs of 1.5 mm diameter on plants. Larvae hatch after 3-13 days.", NA, "55 cm TL newborn weighs 380 g")
df <- data.frame(Species, Column1, ColumnEgg, Add_Info)
df
Searching is easy if one knows which column to look in for a pattern, e.g.
library(stringr)
library(dplyr)
df %>%
  filter(str_detect(Species, "Aaptosyax"))
However, how can I search all columns for a phrase or a list of keywords, with something like
df%>%
filter(str_detect(df[1:4],"Aaptosyax"))
or
keywords <- c("Aaptosyax", "egg")
df%>%
filter(str_detect(df[1:4],keywords))
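For reference, here is a minimal base R sketch of the behaviour I am after. It is not necessarily the efficient or idiomatic way. It assumes that it is acceptable to convert every column to character, to combine the keywords into a single regular expression, and to match case-insensitively; `pattern` and `hits` are names made up for illustration.

```r
# Same example data as above
Species <- c("Acanthurus dussumieri", "Callionymus maculatus", "Eviota prasina",
             "Gymnogobius urotaenia", "Kyphosus bigibbus")
Column1 <- c(60.1, 106, 78.6, 21.5, 71)
ColumnEgg <- c(11.2, 14.5, 12, 8, NA)
Add_Info <- c("Spawns when water temperatures reach above 15°C.", NA,
              "females deposit eggs of 1.5 mm diameter on plants. Larvae hatch after 3-13 days.",
              NA, "55 cm TL newborn weighs 380 g")
df <- data.frame(Species, Column1, ColumnEgg, Add_Info)

keywords <- c("Aaptosyax", "egg")

# Assumption: keywords are plain words, so joining them with "|" gives a
# regular expression meaning "any of the keywords"
pattern <- paste(keywords, collapse = "|")

# Logical matrix with one row per data frame row and one column per data
# frame column; grepl() returns FALSE for NA entries
hits <- sapply(df, function(col) grepl(pattern, as.character(col), ignore.case = TRUE))

# Columns whose values contain at least one keyword
# (only the cell values are searched, not the column names)
names(df)[colSums(hits) > 0]

# Rows in which any column contains at least one keyword
df[rowSums(hits) > 0, ]
```

With the example data this should flag only the Add_Info column and the third row ("Eviota prasina"), because "egg" matches "eggs" and "Aaptosyax" does not occur anywhere.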
Thanks a lot for any help!