There are a number of solutions for using grepl(), but none that I have come across so far solves my problem. I have two data frames. The first, labelled x, contains a set of combinations associated with a letter:
x <- structure(list(variable = c("A", "B", "C", "D"), combinations = c("16, 17, 18",
"17,18", "16,18", "12,3")), class = "data.frame", row.names = c(NA,
-4L))
> x
variable combinations
1 A 16, 17, 18
2 B 17,18
3 C 16,18
4 D 12,3
The second data frame is the results. It is a set of observations showing the letters that a species interacted with. Below is just one set of observations:
y <- structure(list(variable = c("A, C", NA, NA), species = c("16",
"17", "18"), active = c("16", NA, NA)), class = "data.frame", row.names = c(NA,
-3L))
> y
variable species active
1 A, C 16 16
2 <NA> 17 <NA>
3 <NA> 18 <NA>
This was the original structure of y:
> y
variable species.active species.present
1 A, C 16 17,18
The structure was changed so that each species has its own row, which lets me add more columns associated with each species, so the current structure serves a specific purpose.
What I want is a binary column (T/F or 0/1) showing whether or not each species is in the combinations associated with the variable.
This is what I have managed so far:
library(zoo)
library(dplyr)
library(tidyr)

# carry the last observation forward so that each species is assigned the same variables
y <- y %>%
  mutate(variable = zoo::na.locf(variable))

# split each row so that every variable gets its own row
y <- separate_rows(y, variable)

# (intended) match x$variable by y$variable to add the associated combinations as a new column in y
y$combinations <- ifelse(y$variable %in% x$variable, x$combinations)

# (intended) return TRUE or FALSE depending on whether each species is in the combination
y$type <- grepl(y$species, y$combinations)
y
> y
variable species active combinations type
<chr> <chr> <chr> <chr> <lgl>
1 A 16 16 16, 17, 18 TRUE
2 C 16 16 17,18 FALSE
3 A 17 NA 16,18 TRUE
4 C 17 NA 12,3 FALSE
5 A 18 NA 16, 17, 18 TRUE
6 C 18 NA 17,18 FALSE
As you can see, the combinations are wrong (for example, row 2 has variable C but shows the combination for B), and grepl() returns incorrect T/F values (see row 3, where it says TRUE even though '17' is not in the combination shown).
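As far as I can tell, both problems come from how the two calls treat vectors. The sketch below uses only base R and plain vectors copied from the printed output above (the names by_position, by_name and type_as_computed are just illustrative). It assumes that ifelse(test, yes) recycles `yes` by position rather than looking anything up, and relies on the documented behaviour that grepl() uses only the first element of `pattern`:

```r
# Values copied from the printed data frames above
variable   <- c("A", "C", "A", "C", "A", "C")
species    <- c("16", "16", "17", "17", "18", "18")
x_variable <- c("A", "B", "C", "D")
x_combos   <- c("16, 17, 18", "17,18", "16,18", "12,3")

# Filling by position: the combinations are recycled to the number of rows,
# so row i gets element i of x_combos (wrapping around after the 4th)
by_position <- rep(x_combos, length.out = length(variable))

# A lookup by name would use match() to find the row of x for each variable
by_name <- x_combos[match(variable, x_variable)]

# grepl() only uses the first pattern, so every row is effectively
# tested against species[1], i.e. "16"
type_as_computed <- grepl(species[1], by_position)

data.frame(variable, species, by_position, by_name, type_as_computed)
```

The by_position and type_as_computed columns reproduce the wrong columns in my output, while by_name shows what I would expect the combinations column to contain.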
If anyone could help, that would be greatly appreciated.
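For reference, one possible approach is sketched below in base R. It is a minimal sketch under assumptions, not a definitive answer: it assumes y has already been through na.locf() and separate_rows() (rebuilt by hand here, without the active column), that combinations are always comma-separated, and that an exact match on the species code is what is wanted. The name combo_list is only illustrative.

```r
x <- data.frame(
  variable     = c("A", "B", "C", "D"),
  combinations = c("16, 17, 18", "17,18", "16,18", "12,3"),
  stringsAsFactors = FALSE
)

# y as it looks after na.locf() and separate_rows() (assumed, rebuilt by hand)
y <- data.frame(
  variable = c("A", "C", "A", "C", "A", "C"),
  species  = c("16", "16", "17", "17", "18", "18"),
  stringsAsFactors = FALSE
)

# Look up each row's combination by variable name rather than by position
y$combinations <- x$combinations[match(y$variable, x$variable)]

# Split every combination on commas, then test membership row by row;
# trimws() removes the spaces after commas, and %in% compares whole codes,
# so "1" would not match "16" the way a regular expression would
combo_list <- strsplit(y$combinations, ",")
y$type <- mapply(
  function(sp, combo) sp %in% trimws(combo),
  y$species, combo_list,
  USE.NAMES = FALSE
)

y
```

With the example data, the only FALSE should be row 4 (variable C, species 17), since C maps to "16,18". A dplyr left_join() on variable would be an alternative to the match() lookup if staying inside the pipeline is preferred.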