I need a way to use alternation ('or', i.e. `|`) between entire words outside the capture group in tidyr::extract, as in the following example.
Suppose I have these strings:
string1 <- data.frame(col = "asdnajksdn**thingA**asdnaksjdnajksn")
string2 <- data.frame(col = "asdnajksdn**itemB**asdnaksjdnajksn")
I want to use tidyr::extract() to extract 'A' and 'B' with the same regular expression, but I DON'T want to extract 'item' or 'thing'. The desired output would be:
string1 %>% extract(col = 'col', regex = regex, into = "NewColumn")
> NewColumn
"A"
string2 %>% extract(col = 'col', regex = regex, into = "NewColumn")
> NewColumn
"B"
The answer would be something like this:
extract(string1, col = "col", into = "NewColumn",
regex = "(item)|(thing)(.)")
But I can't do that, because the words are captured too and it would result in something like:
NewColumn NA
thing A
I know that in this example I could just use something like
"[ti][ht][ie][nm]g?(.)"
but I'm looking for a more general solution.
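
To make the goal concrete, here is a minimal sketch of the behaviour I'm after. It uses a non-capturing group, `(?:...)`, so the words take part in the match but are not returned. It assumes the second string contains 'itemB' (as the desired output implies) and that the regex engine behind tidyr::extract accepts `(?:...)`. The name `strings` is just for illustration.

```r
library(tidyr)

# Both example rows in one data frame (illustrative name `strings`).
strings <- data.frame(
  col = c("asdnajksdn**thingA**asdnaksjdnajksn",
          "asdnajksdn**itemB**asdnaksjdnajksn"),
  stringsAsFactors = FALSE
)

# (?:item|thing) matches either word without capturing it;
# (.) is the only capture group, so `into` needs just one name.
result <- tidyr::extract(strings, col = "col", into = "NewColumn",
                         regex = "(?:item|thing)(.)")

result$NewColumn
```

With these two rows, `NewColumn` should hold "A" and "B". More words can be added to the alternation without changing the number of capture groups.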