You can do that with a plain sub:
> sub(".*?chr(.*?)\\.recalibrated.*", "\\1", myvec)
[1] "10" "11" "Y"
The pattern matches any characters before the first chr, then matches and captures any characters up to the first .recalibrated, and then matches the rest of the string. Because the whole string is matched, the replacement keeps only what the backreference \1 (written "\\1" in an R string literal) inserts: the captured value.
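A minimal, self-contained run of the call above. The sample file names are made up for illustration; the real myvec comes from the question. One thing worth knowing is that sub returns a string unchanged when the pattern does not match, so the sketch also shows one way to turn such elements into NA:

```r
# Hypothetical input shaped like the question's data
myvec <- c("sample.chr10.recalibrated.vcf",
           "sample.chr11.recalibrated.vcf",
           "sample.chrY.recalibrated.vcf",
           "no_match_here.txt")

# Replace the whole string with the text captured between chr and .recalibrated
res <- sub(".*?chr(.*?)\\.recalibrated.*", "\\1", myvec)
print(res)  # the last element is returned as-is because nothing matched

# Optionally mark non-matching elements as missing
matched <- grepl("chr.*?\\.recalibrated", myvec)
res[!matched] <- NA_character_
print(res)
```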
As an alternative, use str_match from the stringr package:
> library(stringr)
> str_match(myvec, "chr(.*?)\\.recalibrated")[,2]
[1] "10" "11" "Y"
It returns the whole match together with every captured value, and it avoids the costly unanchored lookarounds that str_extract would need in order to return only the captured part.
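To make the comparison concrete, here is a small sketch that runs both approaches side by side. The input vector is hypothetical, and the lookaround pattern is one possible way to write the str_extract version, not the only one:

```r
library(stringr)

# Hypothetical input: one matching and one non-matching element
myvec <- c("sample.chr10.recalibrated.vcf", "no_match_here.txt")

# str_match returns a character matrix: column 1 is the whole match,
# column 2 is capture group 1; rows without a match are NA
m <- str_match(myvec, "chr(.*?)\\.recalibrated")
captured <- m[, 2]
print(captured)

# str_extract returns only the whole match, so the context has to be
# moved into a lookbehind and a lookahead
extracted <- str_extract(myvec, "(?<=chr).*?(?=\\.recalibrated)")
print(extracted)
```

Both calls give the same values here; the str_match version simply keeps the pattern easier to read.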
The pattern means:
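chr - matches the literal character sequence chr
(.*?) - matches and captures any characters other than a newline, as few as possible (if you need to match newlines too, add (?s) at the beginning of the pattern), up to the first
\\.recalibrated - the literal character sequence .recalibrated (the dot is escaped so that it matches a real dot, and the backslash is doubled inside an R string literal).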
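The effect of (?s) can be checked with a contrived string that has a newline inside the part to capture. This sketch uses stringr, whose ICU engine does not let . match a newline by default; other regex engines may behave differently, so treat it as an illustration for str_match only:

```r
library(stringr)

# Contrived example: a newline sits between "1" and "0"
x <- "chr1\n0.recalibrated"

# Without (?s), . cannot cross the newline, so there is no match
without_s <- str_match(x, "chr(.*?)\\.recalibrated")[, 2]

# With (?s), . also matches the newline and the capture spans both lines
with_s <- str_match(x, "(?s)chr(.*?)\\.recalibrated")[, 2]

print(is.na(without_s))
print(with_s)
```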