Imagine we want to find all of the FOOs and subsequent numbers in the string below and return them as a vector (apologies for unreadability, I wanted to make the point there is no regular pattern before and after the FOOs):

xx <- "xasdrFOO1921ddjadFOO1234dakaFOO12345ndlslsFOO1643xasdf"

We can use this to find one of them (taken from 1). Because the leading .* is greedy, it returns only the last match:

gsub(".*(FOO[0-9]+).*", "\\1", xx)
[1] "FOO1643"

However, I want to return all of them, as a vector.

I've thought of a complicated way to do it using strsplit() and gregexpr(), but I feel there is a better (and easier) way.

oguz ismail
Jim Bo

3 Answers

6

You may be interested in regmatches:

> regmatches(xx, gregexpr("FOO[0-9]+", xx))[[1]]
[1] "FOO1921"  "FOO1234"  "FOO12345" "FOO1643" 
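
For context, here is a minimal base R sketch of what the two calls do. gregexpr() reports where every match starts and how long it is, and regmatches() uses that information to cut out the substrings. The names m, starts, lens, hits and manual are only for illustration.

```r
xx <- "xasdrFOO1921ddjadFOO1234dakaFOO12345ndlslsFOO1643xasdf"

# gregexpr() returns a list with one element per input string;
# each element holds the start positions of all matches, with the
# match lengths stored in the "match.length" attribute
m <- gregexpr("FOO[0-9]+", xx)
starts <- as.integer(m[[1]])
lens <- attr(m[[1]], "match.length")

# regmatches() extracts the substrings described by m
hits <- regmatches(xx, m)[[1]]

# the same extraction done by hand from positions and lengths
manual <- substring(xx, starts, starts + lens - 1)
```

Since both functions are vectorised over the input, passing a character vector instead of a single string gives one list element per string, which is why the [[1]] is needed here.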
sebastian-c
  • Perfect, I didn't realise there was anything "built in". I should have read the help page for grep more closely though, I would have found this myself! -1 for me – Jim Bo Nov 30 '12 at 15:32
3
xx <- "xasdrFOO1921ddjadFOO1234dakaFOO12345ndlslsFOO1643xasdf"
library(stringr)
str_extract_all(xx, "(FOO[0-9]+)")[[1]]
#[1] "FOO1921"  "FOO1234"  "FOO12345" "FOO1643" 

This can take a vector of strings as well; the matches for each input string are then returned as a separate list element.
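
A short sketch of the vector case, assuming stringr is installed. The input vector strings is a made-up example, with a second element that contains no match, to show the shape of the result.

```r
library(stringr)

# hypothetical input: the second string has no FOO in it
strings <- c("aFOO12bFOO345", "no match here")
res <- str_extract_all(strings, "FOO[0-9]+")

res[[1]]  # matches found in the first string
res[[2]]  # empty character vector, since nothing matched

# flatten to one vector if the per-string grouping is not needed
all_matches <- unlist(res)
```

Keeping the list is useful when you need to know which input each match came from; unlist() is enough when you only want the matches themselves.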

jem77bfp
2

Slightly shorter version.

library(gsubfn)
strapplyc(xx,"FOO[0-9]*")[[1]]
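
Note that this pattern ends in * rather than the + used in the other answers. For the string in the question the result is the same, but on other input [0-9]* will also match a bare FOO with no digits after it. A small sketch of the difference, written in base R so it runs without gsubfn; the input yy is a made-up example.

```r
# hypothetical input containing one FOO without digits
yy <- "aFOObFOO12c"

# * allows zero digits, so the bare FOO is matched too
star <- regmatches(yy, gregexpr("FOO[0-9]*", yy))[[1]]

# + requires at least one digit, so only FOO12 is matched
plus <- regmatches(yy, gregexpr("FOO[0-9]+", yy))[[1]]
```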
G. Grothendieck
Wojciech Sobala