
How can I access all matches of a named capture group (i.e. label) using gregexpr() in R?

s <- "aaa123bbb345ccc"
p <- "(?<label>\\d+)"
m <- gregexpr(p, s, perl = TRUE)

I am interested in printing the matches in m by referencing the group name label. This is easy to do in C#, but I'm struggling in R and cannot figure out how to do it from the CRAN docs.

Edit: C# code below requested by G Grothendieck:

string s = "aaa123bbb345ccc";
string p = @"(?<label>\d+)";
Regex r = new Regex(p);
Match m = r.Match(s);
if (m.Success)
{
    Console.WriteLine(m.Groups["label"].Value);
}
Hahnemann
  • ["The best way to use regular expressions with R is to pass the perl=TRUE parameter. This tells R to use the PCRE regular expressions library."](http://www.regular-expressions.info/rlanguage.html) – marekful Dec 02 '16 at 22:34
  • See `?regmatches` – nrussell Dec 02 '16 at 22:36
  • Can you use `["label"]` ? – Nicolas Dec 02 '16 at 22:36
  • `regmatches(s, m)` or avoid the awkward two-function approach with `stringr::str_extract_all(s, p)` – alistaire Dec 02 '16 at 22:37
  • When you use `gregexpr` with `regmatches`, only matches are preserved. The easiest is to use stringr `str_match_all`. `regexec` could be an alternative, but it does not work with PCRE regexps. – Wiktor Stribiżew Dec 02 '16 at 22:40
  • Maybe a dupe of [Regex group capture in R with multiple capture-groups](http://stackoverflow.com/questions/952275/regex-group-capture-in-r-with-multiple-capture-groups) – Wiktor Stribiżew Dec 02 '16 at 22:53

1 Answer

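With perl = TRUE, gregexpr() attaches capture.start and capture.length attributes to the match data, with one column per capture group. Indexing those columns by group name returns only the substrings associated with label.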

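# start positions and lengths of the "label" group, one entry per match
st <- attr(m[[1]], "capture.start")[, "label"]
len <- attr(m[[1]], "capture.length")[, "label"]
# cut each captured substring out of s
substring(s, st, st + len - 1)
## [1] "123" "345"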
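
The same idea can be generalized to patterns with several named groups. The sketch below is only an illustration, not part of the original answer: the helper name extract_named_groups and the second group word are hypothetical, and it assumes a single input string matched with gregexpr(..., perl = TRUE).

```r
# Hypothetical helper (not part of base R): collect every named capture
# group from a gregexpr(..., perl = TRUE) result for a single string.
extract_named_groups <- function(s, m) {
  match_info  <- m[[1]]
  # gregexpr() reports -1 when there is no match at all
  if (match_info[1] == -1) return(list())
  starts      <- attr(match_info, "capture.start")
  lens        <- attr(match_info, "capture.length")
  group_names <- attr(match_info, "capture.names")
  # unnamed groups have an empty name; keep only the named ones
  named <- group_names[nzchar(group_names)]
  # one character vector per named group, one element per match
  sapply(named, function(g) {
    substring(s, starts[, g], starts[, g] + lens[, g] - 1)
  }, simplify = FALSE)
}

s <- "aaa123bbb345ccc"
p <- "(?<word>[a-z]+)(?<label>\\d+)"   # "word" is an assumed extra group
m <- gregexpr(p, s, perl = TRUE)

groups <- extract_named_groups(s, m)
groups$label
## [1] "123" "345"
groups$word
## [1] "aaa" "bbb"
```

Returning a named list keeps the access pattern close to the C# m.Groups["label"] style: groups[["label"]] gives all captures for that group across matches.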
G. Grothendieck