Yup. `[:digit:]` is the named class (see `?regex`) you're searching for, but those named classes have to be enclosed in brackets.
– Benjamin Jan 13 '22 at 20:23
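A minimal sketch of the difference, assuming base R's default regex engine. The sample vector `x` and the variable names are made up for illustration.

```r
# Hypothetical sample input, chosen only to show the difference.
x <- c("1", "d", "a")

# Without the outer brackets this is an ordinary bracket expression that
# lists the literal characters ":", "d", "i", "g" and "t".
bare <- grepl("[:digit:]", x)

# With the outer brackets, the POSIX character class is used.
wrapped <- grepl("[[:digit:]]", x)

print(bare)     # matches only "d"
print(wrapped)  # matches only "1"
```

The first call matches nothing in `"1"` and raises no error, which is probably why this mistake is easy to miss.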
[*POSIX character classes must be inside bracket expressions*](https://stackoverflow.com/a/42013383/3832970)
– Wiktor Stribiżew Jan 13 '22 at 20:23
It can be confusing, but note that the named class can be used "positively" as in `[[:digit:]]`, "negatively" as in `[^[:digit:]]`, or as part of a bigger class such as `[^[:digit:]_+a-f]`.
– r2evans Jan 13 '22 at 20:24
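The three forms can be compared side by side. This is only a sketch with a made-up input vector `x`, run against base R's default engine.

```r
# Hypothetical sample input: a digit, a letter in a-f, an underscore, and "z".
x <- c("7", "b", "_", "z")

# The class inside a plain bracket expression: matches digits.
plain <- grepl("[[:digit:]]", x)

# The class inside a negated bracket expression: matches non-digits.
negated <- grepl("[^[:digit:]]", x)

# The class combined with other items in one negated bracket expression:
# matches anything that is not a digit, "_", "+", or a letter from a to f.
combined <- grepl("[^[:digit:]_+a-f]", x)

print(plain)     # only "7"
print(negated)   # everything except "7"
print(combined)  # only "z"
```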
@r2evans There is no such thing as a "named class". `[:name:]` is a *POSIX character class*. `[...]` is a *bracket expression*. POSIX character classes are never positive or negative; there are normal and negated bracket expressions. There can be *reverse* POSIX character classes, but those are not POSIX compliant; they look like `[:^alpha:]`.
– Wiktor Stribiżew Jan 13 '22 at 20:26
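For illustration, here is a sketch of a negated bracket expression next to a negated class. It assumes PCRE (selected in base R with `perl = TRUE`), where a negated class is written with the caret after the first colon, as in `[[:^alpha:]]`. That form is a Perl/PCRE extension, not POSIX. The vector `x` is made up.

```r
# Hypothetical sample input.
x <- c("a", "1", "_")

# POSIX-compliant: a negated bracket expression containing the class.
negated_bracket <- grepl("[^[:alpha:]]", x)

# PCRE extension (not POSIX): the class itself is negated.
negated_class <- grepl("[[:^alpha:]]", x, perl = TRUE)

print(negated_bracket)
print(negated_class)
```

Both calls should flag the same elements here, namely everything except `"a"`. The second pattern is only run with `perl = TRUE` in this sketch because other engines are not guaranteed to accept it.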
[`?regex`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html) says *"Certain named classes of characters are predefined"*. Is there a better way to refer to it that is compatible with R's regex and regex as a whole? Or should I have extended that to say "named class of characters"? I'm comfortable with regex but certainly not a guru on it; I'd prefer to say the right thing that is both R-friendly and POSIX-compliant.
– r2evans Jan 13 '22 at 20:28
@r2evans From everything I have read in regex docs and references on the Web, the terminology for POSIX and non-POSIX regex differs. Here, the right term is *POSIX character class*, and thus all other terms should be POSIX-"friendly". In R, there are 4 different regex engines used across multiple packages. `grepl("[:digit:]", '1')` implies the TRE library is used. So, we are talking about POSIX regex here (as TRE is an obsolete regex library based on POSIX regex with some extensions).
– Wiktor Stribiżew Jan 13 '22 at 20:30
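As a rough sketch of how the engine is chosen in base R: a call without `perl = TRUE` uses the default engine (TRE, per `?regex`), and `perl = TRUE` switches to PCRE. The input vector `x` is made up, and engines used by add-on packages are not shown.

```r
# Hypothetical sample input.
x <- c("a1", "b")

# Default engine: the POSIX character class inside a bracket expression.
tre_posix <- grepl("[[:digit:]]", x)

# PCRE: the same POSIX-style pattern is accepted here too.
pcre_posix <- grepl("[[:digit:]]", x, perl = TRUE)

# PCRE: the Perl-style shorthand for a digit.
pcre_shorthand <- grepl("\\d", x, perl = TRUE)

print(tre_posix)
print(pcre_posix)
print(pcre_shorthand)
```

For plain ASCII input like this, all three calls should agree. Differences between the engines tend to show up in the extensions each one supports.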
It seems that the differences are in verbiage. `?regex` references "character classes" and then "named classes ... POSIX locale"; I've seen "reverse" before, but the same doc references "negation", as in `its negation ('[^[:alnum:]_]')`. I really do not want to start or be part of a war on this, and I respect your experience on this ... I just contend that my statement was based on one perspective after reading that doc. I'll keep in mind that it is just one perspective, and not guaranteed to be perfectly aligned with the POSIX standard (in verbiage if not in execution). @WiktorStribiżew
– r2evans Jan 13 '22 at 20:34
@r2evans I do not think there is any "war" here; it is just that having so many regex libraries in R makes it a challenge to describe regex usage in the R docs. That was probably the result of some compromise. In my opinion, the R docs contain the least understandable regex description, and only in an R "close-to-official" online resource have I found bad/erroneous regexp examples. And it is not me; it is mostly people who use POSIX regex in *nix tools who start disputing the right terminology for character classes. I just wanted to shed light on this here.
– Wiktor Stribiżew Jan 13 '22 at 20:38