From Ruby docs:
A character class may contain another character class. By itself this isn't useful because [a-z[0-9]] describes the same set as [a-z0-9]. However, character classes also support the && operator which performs set intersection on its arguments.
So, "punctuation but not apostrophe" is:
[[:punct:]&&[^']]
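For a quick illustration, here is a minimal sketch of that pattern applied with `String#scan` and `String#gsub`. The sample string and the variable names are made up for the example.

```ruby
# Hypothetical sample input, chosen only for illustration.
sample = "Mr. O'Brien! Please don't go."

# Punctuation intersected with "anything but an apostrophe".
punct_no_apostrophe = /[[:punct:]&&[^']]/

# Collect every punctuation character except the apostrophes.
matches = sample.scan(punct_no_apostrophe)
# => [".", "!", "."]

# Strip that punctuation while keeping the apostrophes intact.
stripped = sample.gsub(punct_no_apostrophe, "")
# => "Mr O'Brien Please don't go"

puts matches.inspect
puts stripped
```

The apostrophes in "O'Brien" and "don't" survive because they fall outside the intersected set.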
EDIT: At revo's request in the question comments, I benchmarked the alternatives. On my machine, lookahead comes out ~10% slower than the intersection, and lookbehind ~20% slower:
require 'benchmark'

N = 1_000_000
STR = "Mr. O'Brien! Please don't go, Mr. O'Brien!"

# Scan STR with the given regex N times.
def test(bm, re)
  N.times {
    STR.scan(re).size
  }
end

Benchmark.bm do |bm|
  bm.report("intersection") { test(bm, /[[:punct:]&&[^']]/) }
  bm.report("lookahead") { test(bm, /(?!')[[:punct:]]/) }
  bm.report("lookbehind") { test(bm, /[[:punct:]](?<!')/) }
end
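Before reading much into timings, it may be worth checking that the three patterns actually match the same characters. A minimal sketch, reusing the benchmark string; the `patterns` hash and `results` variable are just names picked for this check:

```ruby
str = "Mr. O'Brien! Please don't go, Mr. O'Brien!"

# The three variants from the benchmark, keyed by label.
patterns = {
  "intersection" => /[[:punct:]&&[^']]/,
  "lookahead"    => /(?!')[[:punct:]]/,
  "lookbehind"   => /[[:punct:]](?<!')/
}

# Run each pattern over the same string and collect its matches.
results = patterns.transform_values { |re| str.scan(re) }

results.each { |label, found| puts "#{label}: #{found.inspect}" }

# All three should agree: [".", "!", ",", ".", "!"]
puts results.values.uniq.size == 1 ? "all patterns agree" : "patterns differ"
```

If the three result arrays are identical, the benchmark is comparing equivalent matches and only the matching strategy differs.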