Use a string to hold those characters and interpolate that into regexes as needed. Ruby is trying to cover some bases with (?-mix:), but it isn't anticipating that the regex is going into a character set inside the other regex.
Background Info
Here's what's really happening:
In many cases, interpolating a regex into a regex makes sense. For example:
a = /abc/ #/abc/
b = /#{a}#{a}/ #/(?-mix:abc)(?-mix:abc)/
'hhhhabcabchthth'.gsub(/abcabc/, '_') # "hhhh_hthth"
'hhhhabcabchthth'.gsub(b, '_') # "hhhh_hthth"
It works as expected. The whole (?-mix: thing is a way of encapsulating the rules for a, just in case b has different flags. a is case sensitive, because that is the default. But if b were set to case insensitive, the only way for a to keep matching what it matched before is to make sure it stays case sensitive, using -i. Anything inside (?-i:) after the colon is matched case-sensitively. The following makes this clearer:
e = /a/i # e is made to be case insensitive with the /i
/#{e}/ # /(?i-mx:a)/
You can see above that when interpolating e into something, you now have (?i-mx:). Now the i is to the left of the -, which means it turns case insensitivity on instead of off (temporarily), in order for e to match as it normally would.
Also, in order to avoid messing up the capture order, (?: is used, which makes it a non-capturing group. All of that is a rough attempt to make the a and e variables match what you expect them to match when you stick them into a larger regex.
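Here is a small sketch of both effects, reusing a and e from above. The names outer, m and the sample strings are only for illustration:

```ruby
a = /abc/
e = /a/i

# Interpolation uses Regexp#to_s, which wraps the pattern together with its flags.
a_wrapped = a.to_s                    # "(?-mix:abc)"
e_wrapped = e.to_s                    # "(?i-mx:a)"

# e stays case insensitive, while the literal "b" around it stays case sensitive.
outer = /#{e}b/
matches_mixed = outer.match?('Ab')    # true
matches_upper = outer.match?('AB')    # false

# The wrapper is non-capturing, so group 1 is still the explicit group.
m = /#{a}(\d+)/.match('abc123')
first_group = m[1]                    # "123"
```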
Unfortunately, if you put it inside a character set, meaning [], this strategy completely fails. [(?-mix:)] is now interpreted completely differently: every character of the wrapper becomes a member of the set, and ?-m is read as a range. In your negated set, [^?-m] indicates everything that is NOT between "?" and "m" (inclusive), which means, for example, that the letter "c" is no longer among the characters being matched. That is why "c" doesn't get replaced with an underscore, as you see in your example. You can see the same thing happening with the letter "x". It also doesn't get replaced with an underscore, because it is a literal member of the negated set, and therefore not among the characters being matched.
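A minimal reproduction, assuming v is a regex that holds a character set (the short ASCII set and the sample string here are stand-ins, not the exact ones from your code):

```ruby
# Assumption: v is a regex containing a character set, as in the question.
v = /[aeiou]/

# Interpolating v drops its (?-mix:...) wrapper inside the brackets.
broken = /[^#{v}]/
broken_source = broken.source                        # "[^(?-mix:[aeiou])]"

# What was probably intended: every character that is not a vowel.
intended = /[^aeiou]/

broken_result   = 'abcxyz'.gsub(broken, '_')         # "abcx__"
intended_result = 'abcxyz'.gsub(intended, '_')       # "a_____"
```

"b" and "c" survive because they fall inside the ?-m range, and "x" survives because it is a literal member of the set. Only "y" and "z" are outside everything the negated set lists.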
Ruby doesn't bother to parse the regular expression to figure out that you're interpolating your regular expression into a character set. Even if it did, it would still have to parse the v variable to figure out that it is also a character set, and that all you really want is to take the characters from the character set in v and put them with all the other characters there.
My advice is that since aeiouAEIOUäöüÄÖÜ is just a bunch of characters anyway, you can store it in a string and interpolate that into any character set in a regular expression. And be careful about interpolating a regex into a regex in the future. Avoid it unless you are really certain about what it's going to do.
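Something along these lines should work. The variable names and the sample string are only illustrative:

```ruby
# Keep the characters in a plain string, not in a regex.
vowels = 'aeiouAEIOUäöüÄÖÜ'

# Interpolating a string inserts it verbatim, with no (?-mix:...) wrapper.
non_vowel = /[^#{vowels}]/
vowel     = /[#{vowels}]/

without_consonants = 'Hello Wörld'.gsub(non_vowel, '_')   # "_e__o__ö___"
without_vowels     = 'Hello Wörld'.gsub(vowel, '_')       # "H_ll_ W_rld"
```

This is fine for plain letters like these. If the string ever contains characters that are special inside a character set, such as "-", "^" or "]", they would need escaping first.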