[Edit: After posting this I see it is very similar to @Lucas' answer. I will let it stand, however, for the alternative presentation.]
I would try constructing a regex for each of the allowed patterns and then take their union to obtain a single regex.
We see that all of the allowable numbers not beginning with + have 10 digits, so I will assume that's a requirement. If different numbers of digits are permitted, that can be dealt with easily.
1. Include 0665363636, exclude 2336653636 and 0065363636
I assume this means the number must begin with the digit 0 and the second digit must not be 0. That's easy:
r1 = /
^ # match start of string
0 # match 0
[1-9] # match any digit 1-9
\d{8} # match 8 digits
$ # match end of string
/x
Test:
'0665363636' =~ r1 #=> 0
'2336653636' =~ r1 #=> nil
'0065363636' =~ r1 #=> nil
That seems to work.
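As mentioned above, other lengths are easy to accommodate. Here is a minimal sketch, assuming (purely for illustration) that 9 to 11 digits in total were acceptable; the name r1_flexible and that range are my own inventions, not part of the question:

```ruby
# Hypothetical variant of r1: 0, a non-zero digit, then 7 to 9 more digits,
# i.e. 9 to 11 digits in total instead of exactly 10.
r1_flexible = /^0[1-9]\d{7,9}$/

'0665363636'   =~ r1_flexible #=> 0    (10 digits)
'066536363'    =~ r1_flexible #=> 0    (9 digits)
'06653636'     =~ r1_flexible #=> nil  (8 digits)
'066536363636' =~ r1_flexible #=> nil  (12 digits)
```

Only the quantifier changes; the rule about the first two digits is untouched.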
2. Include 06 65 36 36 36, exclude 06 65 36 36
Another easy one:
r2 = /
^ # match start of string
0 # match 0
[1-9] # match any digit 1-9 # or \d if can be zero
(?: # begin a non-capture group
\s # match one whitespace
\d{2} # match two digits
) # end non-capture group
{4} # match the non-capture group 4 times
$ # match end of string
/x
Test:
'06 65 36 36 36' =~ r2 #=> 0
'06 65 36 36' =~ r2 #=> nil
Another apparent success!
We see that 06-65-36-36-36 should also be permitted. That's such a small variant of the above that we don't have to bother creating another regex to include in the union; instead we just modify r2 ever so slightly:
r2 = /^0[1-9](?:
[\s-] # match one whitespace or a hyphen
\d{2}){4}$
/x
Notice that we don't have to escape the hyphen when it is the first or last character in a character class.
Test:
'06 65 36 36 36' =~ r2 #=> 0
'06-65-36-36-36' =~ r2 #=> 0
Yes!
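One thing to be aware of: r2 as written also accepts mixed separators, such as '06 65-36 36-36'. The question doesn't say whether that should be allowed. If it should not, a backreference can require the same separator throughout. This is only a sketch under that assumption, and the name r2_strict is mine:

```ruby
# r2 as defined above, repeated here so the snippet runs on its own.
r2 = /^0[1-9](?:[\s-]\d{2}){4}$/

# Sketch: capture the first separator and require the same one afterwards.
r2_strict = /
  ^
  0[1-9]            # 0 followed by a digit 1-9
  ([\s-])           # capture the first separator in group 1
  \d{2}             # two digits
  (?:\1\d{2}){3}    # the same separator and two digits, three more times
  $
/x

'06 65 36 36 36' =~ r2_strict #=> 0
'06-65-36-36-36' =~ r2_strict #=> 0
'06 65-36 36-36' =~ r2_strict #=> nil
'06 65-36 36-36' =~ r2        #=> 0
```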
3. Include +33 6 65 36 36 36, exclude +3366536361
It appears that, when the number begins with a +, the + must be followed by two digits, a space, one digit, a space, then four pairs of digits separated by spaces. We can just write that down:
r3 = /
^ # match start of string
\+ # match +
\d\d # match two digits
\s\d # match one whitespace followed by a digit
(?: # begin a non-capture group
\s # match one whitespace
\d{2} # match two digits
) # end non-capture group
{4} # match the non-capture group 4 times
$ # match end of string
/x
Test:
'+33 6 65 36 36 36' =~ r3 #=> 0
'+3366536361' =~ r3 #=> nil
Nailed it!
Unionize!
r = Regexp.union(r1, r2, r3)
=> /(?x-mi:
^ # match start of string
0 # match 0
[1-9] # match any digit 1-9
\d{8} # match 8 digits
$ # match end of string
)|(?x-mi:^0[1-9](?:
[\s-] # match one whitespace or a hyphen
\d{2}){4}$
)|(?x-mi:
^ # match start of string
\+ # match +
\d\d # match two digits
\s\d # match one whitespace followed by a digit
(?: # begin a non-capture group
\s # match one whitespace
\d{2} # match two digits
) # end non-capture group
{4} # match the non-capture group 4 times
$ # match end of string
)/
Let's try it:
['0665363636', '06 65 36 36 36', '06-65-36-36-36',
'+33 6 65 36 36 36'].any? { |s| (s =~ r).nil? } #=> false
['06 65 36 36', '2336653636', '+3366536361',
'0065363636'].all? { |s| (s =~ r).nil? } #=> true
Bingo!
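A caveat about the anchors: in Ruby, ^ and $ match at the start and end of each line, not of the whole string, so a multi-line string that contains one valid line will still match. If the input can contain newlines, \A and \z anchor to the string itself. The sketch below shows the difference, using compact forms of the three regexes; r_strict is a name I made up for the comparison:

```ruby
# Compact equivalents of r1, r2 and r3, repeated so the snippet runs on its own.
r1 = /^0[1-9]\d{8}$/
r2 = /^0[1-9](?:[\s-]\d{2}){4}$/
r3 = /^\+\d\d\s\d(?:\s\d{2}){4}$/
r  = Regexp.union(r1, r2, r3)

# ^ and $ are line anchors, so one valid line anywhere is enough.
"0665363636\nanything" =~ r        #=> 0
"junk\n0665363636"     =~ r        #=> 5

# The same patterns anchored to the whole string with \A and \z.
r_strict = Regexp.union(
  /\A0[1-9]\d{8}\z/,
  /\A0[1-9](?:[\s-]\d{2}){4}\z/,
  /\A\+\d\d\s\d(?:\s\d{2}){4}\z/
)

"0665363636\nanything" =~ r_strict #=> nil
"junk\n0665363636"     =~ r_strict #=> nil
'0665363636'           =~ r_strict #=> 0
```

For single-line input the two versions behave the same on all of the examples above.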
Efficiency
Unionizing individual regexes may not produce the most efficient single regex. You must decide whether the benefits of easier initial construction and testing, and of ongoing maintenance, are worth the efficiency penalty. If efficiency is paramount, you might still construct r this way, then tune it by hand.
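To give an idea of what hand-tuning might look like, here is one possible merged form that factors out the prefix shared by the first two patterns and keeps the same anchors. It is a sketch, not a benchmarked result; I have not measured whether it is actually faster, and r_tuned is just an illustrative name:

```ruby
# Sketch of a hand-merged regex accepting the same three formats as r.
r_tuned = /
  ^
  (?:
    0[1-9]                 # prefix shared by the first two patterns
    (?:
      \d{8}                # 0665363636
      |
      (?:[\s-]\d{2}){4}    # 06 65 36 36 36 or 06-65-36-36-36
    )
    |
    \+\d\d\s\d             # +33 6
    (?:\s\d{2}){4}         # 65 36 36 36
  )
  $
/x

['0665363636', '06 65 36 36 36', '06-65-36-36-36',
 '+33 6 65 36 36 36'].any? { |s| (s =~ r_tuned).nil? } #=> false
['06 65 36 36', '2336653636', '+3366536361',
 '0065363636'].all? { |s| (s =~ r_tuned).nil? } #=> true
```

Rerunning the same test lists against the tuned regex is a cheap way to confirm that the rewrite has not changed what is accepted.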