2

I can't seem to find out why this regular expression is not working in PL/SQL.

if ( REGEXP_LIKE(v,'/^(?>(?>([a-f0-9]{1,4})(?>:(?1)){7}|(?!(?:.*[a-f0-9](?>:|$)){8,})((?1)(?>:(?1)){0,6})?::(?2)?)|(?>(?>(?1)(?>:(?1)){5}:|(?!(?:.*[a-f0-9]:){6,})(?3)?::(?>((?1)(?>:(?1)){0,4}):)?)?(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(?>\.(?4)){3}))$/iD') ) then

It's for validating IPv4 and IPv6, it came from here: https://stackoverflow.com/a/1934546/3112803

Not sure if this has anything to do with it but I also asked this question about the D flag on the end: What Does This Regular Expression (RegEx) Flag Mean /iD

For some reason this regular expression works for most of my tests on this site: http://regex101.com/ but in PL/SQL everything is invalid.

What I mean by most is that there are some cases where I find it fails, but I've been searching for days and this is the best one I could find that is under 512 characters (512 is the limit when using REGEXP_LIKE in PL/SQL)

I'd appreciate any help. Thanks!

These are the test cases I'm using...

{1: Initial address, regex should say valid/match}
select isValid('2001:0db8:0000:0000:0000:ff00:0042:8329','ipv6') from dual;

{2: After removing all leading zeroes, regex should say valid/match}
select isValid('2001:db8:0:0:0:ff00:42:8329','ipv6') from dual;

{3: After omitting consecutive sections of zeroes, regex should say valid/match}
select isValid('2001:db8::ff00:42:8329','ipv6') from dual;

{4: The loopback address, regex should say valid/match}
select isValid('0000:0000:0000:0000:0000:0000:0000:0001','ipv6') from dual;

{5: The loopback address be abbreviated to ::1 by using both rules, regex should say valid/match}
select isValid('::1','ipv6') from dual;

{6: This should be valid/match}
select isValid('ABCD:ABCD:ABCD:ABCD:ABCD:ABCD:192.168.158.190','ipv6') from dual;

{7: This should be valid/match}
select isValid('::','ipv6') from dual;

{8: IPv6 applications to communicate directly with IPv4 applications, regex should say valid/match}
select isValid('0:0:0:0:0:ffff:192.1.56.10','ipv6') from dual;

{9: should NOT be valid/match}
select isValid('::ffff:192.1.56.10/96','ipv6') from dual;

{old formats used for tunneling, these should NOT be valid/matches}
{10}
select isValid('0:0:0:0:0:0:192.1.56.10','ipv6') from dual;
{11}
select isValid('::192.1.56.10/96','ipv6') from dual;

{These 4 should be valid/match}
{12}
select isValid('::FFFF:129.144.52.38','ipv6') from dual;
{13}
select isValid('::129.144.52.38','ipv6') from dual;
{14}
select isValid('::FFFF:d','ipv6') from dual;
{15}
select isValid('1080:0:0:0:8:800:200C:417A','ipv6') from dual;

{These 4 should NOT be valid/match}
{16}
select isValid('::FFFF:d.d.d','ipv6') from dual;
{17}
select isValid('::FFFF:d.d','ipv6') from dual;
{18}
select isValid('::d.d.d','ipv6') from dual;
{19}
select isValid('::d.d','ipv6') from dual;

I was told test #6 was wrong, ABCD:ABCD:ABCD:ABCD:ABCD:ABCD:192.168.158.190 is not a valid IPv6 address, is that correct?

Test cases 8-11 came from here: http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzai2%2Frzai2ipv6addrformat.htm but I was told 10&11 are no longer used.

Community
  • 1
  • 1
gfrobenius
  • 3,987
  • 8
  • 34
  • 66
  • 1
    Unfortunately, Oracle supports only narrow part of RegExp language. You regexp would not work in Oracle. – Egor Skriptunoff Feb 21 '14 at 17:36
  • For the second time, test case 6 is not valid. 10 and 11 are not either. – Michael Hampton Feb 21 '14 at 22:03
  • Yes, I am aware. It was mentioned below as well. I'll update the question so there is no confusion. This accepted answer is the best I've found so far. @nhahtdh however disagrees with you about the format of item #6. – gfrobenius Feb 21 '14 at 22:14

3 Answers3

5

Instead of doing everything in a single regex, it is better to break the regex into smaller ones and test them:

if (
    /* IPv6 expanded */
    REGEX_LIKE(v, '\A[a-f0-9]{1,4}(:[a-f0-9]{1,4}){7}\z', 'i')
    /* IPv6 shorthand */
    OR (NOT REGEX_LIKE(v, '\A(.*?[a-f0-9](:|\z)){8}', 'i')
        AND REGEX_LIKE(v, '\A([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,6})?::([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,6})?\z', 'i'))
    /* IPv6 dotted-quad notation, expanded */
    OR REGEX_LIKE(v, '\A[a-f0-9]{1,4}(:[a-f0-9]{1,4}){5}:(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}\z', 'i')
    /* IPv6 dotted-quad notation, shorthand */
    OR (NOT REGEX_LIKE(v, '\A(.*?[a-f0-9]:){6}', 'i')
        AND REGEX_LIKE(v, '\A([a-f0-9]{1,4}(:[a-f0-9]{1,4}){0,4})?::([a-f0-9]{1,4}:){0,5}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}\z', 'i'))
   ) then

This only tests for IPv6. IPv4 is not allowed.

Since PL/SQL flavor doesn't have subroutine calls (?n), there is no choice but to expand everything out. And the lack of negative look-ahead (?!pattern) forces us to simulate it with 2 regex testing operations.

\A and \z are used for matching beginning and the end of the string, since both of them are not affected by flags, and \z behavior is the same as $ under D mode in PCRE.

nhahtdh
  • 55,989
  • 15
  • 126
  • 162
  • 1
    nicely done! note: [a-f0-9] could be replaced by [[:xdigit:]] what could be more readable if you go crazy by all the digits (and you could leave out the i switch) – Dieter DHoker Feb 21 '14 at 20:15
  • This is great! This is the closest answer so far and I've been searching for a long time. I added my test cases to the original question. Your pattern says number 6 is a match, but I was told that's not a valid IPv6 address? It also fails on tests 9 & 11. But I'm not sure if those are valid formats anymore. I have another post very similar to this one for the same thing but for ColdFusion also (in case you want points there as well): http://stackoverflow.com/questions/21631669/regular-expression-regex-for-ipv6-separate-from-ipv4 Thank you! – gfrobenius Feb 21 '14 at 20:20
  • 1
    @gfrobenius: I am not sure why 6 is not a valid IPv6. Format-wise, it should not be wrong. Test 9 and 11 are for IPv6 address **prefix**, not an IPv6 address. – nhahtdh Feb 21 '14 at 20:27
  • Thank you for **not** using [[:xdigit:]] - not supported in javascript (and had to swap `\A` for `^` and `\z` for `$`), but works perfectly! – random_user_name Jan 30 '16 at 02:07
  • @cale_b: `[[:xdigit:]]` is simply `[A-Fa-f0-9]`. – nhahtdh Feb 01 '16 at 02:55
3

You have to get rid of the / at the start and the /iD from the end this is part of the perl syntax indicating it is a regex.

the i switch at the end means ignore case and can be given as an extra argument of your regexp_like, so:

if ( REGEXP_LIKE(v,'^(?>(?>([a-f0-9]{1,4})(?>:(?1)){7}|(?!(?:.*[a-f0-9](?>:|$)){8,})((?1)(?>:(?1)){0,6})?::(?2)?)|(?>(?>(?1)(?>:(?1)){5}:|(?!(?:.*[a-f0-9]:){6,})(?3)?::(?>((?1)(?>:(?1)){0,4}):)?)?(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(?>\.(?4)){3}))$','i') ) the

There are more issues, as perl regular expressions are not 100% equivalent to oracle regular expressions, and I see patter used here that are not available like ?> Maybe you can split up the regexp between ipv4 and ipv6 to avoid hitting the limit in oracle. And just do REGEXP_LIKE(ip,'ipv4pattern') or REGEXP_LIKE(ip,'ipv6pattern')

Adjusting the ipv4 part of the above regex to something that works in oracle gives me:

REGEXP_LIKE(ip,'^((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$','i')
Dieter DHoker
  • 421
  • 3
  • 9
  • Ah, thanks, I forgot that flags needed to be separated by a comma in the `REGEXP_LIKE()` call. Unfortunately it still didn't help. All tests come back as invalid. I think @EgorSkriptunoff might be right. – gfrobenius Feb 21 '14 at 18:07
  • I edited my post to include a working regex in oracle for the ipv4 part. – Dieter DHoker Feb 21 '14 at 18:15
  • Yes, I already have a separate IPv4 working, but thanks, I'll compare yours to mine. And yes, I do want them separate. But I have been trying to find a **IPv6 only** solution for a couple weeks and still can't find one. Here was my initial attempt to find one: **http://stackoverflow.com/questions/21631669/regular-expression-regex-for-ipv6-separate-from-ipv4**. And here are my latest test cases: **http://regex101.com/r/sV5cZ3**. I still haven't found a IPv6 only RegEx that works in `PL/SQL` that works for those test cases. – gfrobenius Feb 21 '14 at 18:26
  • I found this post that provides functions for converting ipv6 strings, maybe you can use them and test if it returns NULL http://stackoverflow.com/questions/6470244/oracle-pl-sql-versions-of-inet6-aton-and-ntoa-functions – Dieter DHoker Feb 21 '14 at 19:34
1
REGEXP_LIKE(ip,'^(([\dA-F]{1,4}:([\dA-F]{1,4}:([\dA-F]{1,4}:([\dA-F]{1,4}:([\dA-F]{1,4}:[\dA-F]{0,4}|:[\dA-F]{1,4})?|(:[\dA-F]{1,4}){0,2})|(:[\dA-F]{1,4}){0,3})|(:[\dA-F]{1,4}){0,4})|:(:[\dA-F]{1,4}){0,5})((:[\dA-F]{1,4}){2}|:(25[0-5]|(2[0-4]|1\d|[1-9])?\d)(\.(25[0-5]|(2[0-4]|1\d|[1-9])?\d)){3})|(([\dA-F]{1,4}:){1,6}|:):[\dA-F]{0,4}|([\dA-F]{1,4}:){7}:)\z', 'i')

Modified from the XML regex at http://home.deds.nl/~aeron/regex/

nhahtdh
  • 55,989
  • 15
  • 126
  • 162
Aeron
  • 11
  • 1