
How would I modify this IPv6 regex I wrote so that it still detects an address (i.e. the way the regex is written right now) but also accepts "blank", i.e. the user did not specify an IPv6 address?

^[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}$

Right now, the regex requires at least 0:0:0:0:0:0:0:0 or similar. In fact, in addition to a blank address, I probably also need to handle compression, such as the following addresses:

FE80::1
or ::1
etc

Thanks!

* UPDATE *

So let me make sure I have this straight...

(^$|^IPV4)\|(^$|IPV6)\|REST OF STUFF$

That doesn't seem right. I feel like I have misplaced the ^ and $ at the very beginning and end of my entire regex.

Maybe this instead:

 ^(^$|IPV4)\|(^$|IPV6)\|REST$

* UPDATE *

Still no luck. Here is part of my code with the middles chopped out for sanity:

^(|[0-9]{1,3}.<<<OMIT MIDDLE IPV4>>>.[0-9]{1,3})\|(|(\A([0-9a-f]{1,4}:){1,1}<<<OMIT MIDDLE IPV6>>>[0-1]?\d?\d)){3}\Z))\|[a-zA-Z0-<<<MORE STUFF MIDDLE OMITTED>>>{0,50}$

I hope that isn't confusing. That's the beginning and end of each regex with the middles omitted so you can see the ( ).

Perhaps I need to enclose the entire gigantic IPv6 regex in parentheses?

* UPDATE *

Tried last statement above... no luck.

Atomiklan

2 Answers


You can specify alternation with the | character, so a|b means "match either a or b". In this case it would look something like this:

^$|^[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}$

The regex ^$ will match empty strings, so ^$|<current-regex> means "match either an empty string, or whatever <current-regex> matches (in this case IPv6)". You could use ^\s*$ in place of ^$ if you want strings that consist only of whitespace characters to also be considered "empty".
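As a rough illustration of the alternation described above, here is a minimal Python sketch. The names `HEX_GROUP`, `FULL_IPV6`, `EMPTY_OR_IPV6` and `is_blank_or_ipv6` are made up for this example, and the pattern only covers the uncompressed eight-group form from the question, not compressed addresses.

```python
import re

# One hex group of an IPv6 address: 1 to 4 hex digits.
HEX_GROUP = r"[0-9a-fA-F]{1,4}"

# Uncompressed form only: eight groups separated by colons,
# equivalent to the regex in the question without its anchors.
FULL_IPV6 = HEX_GROUP + (":" + HEX_GROUP) * 7

# "^$" matches the empty string; the second alternative matches a full address.
EMPTY_OR_IPV6 = re.compile(r"^$|^" + FULL_IPV6 + r"$")


def is_blank_or_ipv6(text: str) -> bool:
    """Return True for an empty string or an uncompressed IPv6 address."""
    return EMPTY_OR_IPV6.search(text) is not None
```

With this sketch, `is_blank_or_ipv6("")` and `is_blank_or_ipv6("0:0:0:0:0:0:0:0")` are accepted, while a compressed address such as `FE80::1` is still rejected because compression is not handled here.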

This only handles the first part of the question. Handling compression like FE80::1 is more complex, and it looks like there are already some good answers for that in the comments. (Note that I don't think this question is a dupe, because the "also matching an empty string" part isn't present in those questions.)

edit: If it is part of a larger regex, then you should wrap everything in a group and get rid of the ^$, so it would be something like (|<current-regex>). Since there is nothing before the |, it means that the group can match either empty strings or whatever your current regex would match.
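The `(|<current-regex>)` idea can be sketched in Python for a pipe-delimited line. Everything here is an assumption made for illustration: `IPV4` is a simplified placeholder with no range checking, `REST` is a hypothetical third field, and `IPV6` is only the uncompressed form from the question.

```python
import re

HEX_GROUP = r"[0-9a-fA-F]{1,4}"
IPV6 = HEX_GROUP + (":" + HEX_GROUP) * 7   # uncompressed form only
IPV4 = r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}"    # simplified placeholder, no range check
REST = r"[a-zA-Z0-9 ]{0,50}"               # hypothetical third field

# Each address field is a group whose first alternative is empty,
# so the field may be blank. The only anchors are the outer ^ and $.
RECORD = re.compile(r"^(|" + IPV4 + r")\|(|" + IPV6 + r")\|" + REST + r"$")


def parse(line: str):
    """Return (ipv4, ipv6) for a valid line, or None if the line is rejected."""
    m = RECORD.match(line)
    return (m.group(1), m.group(2)) if m else None
```

A blank field comes back as an empty string, so `parse("192.168.0.1||host")` yields `("192.168.0.1", "")`, while a line with a malformed IPv6 field is rejected as a whole.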

Andrew Clark
  • Thanks for the potential solution, but I don't think it will work for my case. I figured it was trivial so I didn't include extra details, but this IPv6 regex is only part of a larger regex. The problem is that the IPv6 portion is one field of a |-delimited string, for example: IPv4 Address|IPv6 Address|More stuff. So in reality I need to allow individual pieces of the regex (the addresses between the |'s) to be empty. – Atomiklan Jan 03 '14 at 20:18
  • @Atomiklan See my edit, I described how you can use this approach even as part of a larger regex. – Andrew Clark Jan 03 '14 at 20:27
  • Excellent thank you, I think you're getting me close, but I'm still misplacing a few characters. Please see my update in main post. – Atomiklan Jan 03 '14 at 20:37
  • Oops, my bad. You will actually want to remove the anchors, so it will be something like `^(|IPV4)\|(|IPV6)\|REST$`. – Andrew Clark Jan 03 '14 at 21:32
  • Hmm something still missing. Coming back as not valid. I'll update post above. – Atomiklan Jan 03 '14 at 21:40
  • Could be an unbalanced parentheses thing, make sure that for every `(` you add to the regex you also add one and only one `)`. – Andrew Clark Jan 03 '14 at 21:59
  • I believe you are correct. I excluded the IPV6 regex that I copied from the answer above and it appears to be working now. I guess I just need to figure out what is wrong with the regex above. Do you see a missing or misplaced () above in the IPv6 regex? – Atomiklan Jan 03 '14 at 22:11
  • I just checked it in Notepad++ and all the () seem to match up. It must be something else particular to this large regex that's breaking the functionality you described to me. – Atomiklan Jan 03 '14 at 22:13
  • Ok I have confirmed the rest of my regex. Everything works except for the IPV6 part (which is currently excluded). There is something inside the IPV6 regex that breaks when I add your fix. – Atomiklan Jan 03 '14 at 22:19
  • I think it has something to do with how the regex for IPv6 is split up. The entire IPv6 regex is being stored in a variable called IPV6. I am using the same process as in the post above: I'm breaking up the regex into parts, and each part has to be enclosed in "" for it to be stored in the variable in Bash. Perhaps that's where the disconnect is? – Atomiklan Jan 03 '14 at 22:24
  • Hmm, yeah, I'm really not sure, although I don't know if you want the `\A` and `\Z` in the IPv6 portion of your regex. I believe those are anchors to the beginning and end of the string (similar to `^` and `$`), so having them in the middle of a regex doesn't make much sense. – Andrew Clark Jan 03 '14 at 22:48
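The point about `\A` and `\Z` in the last comment can be checked with a small Python sketch. It assumes Python's `re` module, where `\A` matches only at the start of the string and `\Z` only at the end; the toy fields `a`, `b` and `c` and the helper `matches` are invented for this example, and other regex engines may treat these escapes differently or not support them at all.

```python
import re

# With the anchors left in the middle, the second field can only ever be empty:
# \A cannot match after "a|" has been consumed, and \Z cannot match before "|c".
WITH_ANCHORS = re.compile(r"^a\|(|\Ab\Z)\|c$")

# Same pattern with the inner anchors removed.
WITHOUT_ANCHORS = re.compile(r"^a\|(|b)\|c$")


def matches(pattern, text):
    """Return True if the compiled pattern matches the text."""
    return pattern.search(text) is not None
```

Here `matches(WITH_ANCHORS, "a|b|c")` fails even though the parentheses are balanced, while `matches(WITHOUT_ANCHORS, "a|b|c")` succeeds. Both patterns still accept the blank field in `"a||c"`, which fits the symptom of the larger regex working only when the IPv6 field is left out.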

According to another post here on Stack Overflow, an external site has an explanation and example of a huge but very usable regex:

(\A([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,6}\Z)|
(\A([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,5}\Z)|
(\A([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,4}\Z)|
(\A([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}){1,3}\Z)|
(\A([0-9a-f]{1,4}:){1,5}(:[0-9a-f]{1,4}){1,2}\Z)|
(\A([0-9a-f]{1,4}:){1,6}(:[0-9a-f]{1,4}){1,1}\Z)|
(\A(([0-9a-f]{1,4}:){1,7}|:):\Z)|
(\A:(:[0-9a-f]{1,4}){1,7}\Z)|
(\A((([0-9a-f]{1,4}:){6})(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})\Z)|
(\A(([0-9a-f]{1,4}:){5}[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3})\Z)|
(\A([0-9a-f]{1,4}:){5}:[0-9a-f]{1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,4}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A([0-9a-f]{1,4}:){1,2}(:[0-9a-f]{1,4}){1,3}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A([0-9a-f]{1,4}:){1,3}(:[0-9a-f]{1,4}){1,2}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A([0-9a-f]{1,4}:){1,4}(:[0-9a-f]{1,4}){1,1}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A(([0-9a-f]{1,4}:){1,5}|:):(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)|
(\A:(:[0-9a-f]{1,4}){1,5}:(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}\Z)
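To use alternatives like these as one optional field inside a larger pipe-delimited regex, one possible approach is to drop the per-alternative `\A`/`\Z` anchors and wrap the result in a group with an empty first alternative. The Python sketch below uses only three of the alternatives quoted above to stay short; `strip_anchors`, `optional_field`, `ipv6_field` and the `[^|]*` placeholder fields are hypothetical names and patterns chosen for this example, and `re.IGNORECASE` is added because the quoted pattern only lists lowercase hex digits.

```python
import re

# Three of the alternatives quoted above, copied verbatim (anchors included).
QUOTED_ALTERNATIVES = [
    r"(\A([0-9a-f]{1,4}:){1,1}(:[0-9a-f]{1,4}){1,6}\Z)",
    r"(\A(([0-9a-f]{1,4}:){1,7}|:):\Z)",
    r"(\A:(:[0-9a-f]{1,4}){1,7}\Z)",
]


def strip_anchors(alternative: str) -> str:
    """Remove the \\A and \\Z anchors so the pattern can sit mid-regex."""
    return alternative.replace(r"\A", "").replace(r"\Z", "")


def optional_field(alternatives) -> str:
    """Join the alternatives and allow the field to be empty."""
    body = "|".join(strip_anchors(a) for a in alternatives)
    return "(?:|" + body + ")"


# Placeholder first and last fields: anything that is not a pipe.
LINE = re.compile(
    r"^[^|]*\|(" + optional_field(QUOTED_ALTERNATIVES) + r")\|[^|]*$",
    re.IGNORECASE,
)


def ipv6_field(line: str):
    """Return the middle field if the line is valid, else None."""
    m = LINE.match(line)
    return m.group(1) if m else None
```

The same two helper functions would apply unchanged to the full list of alternatives; only `QUOTED_ALTERNATIVES` would grow.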
Community
Giacomo1968
  • Thanks for the post. This solved the compression issue. I thought of it last minute when writing the post for empty fields so I didn't do a search first. Sorry. – Atomiklan Jan 03 '14 at 20:20