3

I've been reading about regexes that match ONLY public IPv4 addresses and have tried all the solutions given, but none of them actually matches public IP addresses accurately.

Sample IP

[user@linux ~]$ cat ip.txt
1.1.1.1
8.8.8.8
10.1.1.1
127.0.0.1
[user@linux ~]$

Solution 1 - https://stackoverflow.com/a/39195704/11392987

[user@linux ~]$ egrep '^([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!172\.(16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31))(?<!127)(?<!^10)(?<!^0)\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!192\.168)(?<!172\.(16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31))\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!\.255$)$' ip.txt
[user@linux ~]$

Solution 2 - https://stackoverflow.com/a/33453740/11392987

[user@linux ~]$ egrep '(\d+)(?<!10)\.(\d+)(?<!192\.168)(?<!172\.(1[6-9]|2\d|3[0-1]))\.(\d+)\.(\d+)' ip.txt
[user@linux ~]$

Solution 3 - https://stackoverflow.com/a/46399203/11392987

[user@linux ~]$ egrep '(^0\.)|(^10\.)|(^100\.6[4-9]\.)|(^100\.[7-9]\d\.)|(^100\.1[0-1]\d\.)|(^100\.12[0-7]\.)|(^127\.)|(^169\.254\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.0\.0\.)|(^192\.0\.2\.)|(^192\.88\.99\.)|(^192\.168\.)|(^198\.1[8-9]\.)|(^198\.51\.100\.)|(^203.0\.113\.)|(^22[4-9]\.)|(^23[0-9]\.)|(^24[0-9]\.)|(^25[0-5]\.)' ip.txt
10.1.1.1
127.0.0.1
[user@linux ~]$

10.1.1.1 and 127.0.0.1 are private/reserved addresses, not public IPs.

Solution 4 - https://stackoverflow.com/a/57077560/11392987

[user@linux ~]$ egrep '^(?!^0\.)(?!^10\.)(?!^100\.6[4-9]\.)(?!^100\.[7-9]\d\.)(?!^100\.1[0-1]\d\.)(?!^100\.12[0-7]\.)(?!^127\.)(?!^169\.254\.)(?!^172\.1[6-9]\.)(?!^172\.2[0-9]\.)(?!^172\.3[0-1]\.)(?!^192\.0\.0\.)(?!^192\.0\.2\.)(?!^192\.88\.99\.)(?!^192\.168\.)(?!^198\.1[8-9]\.)(?!^198\.51\.100\.)(?!^203.0\.113\.)(?!^22[4-9]\.)(?!^23[0-9]\.)(?!^24[0-9]\.)(?!^25[0-5]\.)(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))$' ip.txt
[user@linux ~]$

Solution 5 - https://www.bigdatamark.com/regexp-for-extracting-public-ip-address/

[user@linux ~]$ egrep '\b(?!(10)|192\.168|172\.(2[0-9]|1[6-9]|3[0-2]))[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ip.txt
[user@linux ~]$
  • Small nit... `10.1.1.1` and `127.0.0.1` are not public IPs. The first is a Private Address Space from [RFC 1918](https://tools.ietf.org/html/rfc1918). The second is Special Use IPv4 Addresses from [RFC 5735](https://tools.ietf.org/html/rfc5735). – jww Nov 12 '19 at 15:30

2 Answers

4

Try this regex. It matches all public IPs and doesn't match reserved IPs.

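Edit - Use grep -P instead of egrep. egrep (grep -E) uses POSIX extended regular expressions, which have no lookahead or lookbehind; grep -P uses Perl-compatible regular expressions, which do.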

grep -P '^(?!^0\.)(?!^10\.)(?!^100\.6[4-9]\.)(?!^100\.[7-9]\d\.)(?!^100\.1[0-1]\d\.)(?!^100\.12[0-7]\.)(?!^127\.)(?!^169\.254\.)(?!^172\.1[6-9]\.)(?!^172\.2[0-9]\.)(?!^172\.3[0-1]\.)(?!^192\.0\.0\.)(?!^192\.0\.2\.)(?!^192\.88\.99\.)(?!^192\.168\.)(?!^198\.1[8-9]\.)(?!^198\.51\.100\.)(?!^203\.0\.113\.)(?!^22[4-9]\.)(?!^23[0-9]\.)(?!^24[0-9]\.)(?!^25[0-5]\.)(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))$' ip.txt

Output

1.1.1.1
8.8.8.8
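
To check the lookahead approach against boundary addresses, here is a minimal sketch. It assumes GNU grep built with PCRE support (-P). The pattern is a condensed variant of the one above, not the exact same regex: all reserved prefixes go into a single negative lookahead. The names reserved, octet, pattern and public_only are made up for this illustration.

```shell
# Condensed variant (an assumption, not the exact regex from the answer):
# one negative lookahead with the reserved prefixes, then four octets 0-255.
reserved='(0|10|127|169\.254|192\.168|172\.(1[6-9]|2[0-9]|3[01])|100\.(6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])|192\.0\.[02]|192\.88\.99|198\.1[89]|198\.51\.100|203\.0\.113|2(2[4-9]|[34][0-9]|5[0-5]))\.'
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# Single-quoted pieces keep "(?!" away from interactive history expansion.
pattern='^(?!'"$reserved"')('"$octet"'\.){3}'"$octet"'$'

# public_only (hypothetical helper): print only the public IPv4 lines from stdin.
public_only() {
  grep -P "$pattern"
}

# Boundary cases: 172.15.0.1 is public, 172.16.0.1 is private,
# 100.64.0.1 is shared address space, 256.1.1.1 is not a valid address.
printf '%s\n' 1.1.1.1 8.8.8.8 10.1.1.1 127.0.0.1 \
  172.15.0.1 172.16.0.1 100.64.0.1 256.1.1.1 | public_only
```

With that input, only 1.1.1.1, 8.8.8.8 and 172.15.0.1 should be printed. For a file, run public_only < ip.txt.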
Karmveer Singh
0

The real question is: why does it have to be a single regex?

If the input is already validated, meaning it contains no out-of-range IP addresses, a simpler regex will do. Something like this could work:

grep "^\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}$" ip.txt |
grep -v "^10\.\|^192\.168\.\|^172\.\(1[6-9]\|2[0-9]\|3[01]\)\.\|^127\."

If validation is required, then the regex gets much more complicated:

grep "^\(\([0-9]\|1\{0,1\}[0-9]\{2\}\|2[0-4][0-9]\|25[0-5]\)\.\)\{3\}\([0-9]\|1\{0,1\}[0-9]\{2\}\|2[0-4][0-9]\|25[0-5]\)$" ip.txt |
grep -v "^10\." |
grep -v "^172\.\(1[6-9]\|2[0-9]\|3[01]\)\." |
grep -v "^192\.168\." |
grep -v "^127\."

I split the three private address ranges into separate piped grep commands for simplicity. You could join them all with

\|

and have only two grep commands.
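
The same two-stage idea can also be written with extended regexes (grep -E), which avoids most of the backslashes and needs no lookahead. This is only a sketch: the names octet and public_ips_ere are invented for the example, and it filters only the RFC 1918 ranges plus 127.0.0.0/8, not every reserved block.

```shell
# One octet in the range 0-255, without leading zeros.
octet='(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# public_ips_ere (hypothetical helper): read addresses on stdin, one per line.
public_ips_ere() {
  # Stage 1: keep only lines that are exactly one valid dotted-quad address.
  grep -E "^(${octet}\.){3}${octet}\$" |
  # Stage 2: drop the private ranges and loopback; anchoring at the start
  # of the line keeps 110.x.x.x from being mistaken for 10.x.x.x.
  grep -Ev '^(10|127)\.|^192\.168\.|^172\.(1[6-9]|2[0-9]|3[01])\.'
}

printf '%s\n' 1.1.1.1 8.8.8.8 10.1.1.1 127.0.0.1 110.1.1.1 300.1.1.1 | public_ips_ere
```

With that input it should print 1.1.1.1, 8.8.8.8 and 110.1.1.1. To run it on the sample file, use public_ips_ere < ip.txt.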

Chris