
I want to extract the IP address from the following string with Python's re module:

"aa:xv172.31.2.27bb,"

Using the following pattern

ip_address = re.sub(r'.*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*',
                    r'\1',ip_address)

results in

2.31.2.27

because the leading .* is greedy and consumes as many characters as it can, leaving only the minimum the group needs. I want the IP address group to take priority so that it captures the full address. How can I do this?

Michael Hecht
  • Why have the first and last matchers at all if you are only using the middle part? (re.search instead of re.sub would work better) – HugoRune Jun 12 '21 at 06:46
  • Just using the middle part results in replacing the middle part with the middle part, i.e. I get the original string back. – Michael Hecht Jun 12 '21 at 07:19

1 Answer


Use re.search when you want to extract something:

>>> import re
>>> s = "aa:xv172.31.2.27bb,"
>>> re.search(r'\d{1,3}(\.\d{1,3}){3}', s)[0]
'172.31.2.27'

If you want to know how to do it with re.sub for this case, make the first .* non-greedy by writing it as .*?:

>>> re.sub(r'.*?(\d{1,3}(\.\d{1,3}){3}).*', r'\1', s)
'172.31.2.27'
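
To see why the one-character change matters, here is a small side-by-side sketch. The names OCTETS, greedy and lazy are only illustrative. A greedy .* first swallows the whole string and then backtracks just far enough for the group to match, so the group starts as late as possible; a lazy .*? grows one character at a time, so the group starts as early as possible.

```python
import re

s = "aa:xv172.31.2.27bb,"

# Four dot-separated runs of 1 to 3 digits, captured as group 1.
OCTETS = r'(\d{1,3}(?:\.\d{1,3}){3})'

# Greedy prefix: backtracks from the end, so the group begins at the
# last position that still allows four octets (the final "2" of "172").
greedy = re.sub(r'.*' + OCTETS + r'.*', r'\1', s)

# Lazy prefix: expands from the start, so the group begins at the
# first position where four octets can match (the "1" of "172").
lazy = re.sub(r'.*?' + OCTETS + r'.*', r'\1', s)

print(greedy)  # 2.31.2.27
print(lazy)    # 172.31.2.27
```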

\d{1,3} is not a strict way to match an IP address, since it also accepts out-of-range octets such as 999. You can either construct a stricter pattern yourself, or use a module such as https://github.com/madisonmay/CommonRegex (this one uses 25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? for each octet, which can be further simplified, but you get the idea).
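
As a rough sketch of building the pattern yourself, the octet expression quoted above can be repeated four times. The names OCTET and IPV4 are made up for this example, and the digit lookarounds on both ends are my own addition (an assumption, not taken from CommonRegex) so that a piece of a longer digit run such as 999 is not picked up as an octet.

```python
import re

# One octet in the range 0-255 (same alternation as quoted above).
OCTET = r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# Four octets separated by dots. The lookbehind and lookahead are an
# assumption added here: they reject matches glued to other digits.
IPV4 = re.compile(r'(?<![0-9])' + OCTET + r'(?:\.' + OCTET + r'){3}(?![0-9])')

match = IPV4.search("aa:xv172.31.2.27bb,")
print(match[0])                   # 172.31.2.27
print(IPV4.search("999.1.1.1"))   # None
```

Note that this pattern still accepts leading zeros such as 01, so tighten the last alternative if that matters for your input.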

See also: https://docs.python.org/3/howto/ipaddress.html
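
Another option, sketched here under the assumption that you only need the first valid address, is to keep the loose \d pattern for finding candidates and let the standard ipaddress module decide whether each one is valid. The helper name extract_ipv4 is hypothetical.

```python
import ipaddress
import re

def extract_ipv4(text):
    """Return the first valid IPv4 address found in text, or None."""
    # Loose pattern: finds candidates, including out-of-range ones.
    for match in re.finditer(r'\d{1,3}(?:\.\d{1,3}){3}', text):
        try:
            # IPv4Address raises AddressValueError for invalid input.
            return ipaddress.IPv4Address(match.group(0))
        except ipaddress.AddressValueError:
            continue
    return None

print(extract_ipv4("aa:xv172.31.2.27bb,"))  # 172.31.2.27
print(extract_ipv4("999.1.1.1"))            # None
```

This keeps the regex simple and leaves range checking to the library.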

Sundeep