How do you incorporate a regular expression into the Python string.split method? Here is some sample code:

ip = '192.168.0.1:8080'
tokens = ip.split('[.|:]')
print tokens

This for some reason generates ['192.168.0.1:8080']. Can someone point out what I'm missing? I've tried escaping characters and using double quotes, but nothing seems to change anything.

mkrieger1
Teofrostus

1 Answer

You need to use re.split if you want to split a string according to a regex pattern.

import re
tokens = re.split(r'[.:]', ip)

Inside a character class, | is not the alternation operator; it matches a literal | character. The class [.:] on its own already matches either a dot or a colon.

So remove | from the character class; otherwise the string would also be split on any pipe character it contains.
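To make the difference concrete, here is a small sketch comparing the two character classes. The `sample` string is a made-up input (not from the question) chosen only because it contains a pipe; for the IP address in the question both patterns give the same result.

```python
import re

ip = '192.168.0.1:8080'

# Splits on every dot and colon.
tokens = re.split(r'[.:]', ip)
print(tokens)

# Hypothetical input containing a pipe, used only for illustration.
sample = 'a|b.c:d'

# With | inside the class, the pipe is treated as one more delimiter.
with_pipe = re.split(r'[.|:]', sample)

# Without it, the pipe stays inside the first token.
without_pipe = re.split(r'[.:]', sample)

print(with_pipe)
print(without_pipe)
```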

Alternatively, use str.split with a list comprehension:

>>> ip = '192.168.0.1:8080'
>>> [j for i in ip.split(':') for j in i.split('.')]
['192', '168', '0', '1', '8080']
Remi Guan
Avinash Raj
    `re.split` is considerably more expensive than `string.split` IIRC. If you know your strings are going to be IP addresses with ports, it might be better to do `temp = ip.split(':')` and then `tokens = temp[0].split('.')` and then `tokens.append(temp[1])` – personjerry Nov 05 '15 at 04:21
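A minimal sketch of the two-step split this comment describes, taking the address part before the colon and the port after it. The function name `split_ip_port` is hypothetical, and the code assumes the input always has the form `address:port` with exactly one colon; it does no validation.

```python
def split_ip_port(address):
    """Split 'a.b.c.d:port' into the octets followed by the port (all strings).

    Assumes exactly one ':' separating the address from the port.
    """
    temp = address.split(':')    # ['a.b.c.d', 'port']
    tokens = temp[0].split('.')  # the dotted part before the colon
    tokens.append(temp[1])       # the port after the colon
    return tokens


print(split_ip_port('192.168.0.1:8080'))
```

Whether this is actually faster than `re.split` for a given workload is worth measuring (for example with the standard `timeit` module) rather than assuming.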