1

Possible Duplicate:
Split string on whitespace in python

I have a string like so:

['.text      0x1000       0xb51b       0xb600       6.259216    ']

and I would like to split it into this:

[.text, 0x1000, 0xb51b... etc]

So far I've tried: re.split("( )", b) and re.split("[ \t]", b)

but to no avail. I get things like:

['.text', ' ', '0x1000', ' ', '0xb51b', ' ', '0xb600', ' ', '6.259216', ' ', '']

or other results with even more whitespace. I know I could strip the extra whitespace from the string first, but I would much rather use a regular expression that splits it correctly in the first place.

Stupid.Fat.Cat

3 Answers

7

Why not just use the regular str.split?

'.text      0x1000       0xb51b       0xb600       6.259216    '.split()

To quote the documentation:

if sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
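
A quick sketch of what that means for the string in the question (the variable names `line`, `fields` and `explicit` are just for illustration):

```python
line = ".text      0x1000       0xb51b       0xb600       6.259216    "

# No separator: runs of whitespace count as one separator,
# and leading/trailing whitespace produces no empty strings.
fields = line.split()
print(fields)  # ['.text', '0x1000', '0xb51b', '0xb600', '6.259216']

# An explicit single-space separator splits on every space,
# so each extra space in a run yields an empty string.
explicit = line.split(" ")
print("" in explicit)  # True
```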


As an aside, I've found that mystring.split(None) (as opposed to mystring.split()) is sometimes quite useful to keep in mind, because it lets you pass the separator in as a variable rather than "hardcoding" the splitting algorithm.
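
For instance, something along these lines, where `split_fields` is a hypothetical helper made up for this example, lets the caller choose between whitespace splitting and an explicit separator:

```python
def split_fields(line, sep=None):
    """Hypothetical helper: sep=None means 'split on runs of whitespace'."""
    return line.split(sep)

row = ".text      0x1000       0xb51b    "
csv_row = ".text,0x1000,0xb51b"

print(split_fields(row))           # ['.text', '0x1000', '0xb51b']
print(split_fields(csv_row, ","))  # ['.text', '0x1000', '0xb51b']
```

The default of `None` keeps the whitespace behaviour unless a caller asks for something else.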

mgilson
2

Try this:

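import re
# \s+ (one or more), not \s*: a pattern that can match the empty string
# also splits between characters on Python 3.7+
re.split(r"\s+", yourInputString.strip())  # result is a list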
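
A small sketch of the regex route on the question's input (with one tab swapped in to show that `\s` covers tabs as well). It uses `\s+` rather than `\s*`, since a pattern that can match the empty string also splits between characters on Python 3.7 and later. The `re.findall` line is an alternative I'm adding for comparison, not something from the answer itself:

```python
import re

line = ".text      0x1000\t0xb51b       0xb600       6.259216    "

# Strip first, otherwise trailing whitespace leaves an empty string at the end.
with_strip = re.split(r"\s+", line.strip())
without_strip = re.split(r"\s+", line)

# Alternative: collect the non-whitespace runs instead of splitting.
tokens = re.findall(r"\S+", line)

print(with_strip)         # ['.text', '0x1000', '0xb51b', '0xb600', '6.259216']
print(without_strip[-1])  # '' (empty string from the trailing spaces)
print(tokens == with_strip)  # True
```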
Usman Salihin
1

Another option is to first collapse each run of multiple spaces into a single space, and then call split(" ") on the result:

re.sub(r"  +"," ", text).split(" ")
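
One thing to watch with this approach, sketched below on the question's string: leading or trailing spaces survive the substitution as a single space, so split(" ") still returns an empty string there unless the input is stripped first (the `strip()` call is my addition):

```python
import re

text = ".text      0x1000       0xb51b       0xb600       6.259216    "

# Collapse runs of two or more spaces, then split on single spaces.
parts = re.sub(r"  +", " ", text).split(" ")
print(parts)  # ['.text', '0x1000', '0xb51b', '0xb600', '6.259216', '']

# Stripping first removes the trailing empty string.
clean = re.sub(r"  +", " ", text.strip()).split(" ")
print(clean)  # ['.text', '0x1000', '0xb51b', '0xb600', '6.259216']
```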
Nejc
  • Why would you ever use this if you can do it all in one go with `str.split`? I suppose if you wanted to make sure that you didn't split on `"\t"` or something ... – mgilson Oct 08 '12 at 14:04
  • Of course, in this case, it's easier to just use `re.split(' +',string_to_split)` – mgilson Oct 08 '12 at 14:06
  • I agree with you. For the specific purpose in this question, `str.split` is the most appropriate. But if you have some other multi-character occurrences, then my solution is maybe more general. – Nejc Oct 08 '12 at 14:07
  • You're right... Totally forgot about `re.split` . – Nejc Oct 08 '12 at 14:08
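
To make the point from the comments concrete, here is a rough comparison on a made-up input containing a tab: `re.split(' +', ...)` splits on spaces only and leaves the tab inside a field, while `str.split()` treats the tab as a separator too.

```python
import re

line = ".text \t0x1000   0xb51b"  # made-up input with a tab after the first space

space_only = re.split(" +", line)  # only runs of spaces are separators
any_whitespace = line.split()      # any whitespace run is a separator

print(space_only)      # ['.text', '\t0x1000', '0xb51b']
print(any_whitespace)  # ['.text', '0x1000', '0xb51b']
```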