0

I am retrieving a lot of data in the form

6800       MAIN ST

How can I format it so that it looks normal (one space between the number and the street name), such as:

6800 MAIN ST
user3236406
  • `' '.join(mystr.split())`? – g.d.d.c Feb 22 '14 at 17:44
  • Please try to search before asking a simple question. This one has been asked [so](http://stackoverflow.com/questions/4241757/python-django-how-to-remove-extra-white-spaces-tabs-from-a-string) [many](http://stackoverflow.com/questions/6130067/remove-extra-spaces-in-middle-of-string-split-join-python) [times](http://stackoverflow.com/questions/2077897/substitute-multiple-whitespace-with-single-whitespace-in-python). – Jussi Kukkonen Feb 22 '14 at 17:53

2 Answers

6

Use `str.split` and `str.join`:

In [733]: s='6800       MAIN ST'

In [734]: ' '.join(s.split())
Out[734]: '6800 MAIN ST'
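
Since the question mentions a lot of data, the same idea could be wrapped in a small helper and applied to every row. This is only a minimal sketch: `normalize_spaces` and the sample `rows` list are made-up names for illustration, not anything from the question.

```python
def normalize_spaces(text):
    # split() with no argument splits on any run of whitespace
    # (spaces, tabs, newlines) and ignores leading/trailing whitespace,
    # so joining the pieces with ' ' leaves exactly one space between words.
    return ' '.join(text.split())

# Hypothetical sample data standing in for the retrieved rows.
rows = ['6800       MAIN ST', '  12\tOAK   AVE ']
cleaned = [normalize_spaces(r) for r in rows]
print(cleaned)  # ['6800 MAIN ST', '12 OAK AVE']
```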

You can also use `re`, as @NPE mentioned, but it is noticeably slower, even with a precompiled pattern. Benchmark:

In [746]: s='asdf             fasd zzzzzz          ddddddd      z'

In [747]: timeit ' '.join(s.split())
1000000 loops, best of 3: 545 ns per loop

In [748]: ptn=re.compile(r"\s+")

In [749]: timeit re.sub(ptn, ' ', s)
100000 loops, best of 3: 4.08 us per loop
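
Outside IPython, roughly the same comparison could be reproduced with the standard `timeit` module. This is a sketch only: the function names and loop count are arbitrary choices, and the timings will vary with machine and Python version, so none are shown here.

```python
import re
import timeit

s = 'asdf             fasd zzzzzz          ddddddd      z'
ptn = re.compile(r"\s+")

def with_split_join():
    return ' '.join(s.split())

def with_regex():
    # Same substitution as above, using the compiled pattern's own method.
    return ptn.sub(' ', s)

# Both approaches agree on this input (no leading/trailing whitespace).
assert with_split_join() == with_regex()

n = 10000  # arbitrary loop count for illustration
t_split = timeit.timeit(with_split_join, number=n)
t_regex = timeit.timeit(with_regex, number=n)
print('split/join: %.3f us per loop' % (t_split / n * 1e6))
print('regex:      %.3f us per loop' % (t_regex / n * 1e6))
```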
zhangxaochen
  • I would say it's a tie between this and using regex, except it just looks funny to call `join()` on the space and not the array. – mgamba Feb 22 '14 at 17:56
  • Subjectively, join feels like an operation to be done on a series of things, regardless of whether there's another thing joining them. – mgamba Feb 22 '14 at 19:31
  • @mgamba It means joining the elements in the iterable with a whitespace. You'll get used to that ;) – zhangxaochen Feb 23 '14 at 01:01
5

One way is to use a regular expression:

In [8]: s = "6800       MAIN ST"

In [9]: re.sub(r"\s+", " ", s)
Out[9]: '6800 MAIN ST'
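
One difference from the split/join approach that may matter: `re.sub` collapses leading or trailing whitespace to a single space rather than removing it, so a `.strip()` may be wanted as well. A small sketch with a made-up padded input:

```python
import re

s = "  6800       MAIN ST  "  # hypothetical input with padding on both ends
collapsed = re.sub(r"\s+", " ", s)
print(repr(collapsed))          # ' 6800 MAIN ST '
print(repr(collapsed.strip()))  # '6800 MAIN ST'
```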
NPE