25

I'm building a website with Python/Django. Users submit tags. Each tag can contain multiple words. Each tag has an ID number. I want to make sure tags that are formatted slightly differently are still being recognized as the same tag.

For example, if one user submitted the tag "electric guitar" and the other submitted "electric   guitar" (2 white spaces between the 2 words) I want to be able to recognize they are the same tag.

How do I remove all the extra whitespace and tabs in this case? Thanks.

Continuation

6 Answers

53

Split on any whitespace, then join on a single space.

' '.join(s.split())
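
As a sketch of how this could be applied to the tags in the question, the one-liner can be wrapped in a small helper so that every submitted tag is normalized before it is compared or looked up. The name `normalize_tag` is hypothetical, not something Django provides:

```python
def normalize_tag(tag):
    """Collapse every run of whitespace (spaces, tabs, newlines) to one space.

    str.split() with no arguments splits on any whitespace and ignores
    leading and trailing whitespace, so the result is also trimmed.
    """
    return ' '.join(tag.split())


a = normalize_tag("electric guitar")
b = normalize_tag("electric   guitar")
c = normalize_tag("  electric\tguitar \n")

print(a == b == c)  # True: all three normalize to 'electric guitar'
```

Normalizing once, at the point where the tag is saved, means the stored value can be compared directly later on.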
Ignacio Vazquez-Abrams
20
>>> import re
>>> re.sub(r'\s+', ' ', 'some   test with     ugly  whitespace')
'some test with ugly whitespace'
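
One difference from the split/join approach is worth noting: the substitution turns leading and trailing whitespace into a single space rather than removing it, so a `.strip()` call is probably wanted as well. A short sketch (the helper name `collapse_whitespace` is made up for illustration):

```python
import re

# Compile once if the pattern is applied to many tags.
_WHITESPACE_RUN = re.compile(r'\s+')


def collapse_whitespace(text):
    """Replace each run of whitespace with one space, then trim the ends."""
    return _WHITESPACE_RUN.sub(' ', text).strip()


raw = '  electric \t guitar  '
without_strip = re.sub(r'\s+', ' ', raw)
with_strip = collapse_whitespace(raw)

print(repr(without_strip))  # ' electric guitar '
print(repr(with_strip))     # 'electric guitar'
```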
ThiefMaster
7

I would use Django's slugify function, which collapses runs of whitespace into a single dash and also lowercases the text and strips characters that are not alphanumerics, underscores, or hyphens:

from django.template.defaultfilters import slugify
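
To give an idea of what that kind of normalization does to a tag, here is a rough standard-library approximation. It is not Django's implementation, the function name `simple_slug` is hypothetical, and Django's own `slugify` additionally deals with non-ASCII input, so treat it only as an illustration:

```python
import re


def simple_slug(text):
    """Rough slug: lowercase, drop punctuation, join words with dashes."""
    text = text.lower()
    # Remove anything that is not a word character, whitespace, or a hyphen.
    text = re.sub(r'[^\w\s-]', '', text)
    # Collapse runs of whitespace and hyphens into a single dash.
    return re.sub(r'[-\s]+', '-', text).strip('-_')


print(simple_slug("Electric   Guitar"))   # electric-guitar
print(simple_slug("electric \tguitar!"))  # electric-guitar
```

Because a slug is lowercased and stripped of punctuation, it would also treat "Electric Guitar" and "electric guitar!" as the same tag, which may or may not be what you want.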
Marcus Whybrow
1

"electric guitar".split() will give you ['electric', 'guitar']. So will "electric \tguitar".
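
Note that this relies on calling `split()` with no argument. Passing an explicit separator behaves differently, as this small comparison shows:

```python
tag = "electric  \tguitar"  # two spaces followed by a tab

# No argument: split on any run of whitespace.
print(tag.split())     # ['electric', 'guitar']

# Explicit single space: every space is a separator, tabs are kept.
print(tag.split(' '))  # ['electric', '', '\tguitar']

# Comparing the word lists treats both spellings as the same tag.
print("electric guitar".split() == tag.split())  # True
```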

nmichaels
-2

This function strips everything that is not a digit from a string and returns what is left as an int. I use it all over the place.

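def parseInt(string):
    # Accept text or integers; anything else is rejected.
    # (Python 3: the separate `unicode` type no longer exists.)
    if isinstance(string, (str, int)):
        try:
            # str() lets an int be iterated character by character.
            # Keep only the digit characters, then convert what is left.
            digit = int(''.join(x for x in str(string) if x.isdigit()))
        except ValueError:
            # No usable digits at all, e.g. parseInt('abc').
            return False
        else:
            return digit
    else:
        return False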
zzart
-10

There could be many white spaces like below:

var = "         This      is the example  of how to remove spaces   "

Just use the replace function:

realVar = var.replace(" ",'')

Now the output would be (note that the spaces between the words are removed as well):

Thisistheexampleofhowtoremovespaces 

Just Chill......... :-)

Deepak 'Kaseriya'