Remove all whitespace in a string

Question

I want to eliminate all the whitespace from a string, on both ends, and in between words.

I have this Python code:

def my_handle(self):
    sentence = ' hello  apple  '
    sentence.strip()

But that only eliminates the whitespace on both sides of the string. How do I remove all whitespace?

What should your result look like? `hello apple`? `helloapple`? – Mark Byers, Nov 25 '11 at 13:57
@JoachimPileborg, not exactly I think, because it's also about reducing whitespace between the words. – wal-o-mat, Nov 25 '11 at 13:59
Correct me if wrong, but "whitespace" is not synonymous with "space characters". The current answer marked as correct does not remove all [whitespace](https://en.wikipedia.org/wiki/Whitespace_character). But, since it's marked as correct it must have answered the intended question? So we should edit the question to reflect the accepted answer? @Kalanamith Did, or do, you want to remove all whitespace or only spaces? – AnnanFay, Dec 06 '16 at 17:23

score 2288 · Accepted Answer · edited Mar 29 '22 at 11:43

If you want to remove leading and trailing whitespace, use str.strip():

>>> "  hello  apple  ".strip()
'hello  apple'

If you want to remove all space characters, use str.replace() (NB this only removes the “normal” ASCII space character ' ' U+0020 but not any other whitespace):

>>> "  hello  apple  ".replace(" ", "")
'helloapple'

If you want to remove duplicated spaces, use str.split() followed by str.join():

>>> " ".join("  hello  apple  ".split())
'hello apple'
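
As a rough sketch, the three approaches above can be compared on an input that also contains a tab and a newline. The sample string and variable names are assumptions for illustration, not part of the question.

```python
# Illustrative input (an assumption): spaces plus a tab and a newline.
s = "  hello \t apple\n "

stripped = s.strip()              # trims both ends only
no_spaces = s.replace(" ", "")    # removes U+0020 only; tab and newline survive
collapsed = " ".join(s.split())   # collapses every whitespace run to one space

print(repr(stripped))   # 'hello \t apple'
print(repr(no_spaces))  # 'hello\tapple\n'
print(repr(collapsed))  # 'hello apple'
```

Printing with repr() makes the surviving tab and newline visible, which a plain print() would hide.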

edited Mar 29 '22 at 11:43

Mateen Ulhaq

answered Nov 25 '11 at 13:56

Cédric Julien

66

The greatness of this function is that it also removes the '\r\n' from the html file I received from Beautiful Soup. – lsheng May 26 '14 at 08:16
67

I like "".join(sentence.split()), this removes all whitespace (spaces, tabs, newlines) from anywhere in sentence. – don May 25 '16 at 17:57
1

Beginner here. Can someone explain to me why print(sentence.join(sentence.split())) results in 'hello hello appleapple'? I just want to understand how the code is processed here. – Yannis Dran Nov 22 '16 at 17:22
6

@YannisDran check the [str.join() documentation](https://docs.python.org/3/library/stdtypes.html#str.join): when you call `sentence.join(str_list)`, you ask Python to join the items of str_list with `sentence` as the separator. – Cédric Julien Nov 24 '16 at 16:24
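
A minimal sketch of what the comment above describes, assuming the `sentence` value from the question; the other variable names are made up for illustration.

```python
sentence = " hello  apple  "
words = sentence.split()  # ['hello', 'apple']

# The string that join() is called on is the separator placed between items.
joined_with_sentence = sentence.join(words)
joined_with_empty = "".join(words)

print(repr(joined_with_sentence))  # 'hello hello  apple  apple'
print(repr(joined_with_empty))     # 'helloapple'
```

The whole original sentence ends up between 'hello' and 'apple', which is why the words appear twice.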
12

`"".join(sentence.split())` is indeed the canonical solution, efficiently removing *all* whitespace rather than merely spaces. [Mark Byers](https://stackoverflow.com/users/61974/mark-byers)' [excellent answer](https://stackoverflow.com/a/8270124/2809027) should probably have been accepted in lieu of this less applicable answer. – Cecil Curry Jul 04 '17 at 06:44
If more than two whitespace characters remain at the end of the string, an extra `sentence.strip().rstrip()` can help; the same goes for `sentence.strip().lstrip()` at the beginning. Not perfect, but it works without needing `re`, in case a plain `.strip()` misses them. – Rolands.EU Feb 25 '20 at 22:36
The solution works but is slightly confusing - when using replace, it returns the string, and doesn't replace in-place, so you need `sentence = sentence.replace(" ", "")` – Lenka Pitonakova Dec 16 '20 at 18:54
Agree with @CecilCurry If the string can be anything and the intention is to replace all whitespaces, it should be ` "".join(sentence.split()) ` – grimur82 Dec 04 '22 at 10:12

score 422 · Answer 2 · edited Jan 20 '14 at 23:45

To remove only spaces use str.replace:

sentence = sentence.replace(' ', '')

To remove all whitespace characters (space, tab, newline, and so on) you can use split then join:

sentence = ''.join(sentence.split())

or a regular expression:

import re
pattern = re.compile(r'\s+')
sentence = re.sub(pattern, '', sentence)

If you want to only remove whitespace from the beginning and end you can use strip:

sentence = sentence.strip()

You can also use lstrip to remove whitespace only from the beginning of the string, and rstrip to remove whitespace from the end of the string.
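
A short sketch of the one-sided variants next to full removal. The sample string is an assumption chosen to include a tab and a newline.

```python
s = "\t hello  apple \n"

left_trimmed = s.lstrip()          # leading whitespace removed
right_trimmed = s.rstrip()         # trailing whitespace removed
all_removed = "".join(s.split())   # every whitespace character removed

print(repr(left_trimmed))   # 'hello  apple \n'
print(repr(right_trimmed))  # '\t hello  apple'
print(repr(all_removed))    # 'helloapple'
```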

edited Jan 20 '14 at 23:45

Randall Cook

answered Nov 25 '11 at 13:54

Mark Byers

Note: You don't need the compile step; re.sub (and friends) cache the compiled pattern. See also [Emil's answer](http://stackoverflow.com/a/28607213/1240268). – Andy Hayden Apr 22 '15 at 18:03
1

python3: `yourstr.translate(str.maketrans('', '', ' \n\t\r'))` – deed02392 Apr 17 '19 at 12:44

Emil Stenström · Answer 3 · 2018-07-26T08:55:52.997

153

An alternative is to use regular expressions and match these strange white-space characters too. Here are some examples:

Remove ALL spaces in a string, even between words:

import re
sentence = re.sub(r"\s+", "", sentence, flags=re.UNICODE)

Remove spaces in the BEGINNING of a string:

import re
sentence = re.sub(r"^\s+", "", sentence, flags=re.UNICODE)

Remove spaces in the END of a string:

import re
sentence = re.sub(r"\s+$", "", sentence, flags=re.UNICODE)

Remove spaces both in the BEGINNING and in the END of a string:

import re
sentence = re.sub(r"^\s+|\s+$", "", sentence, flags=re.UNICODE)

Remove ONLY DUPLICATE spaces:

import re
sentence = " ".join(re.split(r"\s+", sentence, flags=re.UNICODE))

(All examples work in both Python 2 and Python 3)
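
To illustrate the "strange" whitespace characters mentioned above, here is a hedged Python 3 sketch. The sample string, with a NO-BREAK SPACE (U+00A0) and an EM SPACE (U+2003), is an assumption for illustration.

```python
import re

# Assumed sample: NO-BREAK SPACE (U+00A0) and EM SPACE (U+2003) around the words.
s = "\u00a0hello\u2003apple "

by_replace = s.replace(" ", "")    # only U+0020 is removed
by_regex = re.sub(r"\s+", "", s)   # \s matches Unicode whitespace for str patterns
by_split = "".join(s.split())      # split() also treats these as whitespace

print(ascii(by_replace))  # '\xa0hello\u2003apple'
print(by_regex)           # helloapple
print(by_split)           # helloapple
```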

edited Jul 26 '18 at 08:55

answered Feb 19 '15 at 13:05

Emil Stenström

Did not work for "\u202a1234\u202c". Gives the same output: u'\u202a1234\u202c' – Sarang Jul 06 '16 at 17:19
1

@Sarang: Those are not whitespace characters (google them and you'll see) but "General Punctuation". My answer only deals with removing characters classified as whitespace. – Emil Stenström Jul 07 '16 at 18:04
This is the only solution I see here that removes those damn pesky unicode whitespace characters, thanks fam – CapnShanty Oct 16 '19 at 14:13

score 62 · Answer 4 · edited Oct 01 '20 at 06:48

"Whitespace" includes spaces, tabs, and CRLF. So an elegant one-liner string method we can use is str.translate:

Python 3

' hello  apple '.translate(str.maketrans('', '', ' \n\t\r'))

OR if you want to be thorough:

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))

Python 2

' hello  apple'.translate(None, ' \n\t\r')

OR if you want to be thorough:

import string
' hello  apple'.translate(None, string.whitespace)
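
For Python 3, a minimal sketch that builds the translation table once and reuses it. The helper and constant names are hypothetical. Note that string.whitespace only covers ASCII whitespace, so a character such as U+00A0 is left in place.

```python
import string

# Build the table once; the third argument of str.maketrans lists
# the characters to delete.
STRIP_ASCII_WS = str.maketrans("", "", string.whitespace)


def remove_ascii_whitespace(text):  # hypothetical helper name
    """Delete every ASCII whitespace character from text."""
    return text.translate(STRIP_ASCII_WS)


print(remove_ascii_whitespace(" hello \t\r\n apple "))      # helloapple
print(ascii(remove_ascii_whitespace("hello\u00a0apple")))   # 'hello\xa0apple'
```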

edited Oct 01 '20 at 06:48

ib.

answered Nov 28 '15 at 03:36

MaK

2

This won't help with Unicode whitespace like `\xc2\xa0` – Suzana Dec 29 '15 at 18:07
5

`ans.translate( None, string.whitespace )` produces only `builtins.TypeError: translate() takes exactly one argument (2 given)` for me. Docs says that argument is a translate table, see string.maketrans(). But see comment by Amnon Harel, below. – user405 Sep 03 '17 at 21:07
2

`' hello apple'.translate(str.maketrans('', '', string.whitespace))` Note: it's better to store the translation table in a variable if you intend to do this multiple times. – Shogan Aversa-Druesne Dec 10 '18 at 19:12

score 19 · Answer 5 · answered Nov 25 '11 at 13:56

For removing whitespace from beginning and end, use strip.

>>> "  foo bar   ".strip()
'foo bar'

answered Nov 25 '11 at 13:56

wal-o-mat

3

The question specifically asks for removing all of the whitespace and not just at the ends. Please take notice. – Shayan Shafiq Mar 04 '20 at 11:38
This answer is irrelevant to this question – Scott Sep 21 '21 at 09:50

score 12 · Answer 6 · edited Dec 25 '19 at 01:34

' hello  \n\tapple'.translate({ord(c):None for c in ' \n\t\r'})

MaK already pointed out the "translate" method above. And this variation works with Python 3 (see this Q&A).

edited Dec 25 '19 at 01:34

Asclepius

answered Sep 26 '16 at 09:54

Amnon Harel

2

Thanks! Or, `xxx.translate( { ord(c) :None for c in string.whitespace } )` for thoroughness. – user405 Sep 03 '17 at 21:10

score 10 · Answer 7 · answered Apr 06 '18 at 20:51

In addition, strip has some variations:

Remove spaces in the BEGINNING and END of a string:

sentence = sentence.strip()

Remove spaces in the BEGINNING of a string:

sentence = sentence.lstrip()

Remove spaces in the END of a string:

sentence = sentence.rstrip()

All three string methods, strip, lstrip, and rstrip, can take a string of characters to strip, with the default being all whitespace. This can be helpful when you are working with something particular; for example, you could remove only spaces but not newlines:

" 1. Step 1\n".strip(" ")

Or you could remove extra commas when reading in a string list:

"1,2,3,".strip(",")

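One detail worth sketching: the argument to strip is treated as a set of characters, not as a literal prefix or suffix. The first sample string below is an assumption for illustration; the other two come from the examples above.

```python
# Every leading/trailing character found in the argument is removed,
# in any order, until a character outside the set is reached.
a = "xyxhelloyx".strip("xy")
b = " 1. Step 1\n".strip(" ")   # only spaces are stripped; the newline stays
c = "1,2,3,".strip(",")

print(repr(a))  # 'hello'
print(repr(b))  # '1. Step 1\n'
print(repr(c))  # '1,2,3'
```
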
score 7 · Answer 8 · edited Mar 14 '18 at 10:19

Be careful:

strip does a rstrip and lstrip (removes leading and trailing spaces, tabs, returns and form feeds, but it does not remove them in the middle of the string).

If you only replace spaces and tabs you can end up with hidden CRLFs that appear to match what you are looking for, but are not the same.
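
A small sketch of the pitfall described above, assuming a line that came from a file with Windows line endings. Using repr() makes the leftover carriage return visible; split and join is one way to remove it along with all other whitespace.

```python
# Assumed input: a line read from a file with CRLF line endings.
line = "apple\r\n"

# Removing spaces, tabs, and newlines still leaves the carriage return behind.
partly_cleaned = line.replace(" ", "").replace("\t", "").replace("\n", "")
fully_cleaned = "".join(line.split())

print(partly_cleaned == "apple")  # False
print(repr(partly_cleaned))       # 'apple\r'
print(fully_cleaned == "apple")   # True
```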

edited Mar 14 '18 at 10:19

Peter Mortensen

answered Nov 12 '14 at 19:30

yan bellavance

Although this is a good point, this isn't really an answer and should be a comment unless you provide a solution. Would you care to provide a solution for this is exactly what I'm looking for? Cheers – Dpedrinha Dec 10 '20 at 19:44

handle · Answer 9 · 2020-03-13T16:02:14.063

6

eliminate all the whitespace from a string, on both ends, and in between words.

>>> import re
>>> re.sub(r"\s+",  # one or more whitespace characters
...     '',  # replace with the empty string (i.e. remove)
...     ''' hello
...    apple
... ''')
'helloapple'

https://en.wikipedia.org/wiki/Whitespace_character

Python docs:

edited Mar 13 '20 at 16:02

answered Mar 13 '20 at 15:51

handle

I know `re` has been suggested before, but I found that the actual answer to the question title was a bit hidden amongst all the other options. – handle Mar 13 '20 at 15:58

score 5 · Answer 10 · answered Jul 29 '21 at 14:33

I use split() to ignore all whitespaces and use join() to concatenate strings.

sentence = ''.join(' hello  apple  '.split())
print(sentence) #=> 'helloapple'

I prefer this approach because it is only an expression (not a statement).
It is easy to use, and it can be used without binding the result to a variable.

print(''.join(' hello  apple  '.split())) # no need to bind to a variable

score 3 · Answer 11 · answered Oct 24 '16 at 12:46

import re
sentence = ' hello  apple'
re.sub(' ', '', sentence)    # 'helloapple' (removes all spaces)
re.sub('  ', ' ', sentence)  # ' hello apple' (replaces each double space with one)

answered Oct 24 '16 at 12:46

PrabhuPrakash

4

The question was to remove all whitespace, which includes tabs and newline characters; this snippet will only remove regular spaces. – Maximilian Peters Oct 24 '16 at 16:59

Jane Kathambi · Answer 12 · 2022-02-28T08:04:44.960

3

In the following script we import the regular expression module which we use to substitute one space or more with a single space. This ensures that the inner extra spaces are removed. Then we use strip() function to remove leading and trailing spaces.

# Import regular expression module
import re

# Initialize string
a = "     foo      bar   "

# First replace any number of spaces with a single space
a = re.sub(' +', ' ', a)

# Then strip any leading and trailing spaces.
a = a.strip()

# Show results
print(a)
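
Note that the pattern ' +' above matches only the space character. As a hedged variation, assuming tabs and newlines should be collapsed as well, `\s+` can be used instead; the sample string and variable names are made up for illustration.

```python
import re

# Assumed input containing a tab and a newline between the words.
a = "  foo \t\n  bar   "

only_spaces = re.sub(" +", " ", a).strip()       # tab and newline survive
any_whitespace = re.sub(r"\s+", " ", a).strip()  # every run becomes one space

print(repr(only_spaces))     # 'foo \t\n bar'
print(repr(any_whitespace))  # 'foo bar'
```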

edited Feb 28 '22 at 08:04

answered Feb 19 '22 at 10:59

Jane Kathambi

2

It helps more if you supply an explanation why this is the preferred solution and explain how it works. We want to educate, not just provide code. – the Tin Man Feb 21 '22 at 05:21
@theTinMan thanks for the recommendation I just added the explanations. – Jane Kathambi Feb 28 '22 at 08:05

user856387 · Answer 13 · 2022-07-25T10:49:09.183

0

I found that this works the best for me:

test_string = '  test   a   s   test '
string_list = [s.strip() for s in str(test_string).split()]
final_string = ' '.join(string_list)
# final_string: 'test a s test'

It strips leading and trailing whitespace and collapses every inner run of whitespace (spaces, tabs, etc.) into a single space.

edited Jul 25 '22 at 10:49

answered Jul 25 '22 at 10:08

user856387

score 0 · Answer 14 · answered Aug 31 '23 at 04:46

All strings are Unicode in Python 3; as a consequence, since str.split() splits on all whitespace characters, it also splits on Unicode whitespace characters. So the split + join syntax (as in 1, 2, 3) will produce the same output as re.sub with the UNICODE flag (as in 4); in fact, the UNICODE flag is redundant here (as in 2, 5, 6, 7).

import re
import sys

# all unicode characters
sentence = ''.join(map(chr, range(sys.maxunicode+1)))

# remove all white space characters
x = ''.join(sentence.split())
y = re.sub(r"\s+", "", sentence, flags=re.UNICODE)
z = re.sub(r"\s+", "", sentence)

x == y == z      # True

In terms of performance, since Python's string methods are optimized, they are much faster than regex. As the following timeit test shows, when removing all whitespace characters from the string in the OP, the string methods are over 7 times faster than the re option.

import timeit

setup = """
import re
s = ' hello  \t apple  '
"""

t1 = min(timeit.repeat("''.join(s.split())", setup))
t2 = min(timeit.repeat(r"re.sub(r'\s+', '', s, flags=re.UNICODE)", setup))

t2 / t1  # 7.868004799367726

score -2 · Answer 15 · answered Oct 10 '20 at 19:36

Try this. Instead of using re, I think using split with strip is much better:

def my_handle(self):
    sentence = ' hello  apple  '
    ' '.join(x.strip() for x in sentence.split())  # 'hello apple'
    ''.join(x.strip() for x in sentence.split())   # 'helloapple'
