23

I have some text which looks like this -

"    tushar is a good      boy     "

Using JavaScript, I want to remove all the extra whitespace from a string.

The resulting string should have no runs of multiple whitespace characters, only single spaces between words. Moreover, there should be no whitespace at the start or the end. So my final output should look like this -

"tushar is a good boy"

I am using the following code at the moment-

str.replace(/(\s\s\s*)/g, ' ')

This obviously fails because it doesn't take care of the white spaces in the beginning and end of the string.
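
A quick check of that code on the sample string shows the problem. This is only a sketch; the variable names `input` and `collapsed` are illustrative.

```javascript
// Sample string from the question.
const input = "    tushar is a good      boy     ";

// Every run of two or more whitespace characters becomes one space,
// including the runs at the start and the end of the string.
const collapsed = input.replace(/(\s\s\s*)/g, ' ');

console.log(JSON.stringify(collapsed)); // " tushar is a good boy "
```

The inner runs collapse as intended, but one space survives at each end.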

tusharmath
  • Can you use string.trim() as a solution? Combining the two lines of code. – gbam Aug 05 '13 at 19:08
  • @gbam `trim()` would only trim the beginning and end of the string. That wouldn't account for the extra spaces in "good      boy". – ddavison Aug 05 '13 at 19:09
  • @sircapsalot, correct. You would combine the two solutions. Trimming the middle ones using regex and the outer ones using trim. I'll edit my comment to clarify. – gbam Aug 05 '13 at 19:10
  • Do you have newlines that need to be preserved? Or is the only whitespace tabs/spaces? – Joseph Myers Aug 05 '13 at 19:12
  • @JosephMyers: I forgot to mention that, I want to remove new lines and tabs also. – tusharmath Aug 05 '13 at 19:18
  • So any group of new line(s) or tab(s) between words should be replaced by a single space? And any leading or trailing ones should be deleted completely? – Joseph Myers Aug 05 '13 at 19:22

7 Answers

35

This can be done in a single String#replace call:

var repl = str.replace(/^\s+|\s+$|\s+(?=\s)/g, "");

// gives: "tushar is a good boy"
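
As a rough breakdown of that pattern, the sketch below labels the three alternatives and runs the replacement on two inputs. The names `pattern`, `cleaned`, and `mixed` and the second input string are made up for illustration. Note that the lookahead never consumes the last whitespace character of a run, so a run that ends in a tab leaves a tab behind instead of a space.

```javascript
// The three alternatives, tried in order at each position:
//   ^\s+       leading whitespace
//   \s+$       trailing whitespace
//   \s+(?=\s)  whitespace that is followed by more whitespace
const pattern = /^\s+|\s+$|\s+(?=\s)/g;

const cleaned = "    tushar is a good      boy     ".replace(pattern, "");
console.log(JSON.stringify(cleaned)); // "tushar is a good boy"

// The last whitespace character of each inner run is kept as-is,
// even when it is a tab or a newline.
const mixed = "hello  \tworld".replace(pattern, "");
console.log(JSON.stringify(mixed)); // "hello\tworld"
```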
anubhava
  • This is really awesome! – tusharmath Aug 05 '13 at 19:28
  • what does `(?=\s)` do? – Reactgular Aug 05 '13 at 19:30
  • `(?=\s)` makes sure that there is at least one following space NOT included in the match. This way only the extra spaces are removed and not the final space. However, the problem is that whatever the final whitespace character was (tab, newline, etc.), it will remain and not be replaced with an actual space. – Joseph Myers Aug 05 '13 at 19:30
  • @MathewFoscarini: As Joseph commented, it is a positive lookahead that ensures that `\s+` is always followed by one more whitespace character. Read more about it here: www.regular-expressions.info/lookaround.html – anubhava Aug 05 '13 at 19:32
  • @anubhava I did not say that the final whitespace will remain. I'm saying that whitespace that is separating words will not be changed to actual spaces if necessary. E.g., "hello[space][space][tab]world" will be changed to "hello[tab]world" – Joseph Myers Aug 05 '13 at 19:37
  • @JosephMyers: Oh OK, got you. Yes, the final whitespace between words will remain (removed my previous comment). I believe that is how the OP wants it to behave. – anubhava Aug 05 '13 at 19:39
  • Yes, that is likely what the OP wants. Your answer would leave tab-separated values as tab-separated. He didn't reply to my question to specify for sure. Edit: Actually, he did, sorry. He said, "I want to remove new lines and tabs also." – Joseph Myers Aug 05 '13 at 19:40
  • @anubhava What does `/g` mean? I tested this regular expression on the regex101 website and it shows an error when I type `/g`; otherwise the regex works fine. – Hassan Fayyaz Sep 08 '21 at 07:30
  • Don't type `/g` in the regex field on the regex101 site. `g` is for global mode. – anubhava Sep 08 '21 at 07:35
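
To illustrate what the `g` flag changes in code, here is a small sketch; the variable names and the input string are illustrative. Without `g`, `String#replace` with a regex replaces only the first match; with `g`, it replaces every match.

```javascript
const text = "a   b   c";

// No g flag: only the first run of whitespace is replaced.
const firstOnly = text.replace(/\s+/, " ");

// g flag: every run of whitespace is replaced.
const allRuns = text.replace(/\s+/g, " ");

console.log(JSON.stringify(firstOnly)); // "a b   c"
console.log(JSON.stringify(allRuns));   // "a b c"
```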
8

This works nicely:

function normalizeWS(s) {
    s = s.match(/\S+/g);
    return s ? s.join(' ') : '';
}
  • trims leading whitespace
  • trims trailing whitespace
  • normalizes tabs, newlines, and multiple spaces to a single regular space
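
A quick usage sketch, repeating the function so that it runs on its own. The second and third input strings are made up for illustration.

```javascript
function normalizeWS(s) {
    // match() returns an array of the non-whitespace runs,
    // or null when the string contains none.
    s = s.match(/\S+/g);
    return s ? s.join(' ') : '';
}

console.log(JSON.stringify(normalizeWS("    tushar is a good      boy     "))); // "tushar is a good boy"
console.log(JSON.stringify(normalizeWS("hello  \t\nworld")));                     // "hello world"
console.log(JSON.stringify(normalizeWS("   \t\n ")));                             // ""
```

The `null` check matters for the last case: a string with no words at all would otherwise throw on `join`.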
Joseph Myers
  • Would it be possible to keep line breaks? Ideally, it would keep single line breaks as is (and remove other whitespace around them) and collapse multiple line breaks, with other optional whitespace in between, into two line breaks. – CodeManX Oct 30 '14 at 20:10
  • @CoDEmanX Yes, that's possible as well. The efficient way to do what you are asking is to split the entire string into the contents of lines (using `\n+` or `(?:\r\n)+` as the line separators) and then apply `normalizeWS` to each line, and then rejoin the lines with a single `\n`. (Or a `\r\n` if you wish.) – Joseph Myers Oct 31 '14 at 04:26
  • Good idea, but it requires an additional filtering of the normalized lines to throw away subsequent empty strings to collapse 2+ line breaks into two. I found a way, maybe not the most efficient, but it works as desired: `str.match(/[^ \t]+/g).join(' ').replace(/(?:\n[ \t]*){2,}/, '\n\n')` – CodeManX Oct 31 '14 at 10:59
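
The split-and-rejoin idea from the comments could be sketched roughly as follows. The helper name `normalizeLines` is hypothetical, and dropping lines that end up empty is an assumption. This version collapses every run of line breaks into a single `\n`; it does not implement the two-line-break variant.

```javascript
function normalizeWS(s) {
    const words = s.match(/\S+/g);
    return words ? words.join(' ') : '';
}

// Hypothetical helper: normalize whitespace within each line, keep line breaks.
function normalizeLines(text) {
    return text
        .split(/(?:\r?\n)+/)          // split on runs of \n or \r\n
        .map(function (line) {        // clean up each line separately
            return normalizeWS(line);
        })
        .filter(function (line) {     // assumption: drop lines that end up empty
            return line !== '';
        })
        .join('\n');
}

console.log(JSON.stringify(normalizeLines("  a   b \n\n\t c  d\r\n e "))); // "a b\nc d\ne"
```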
4

Try this:

str.replace(/\s+/g, ' ').trim()

If `trim` isn't available in your environment, add it as described here:

Trim string in JavaScript?
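
A minimal sketch of this approach with a regex fallback for environments that lack `String.prototype.trim` (it was standardized in ES5). The helper name `collapseWhitespace` is hypothetical.

```javascript
// Hypothetical helper: collapse runs of whitespace, then strip both ends.
function collapseWhitespace(str) {
    var collapsed = str.replace(/\s+/g, ' ');
    // Use the native trim when available; otherwise strip the ends with a regex.
    return typeof collapsed.trim === 'function'
        ? collapsed.trim()
        : collapsed.replace(/^\s+|\s+$/g, '');
}

console.log(JSON.stringify(collapseWhitespace("    tushar is a good      boy     "))); // "tushar is a good boy"
```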

Community
Daniel A. White
4

Since everyone is complaining about .trim(), you can use the following:

str.replace(/\s+/g,' ' ).replace(/^\s/,'').replace(/\s$/,'');

JSFiddle

DSurguy
  • +1 Not sure why anyone downvoted this answer. Looks to be the most universally compatible and efficient single-statement answer out of the bunch to me. (Also see: [Faster JavaScript Trim](http://blog.stevenlevithan.com/archives/faster-trim-javascript) for an excellent discussion of the best way to trim a string using JavaScript) – ridgerunner Aug 05 '13 at 21:46
1

This regex may be useful for removing the leading and trailing whitespace:

/^\s+|\s+$/g
I_love_vegetables
-1

Try:

str.replace(/^\s+|\s+$/g, '')
   .replace(/\s+/g, ' ');
Jacob
  • I don't want to use multiple regex statements. That doesn't seem elegant :) – tusharmath Aug 05 '13 at 19:19
  • Multiple reg-ex statements are often more efficient. And I've seen problems where huge, slow single regular expressions could be combined into a few simple ones, but *only* if the programmer was willing to eschew "elegance" and use multiple statements. Elegant is subjective. – Joseph Myers Aug 05 '13 at 19:35
-1

Try:

var str = "    tushar is a good      boy     ";
str = str.replace(/^\s+|\s+$/g,'').replace(/(\s\s\s*)/g, ' ');

The first replace removes leading and trailing whitespace from the string; the second collapses each run of two or more whitespace characters into a single space.

cavaliercyber