7

How do I replace multiple \n's with just one? For example, if a user enters:

blah 

blahdy blah



blah blah

I want it to end up looking like this:

blah
blahdy blah
blah blah

I know I could loop through with a while() but would rather use a regular expression, since I believe it's more efficient.

Peter Mortensen
mazlix

6 Answers

18

This worked for me:

string = string.replace(/\n+/g, '\n');

As others have said, it replaces each occurrence of one or more consecutive newline characters (\n+) with just one.

The g effects a "global" replace, meaning it replaces all matches in the string rather than just the first one.
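To make the effect of the g flag concrete, here is a small sketch using the sample text from the question. The variable names are only illustrative.

```javascript
// Sample input with runs of two and four newlines.
const input = "blah\n\nblahdy blah\n\n\n\nblah blah";

// Without the g flag, only the first run of newlines is collapsed.
const firstOnly = input.replace(/\n+/, "\n");

// With the g flag, every run is collapsed.
const allRuns = input.replace(/\n+/g, "\n");

console.log(JSON.stringify(firstOnly)); // "blah\nblahdy blah\n\n\n\nblah blah"
console.log(JSON.stringify(allRuns));   // "blah\nblahdy blah\nblah blah"
```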

Edit: If you want to take into account other operating systems' line ending styles as well (e.g., \r\n), you could do a multi-step replace:

string = string.replace(/(\r\n)+/g, '\r\n') // for Windows
    .replace(/\r+/g, '\r')                  // for Mac OS 9 and prior
    .replace(/\n+/g, '\n');                 // for everything else

OR (thanks to Renesis for this idea):

string = string.replace(/(\r\n|\r|\n)+/g, '$1');

If you know in advance what sort of text you're dealing with, the above is probably overkill, as it carries an obvious performance cost.
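As a rough sketch of how the single-call version behaves on each line ending style: when a capture group is repeated with +, $1 holds whatever the last repetition matched. The helper name collapsePreserving is made up for this example.

```javascript
// Hypothetical helper name; wraps the single-call replace shown above.
function collapsePreserving(text) {
  // $1 is the text captured by the last repetition of the group,
  // so a run of "\r\n" pairs collapses to one "\r\n", and so on.
  return text.replace(/(\r\n|\r|\n)+/g, "$1");
}

console.log(JSON.stringify(collapsePreserving("a\r\n\r\n\r\nb"))); // "a\r\nb"
console.log(JSON.stringify(collapsePreserving("a\r\r\rb")));       // "a\rb"
console.log(JSON.stringify(collapsePreserving("a\n\n\nb")));       // "a\nb"
```

One thing to keep in mind: a run that mixes styles collapses to whichever ending came last in that run, so "a\n\r\nb" becomes "a\r\nb".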

Community
Dan Tao
  • As I said on the other answer: If the input comes from an application on Windows that uses default OS line endings, [or any other OS with CR+LF line endings](http://en.wikipedia.org/wiki/Newline#Representations) this will not work, because the characters will look like: `\r\n\r\n\r\n`. Notice there are no multiple `\n` characters in a row. – Nicole Jul 13 '11 at 22:38
  • In your edit, `.replace(/(\r\n)+/g, '\r\n')` should be `.replace(/(\r\n)+/g, '\n')` so that the second `.replace()` doesn't need to deal with the `\r`. – user113716 Jul 13 '11 at 22:44
  • @patrick dw: Not if the OP wants to preserve the original line ending format, right? – Dan Tao Jul 13 '11 at 22:46
  • Yes, if you want to preserve it, then you'd keep it the same. – user113716 Jul 13 '11 at 22:48
  • @patrick I think the edit looks fine as is. This is safer for *displaying* the resulting string correctly in the originating operating system, since it preserves line ending. There is no conflict with the second `replace` because `\r\n` line endings have no repeating `\r` characters. @Dan +1 for your edits. – Nicole Jul 13 '11 at 22:48
  • @Dan thinking about the line-ending-preservation thing, what about `string.replace(/(\r\n|\r|\n)+/g, "$1");`? It works pretty well in my tests. – Nicole Jul 13 '11 at 22:58
  • @Renesis: I actually had a similar thought, but my RegEx-fu is weak and I forgot how you refer to the first captured group so I was like "Eh, screw it." But yeah, good call; that definitely *looks* a bit neater (at least) than chained `replace` calls. – Dan Tao Jul 13 '11 at 23:11
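The CR+LF point raised in these comments can be checked with a short sketch. The sample string is an assumption standing in for text produced on Windows.

```javascript
// Windows-style input: each line break is the two-character sequence "\r\n".
const windowsText = "blah\r\n\r\nblahdy blah\r\n\r\n\r\nblah blah";

// No two "\n" characters are adjacent, so /\n+/g leaves the text as it was.
const unchanged = windowsText.replace(/\n+/g, "\n");
console.log(unchanged === windowsText); // true

// Matching whole "\r\n" pairs collapses the blank lines and keeps the style.
const collapsedCrlf = windowsText.replace(/(\r\n)+/g, "\r\n");
console.log(JSON.stringify(collapsedCrlf)); // "blah\r\nblahdy blah\r\nblah blah"
```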
7

Simply use a character class to replace one or more \n or \r with a single \n:

var newString = oldString.replace(/[\n\r]+/g, "\n");

Test HTML:

<script>
function process(id) {
    var oldString = document.getElementById(id).value;
    var newString = oldString.replace(/[\n\r]+/g, "\n");
    return newString;
}
</script>
<textarea id="test"></textarea>
<button onclick="alert(process('test'));">Test</button>
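For a quick check outside the browser, the same replacement can be tried without the DOM lookup. This is only a sketch; collapseToLf and the sample string are names and data invented for the example.

```javascript
// Same replacement as process() above, minus the DOM lookup.
function collapseToLf(oldString) {
  return oldString.replace(/[\n\r]+/g, "\n");
}

// Mixed line endings: Windows (\r\n), classic Mac (\r), and Unix (\n).
const mixed = "one\r\n\r\ntwo\r\rthree\n\n\nfour";
console.log(JSON.stringify(collapseToLf(mixed))); // "one\ntwo\nthree\nfour"
```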

Please note: The modern web and most modern applications handle a single \n line ending gracefully, which is why I've used it as my replacement string here. However, some more primitive applications on CR+LF (\r\n) operating systems will display all text run together unless they get the full \r\n. In that case you could use Dan Tao's answer, which provides a solution to preserve line endings, or the following, which accomplishes a similar thing in one call:

string.replace(/(\r\n|\r|\n)+/g, "$1");
Community
Nicole
3
var blah = "blah\n\nblahdy blah\n\n\n\nblah blah\n";
var squeezed = blah.replace(/\n+/g, "\n");
MarkD
  • If the input comes from an application on Windows that uses default OS line endings, [or any other OS with CR+LF line endings](http://en.wikipedia.org/wiki/Newline#Representations) this will not work, because the text will look like: `\r\n\r\n\r\n`. Notice there are no multiple `\n` characters in a row. – Nicole Jul 13 '11 at 22:37
  • @Renesis you are correct. I don't do a lot of development in Windows anymore so neglected that. @mazlix you should accept Renesis' answer as it's both correct and has an example. – MarkD Jul 13 '11 at 22:44
2

You can use a regex:

xx.replace(/\n+/g, "\n");
Mo Valipour
1

Or without the overhead of the JavaScript engine setting up a finite state machine to run the regular expression, you can do it with arrays:

s = s.split("\n").join("");
Jonathan Leffler
Griffin
  • Interesting, you make it sound heavyweight to run regular expressions, but is there real-life evidence on performance differences? – Nicole Jul 13 '11 at 22:40
  • Split and join are both O(n), with relatively little overhead. Matching a string against a regular expression is O(mn), where m is the number of states in the FSM, and the FSM also needs to be set up. You don't need real-world evidence when the maths tells you otherwise ;) – Griffin Jul 13 '11 at 22:45
  • I beg to differ, as the constants to those operations play very heavily in situations like this. – Nicole Jul 13 '11 at 23:00
  • Well it's a case of whether the complexity of operations in the split and join (very simple procedures) outweighs the overhead in setting up and executing the regular expression or not. I know which I'd rather go for. Also note that n for the join is smaller than for the split, though the elements are bigger. – Griffin Jul 13 '11 at 23:07
  • I think it's worth being aware of the options. +1 – Nicole Jul 13 '11 at 23:26
  • Cheers, it's also good to question everything as well. It's a shame people just pick the thing that works without looking deeper into it. – Griffin Jul 13 '11 at 23:31
  • Except that it doesn't work, -1. And I also bet it is slower on strings with a large number of lines. – Qtax Jul 14 '11 at 01:37
  • @Qtax, 1) It does work, I've done it many times - feel free to try it, and 2) Do you understand asymptotic notation? – Griffin Jul 14 '11 at 01:45
  • @Griffin, http://jsfiddle.net/PyyCs/ And please feel free to demonstrate `s/\n+//g` vs `split("\n").join("")` performance (even though a string replace would work much better). – Qtax Jul 14 '11 at 01:55
  • @Qtax, +1 yes you are correct. In my haste I read the question as wanting to remove all linebreaks. – Griffin Jul 14 '11 at 02:06
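As the comments establish, split("\n").join("") removes every line break rather than collapsing repeats. A possible regex-free variant that does collapse them is sketched below. The filter step and the helper name collapseWithSplit are additions for this example, not part of the original answer, and no performance claim is made for it.

```javascript
// Hypothetical helper: collapse repeated newlines without a regular expression.
function collapseWithSplit(s) {
  return s
    .split("\n")                        // one array entry per line
    .filter((line) => line.length > 0)  // drop the empty entries left by repeated "\n"
    .join("\n");                        // rejoin with a single newline
}

console.log(JSON.stringify(collapseWithSplit("blah\n\nblahdy blah\n\n\n\nblah blah")));
// "blah\nblahdy blah\nblah blah"
```

Note that this differs from the regex at the edges of the string: leading and trailing newlines are removed entirely, so "a\n\n" becomes "a" rather than "a\n".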
1

Doing the same for spaces with detailed examples: Regex to replace multiple spaces with a single space. Example:

var str = "The      dog        has a long tail,      and it is RED!";
str = str.replace(/ {2,}/g,' ');
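The same {2,} quantifier can be carried over to newlines. This is a small sketch with illustrative variable names; it gives the same result as /\n+/g but only matches runs of two or more.

```javascript
// The spaces example from above, with the result kept in its own variable.
const spaced = "The      dog        has a long tail,      and it is RED!";
const singleSpaced = spaced.replace(/ {2,}/g, " ");
console.log(singleSpaced); // The dog has a long tail, and it is RED!

// The same quantifier applied to newlines: only runs of two or more are touched.
const lines = "blah\n\nblahdy blah\n\n\n\nblah blah";
const squeezedLines = lines.replace(/\n{2,}/g, "\n");
console.log(JSON.stringify(squeezedLines)); // "blah\nblahdy blah\nblah blah"
```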
Community
Horst Walter