I want to remove redundant line breaks using a regular expression. Say I have this text:

"On the Insert tab\n \n\nthe galleries include \n\n items that are designed"

then I want to replace it with

"On the Insert tab\nthe galleries include\nitems that are designed"

So my requirements are:

  1. Runs of multiple newlines are replaced with a single newline
  2. Runs of multiple spaces are replaced with a single space
  3. Leading and trailing spaces are trimmed, both for the whole string and around each newline

I searched a lot but couldn't find a solution. The closest I got was the question "Removing redundant line breaks with regular expressions".

ARIF MAHMUD RANA
  • Here's an answer for the space part: http://stackoverflow.com/questions/2368539/php-replacing-multiple-spaces-with-a-single-space – Sudhir Bastakoti Sep 07 '12 at 10:12
  • @Sudhir I can remove multiple spaces, but I need to remove multiple newlines as well. – ARIF MAHMUD RANA Sep 07 '12 at 10:16
  • See here: http://stackoverflow.com/questions/6360566/replace-multiple-newline-tab-space – Sudhir Bastakoti Sep 07 '12 at 10:17
  • @ARIFMAHMUDRANA Here is a rundown of the technique: 1) Trim the string. 2) Find all blocks of more than one white space character. 3) Iterate over them, if the block contains an `\n` character replace it with a single `\n`, if not replace it with a single space. The easiest way to do this is with `preg_replace_callback()`. I'll even give you the regex you need for free: `/\s+/`. Try and implement it yourself and if you can't get it right show what you come up with, then we can help you see what went wrong. – DaveRandom Sep 07 '12 at 10:19
  • @Sudhir Will it work for spaces in between newlines? – ARIF MAHMUD RANA Sep 07 '12 at 10:21
  • @DaveRandom The text I am working on is articles, so can you tell me what will happen if I have thousands of articles and millions of spaces in those articles? – ARIF MAHMUD RANA Sep 07 '12 at 10:23
  • @ARIF see my answer, I tested it and it is working – Oussama Jilal Sep 07 '12 at 10:25
  • @ARIFMAHMUDRANA Well the quantity of articles is irrelevant, because no matter how you do it, you process them 1 at a time, so you only ever need to load 1 into PCRE at once. Your second point about `millions of spaces` is a good one though, and you have made me realise that `/\s{2,}/` would be a better regex for this purpose. – DaveRandom Sep 07 '12 at 10:27
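
The technique described in the comments above (trim, find blocks of two or more whitespace characters, replace each block with a newline if it contains one, otherwise with a space) can be sketched roughly as follows. This is only a minimal sketch, not code posted by anyone in the thread: the function name `normalize_whitespace` is hypothetical, and it assumes `\r` should count as a line break alongside `\n`.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the thread).
function normalize_whitespace(string $text): string
{
    // Step 1: trim leading and trailing whitespace.
    $text = trim($text);

    // Steps 2 and 3: visit every block of two or more whitespace characters.
    return preg_replace_callback(
        '/\s{2,}/',
        static function (array $m): string {
            // A block containing any line break becomes one "\n";
            // a block without one becomes a single space.
            return strpbrk($m[0], "\r\n") !== false ? "\n" : ' ';
        },
        $text
    );
}

// Usage with the sample text from the question.
echo normalize_whitespace(
    "On the Insert tab\n \n\nthe galleries include \n\n items that are designed"
);
```

Because the pattern only matches blocks of two or more characters, a lone tab or a lone `\r` is left as it is; if those also need converting, `/\s+/` would cover them at the cost of invoking the callback for every single space.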

3 Answers

1

Use this:

echo trim(preg_replace('#(\s)+#',"$1",$string));
Oussama Jilal
  • This will replace `\n\n` with `` when it should be replaced with `\n`. – DaveRandom Sep 07 '12 at 10:31
  • @Yazmat The key to this is to examine each token you detect in the string and see if it contains a new line, return a single new line if it does and return a space if it doesn't. I'll also give you a hint as to how you can easily do that on the fly: `preg_replace_callback()`. Also note that while your expression is sound and will work, it will be quite inefficient because you will be examining every block of whitespace (including a single character) where you only need to examine blocks of two or more whitespace characters. – DaveRandom Sep 07 '12 at 10:43
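
To see what the `(\s)+` pattern in this answer does with mixed whitespace, here is a small experiment. It relies on the PCRE rule that a repeated capturing group keeps only its last iteration, so each run is replaced by whichever whitespace character happens to end it. The variable names are illustrative only.

```php
<?php
$pattern = '#(\s)+#';

// This run is "\n \n\n": it ends with a newline, so a newline is kept.
$endsWithNewline = preg_replace($pattern, '$1', "a\n \n\nb");

// This run is " \n\n ": it ends with a space, so the line break is lost.
$endsWithSpace = preg_replace($pattern, '$1', "a \n\n b");

var_dump($endsWithNewline === "a\nb"); // bool(true)
var_dump($endsWithSpace === "a b");    // bool(true)
```

This is why the pattern alone cannot satisfy the "one newline" requirement when a run of line breaks is followed by a space.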
0
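$text = str_replace("\r\n", "\n", $text); // convert Windows line endings (CRLF) to Unix ones (LF)
// Compare with !== false: strpos() returns 0 when "\n\n" is at the very start, and 0 != false is false.
while (strpos($text, "\n\n") !== false) {
    $text = str_replace("\n\n", "\n", $text);
}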

That will sort out newline characters.
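
As a possible alternative to the loop, a single `preg_replace()` call with PCRE's `\R` escape (which matches `\r\n`, `\n` or `\r` as one line break) could collapse consecutive line breaks in one pass. This is a sketch, not part of the original answer. Note that `\R` also matches vertical tab and form feed, and that, like the loop, this does not handle spaces sitting between the newlines.

```php
<?php
// Sample input with mixed line endings (illustrative data only).
$text = "line one\r\n\r\nline two\n\n\nline three\rline four";

// Collapse every run of one or more line breaks into a single "\n".
$normalized = preg_replace('/\R+/', "\n", $text);

var_dump($normalized === "line one\nline two\nline three\nline four"); // bool(true)
```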

Eligos
0
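$text = trim($text);
// Collapse runs of spaces and tabs (but not line breaks) into a single space.
$text = preg_replace('/[ \t]+/', ' ', $text);
// Collapse each run of line breaks, plus any whitespace around it, into one newline.
$text = preg_replace('/\s*(?:\r\n|\r|\n)\s*/', "\n", $text);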

Thanks to "Removing redundant line breaks with regular expressions".

ARIF MAHMUD RANA