I am doing a Find & Replace in Notepad++. Each of the pieces works correctly on its own, but when I put the entire string together, it does not match.

I have an old HTML document I am editing where the original page cut up the text, thus:

<div>
    <h2>Title</h2>
    <p>This is a line of[CR][LF]
    text that was cut to[CR][LF]
    fit on screen.</p>
</div>

I want to find the line breaks that are cutting up the text and eliminate them, but not other line breaks.

My regular expression is:

([A-z0-9]+)[\r\n][ ]{3,}([A-z0-9]+)

It will be replaced with:

$1 $2

I have tried each of the pieces of my regular expression and they all find what I would expect: ([A-z0-9]+) finds text, [\r\n] is finding my line breaks, [ ]{3,} is finding their initial indents, and ([A-z0-9]+) again finds text.

I have even tried sets of expressions and they all work: ([A-z0-9]+)[\r\n] is finding text at the end of a line, [\r\n][ ]{3,} is finding line breaks with the indent at the beginning of the new line, and [ ]{3,}([A-z0-9]+) is finding initial indents followed by text.

Perhaps I have two questions: 1) Is this a Notepad++ bug, or have I missed something with my regular expression? 2) Any ideas on solving this by some other expression?

If it is a bug, I suppose I can just trial and error until something works. It would probably be well to report the bug, though, so if anyone can verify that, it would help.

Thomas
  • Your regex doesn't match your sample text. Also, don't use `[A-z]` because it accepts more characters than just letters. Prefer `[A-Za-z]`. – Toto Mar 27 '14 at 14:12
  • I'd emphasize it more strongly than that: `[A-z]` does *not* mean `[A-Za-z]`. Don't use it. – aliteralmind Mar 27 '14 at 14:16
  • I edited my sample text, not sure if that is what you meant. What additional characters are accepted by `[A-z]`? – Thomas Mar 27 '14 at 14:16
  • See this: http://stackoverflow.com/questions/1658844/is-the-regular-expression-a-z-valid-and-if-yes-then-is-it-the-same-as-a-za-z – aliteralmind Mar 27 '14 at 14:18
  • Your regular expression and the accepted answer both have the form `([A-Za-z0-9]+)important characters([A-Za-z0-9]+)`. The two `[A-Za-z0-9]+` sections are overkill; the `+` can be omitted from both. – AdrianHHH Mar 27 '14 at 17:05
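
To make the `[A-z]` point from the comments concrete, here is a minimal Python sketch that lists the non-letter ASCII characters the class accepts. It uses Python's `re` module rather than Notepad++'s own regex engine, so treat it as an illustration; the assumption is only that both engines expand the range `A-z` by character code, which covers the six characters sitting between `Z` and `a` in ASCII.

```python
import re

# Collect every ASCII character that the class [A-z] accepts.
a_to_z = [chr(c) for c in range(128) if re.fullmatch(r"[A-z]", chr(c))]

# Keep only the ones that are not letters.
extras = [ch for ch in a_to_z if not ch.isalpha()]

print(extras)  # ['[', '\\', ']', '^', '_', '`']
```

With `[A-Za-z]` the same check yields an empty list.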

1 Answer

Your regex:

([A-z0-9]+)[\r\n][ ]{3,}([A-z0-9]+)

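matches only one line-break character, \r OR \n, where it has [\r\n]. Your lines end with [CR][LF], which is two characters, so after [\r\n] consumes the \r, the following [ ]{3,} meets the \n instead of a space and the whole match fails.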

Use this:

([A-Za-z0-9]+)[\r\n]+[ ]{3,}([A-Za-z0-9]+)

or

([A-Za-z0-9]+)\R+[ ]{3,}([A-Za-z0-9]+)
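
As a rough check outside Notepad++, the difference can be reproduced with Python's `re` module. This is only a sketch under a few assumptions: the sample string is typed by hand with `\r\n` line endings and a four-space indent, the replacement is written `\1 \2` because Python does not use Notepad++'s `$1 $2` syntax, and the `\R` variant is left out because Python's `re` does not support `\R`.

```python
import re

# Sample paragraph with Windows (CR LF) line endings and a 4-space indent.
text = "<p>This is a line of\r\n    text that was cut to\r\n    fit on screen.</p>"

# Pattern from the question: exactly one of \r or \n between the words.
original = r"([A-z0-9]+)[\r\n][ ]{3,}([A-z0-9]+)"

# First pattern from the answer: one or more line-break characters.
fixed = r"([A-Za-z0-9]+)[\r\n]+[ ]{3,}([A-Za-z0-9]+)"

# The original finds nothing: after \r comes \n, not a space.
print(re.search(original, text))  # None

# The fixed pattern joins the cut lines with a single space.
joined = re.sub(fixed, r"\1 \2", text)
print(joined)  # <p>This is a line of text that was cut to fit on screen.</p>
```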
Toto
  • I think that was it, thank you. I used `([A-Za-z0-9]+)\r\n[ ]{3,}([A-Za-z0-9]+)`. Does that seem like another proper solution to you? I am not familiar with `\R`, what is that? – Thomas Mar 27 '14 at 14:19
  • @Thomas: `\R` stands for any of `\n` or `\r` or `\r\n`. – Toto Mar 27 '14 at 14:21