1

I would like to replace all names in an XML file tagged by <name> with, let's say, xyz. In other words, replace everything (including whitespace) between the tags. What am I doing wrong?

Search: (<name>)(.*)(</name>)
Replace: \1xyz\3
Eonasdan
Sceptical Jule

1 Answer

13

You are trying to parse XML with regular expressions.
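For comparison, here is a minimal sketch of the parser-based route, assuming Python's standard xml.etree.ElementTree module is available and the file is well-formed XML. The sample document and the replace_names helper are made up for illustration; only the <name> tag comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; only the <name> tag is taken from the question.
SAMPLE = """<people>
  <person><name>Alice
  Smith</name></person>
  <person><name>Bob</name></person>
</people>"""


def replace_names(xml_text: str, replacement: str = "xyz") -> str:
    """Replace the full content of every <name> element with `replacement`."""
    root = ET.fromstring(xml_text)
    # Materialize the matches first so the tree is not mutated while iterating.
    for elem in list(root.iter("name")):
        # Drop nested elements so that everything between the tags is replaced.
        for child in list(elem):
            elem.remove(child)
        elem.text = replacement
    return ET.tostring(root, encoding="unicode")


result = replace_names(SAMPLE)
print(result)
```

Because the parser knows where each element starts and ends, greediness and line breaks inside the element are not a concern here.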

However, what you are doing wrong anyway is using a greedy repetition. The .* tries to consume as much as possible while still fulfilling the match condition, so the match runs from the first <name> all the way to the very last </name>, even if they do not belong together. Use this instead:

Search: (<name>).*?(</name>)
Replace: \1xyz\2

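Or, to be on the safe side, you can also escape the < and >, since they are meta-characters in some specific cases (not in this one, though). Test this variant before relying on it, because some engines (Boost.Regex among them, which Notepad++ 6 uses) read \< and \> as word-boundary assertions rather than literal angle brackets: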

Search: (\<name\>).*?(\</name\>)
Replace: \1xyz\2

In both cases, the ? after the .* makes the repetition ungreedy (lazy), i.e. it will consume as little as possible.
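The difference between the greedy and the ungreedy pattern can be seen in a small sketch. It uses Python's re module rather than Notepad++'s engine, which is an assumption made purely for illustration, and the sample string is made up; the \1 and \2 backreferences behave the same way as in the replace field above.

```python
import re

# Hypothetical one-line input containing two separate <name> elements.
text = "<name>Alice</name><id>1</id><name>Bob</name>"

# Greedy: .* runs from the first <name> to the very last </name>.
greedy = re.sub(r"(<name>).*(</name>)", r"\1xyz\2", text)

# Ungreedy: .*? stops at the first </name> it can reach.
lazy = re.sub(r"(<name>).*?(</name>)", r"\1xyz\2", text)

print(greedy)  # <name>xyz</name>
print(lazy)    # <name>xyz</name><id>1</id><name>xyz</name>
```

The greedy version swallows the <id> element along with both names, which is the symptom described in the question.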

And make sure you upgrade to Notepad++ 6, because before that there were a few issues with Notepad++'s regex engine.

Lastly, as hoombar pointed out in a comment, . by default matches every character except line break characters. In Notepad++ you can change this behavior by ticking the ". matches newline" checkbox.
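The effect of that checkbox can be sketched outside Notepad++ as well. The example below assumes Python's re module, where the re.DOTALL flag plays the role of the ". matches newline" option; the multi-line sample value is made up.

```python
import re

# Hypothetical <name> element whose content spans two lines.
text = "<name>Alice\n  Smith</name>"
pattern = r"(<name>).*?(</name>)"

# Without the flag, . cannot cross the line break, so nothing matches.
without_flag = re.sub(pattern, r"\1xyz\2", text)

# With re.DOTALL, . also matches the line break and the element is replaced.
with_flag = re.sub(pattern, r"\1xyz\2", text, flags=re.DOTALL)

print(without_flag == text)  # True
print(with_flag)             # <name>xyz</name>
```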

Community
Martin Ender
  • 3
    you might also want to check the box ". matches new line" – Ben Pearson Nov 02 '12 at 12:36
  • I generally escape the `<>` to prevent unexpected behaviour `\<\>` – Ariaan Nov 02 '12 at 12:40
  • 1
    @Ariaan It's probably a matter of taste (I usually prefer the readability), and would only differ inside a named capturing group, wouldn't it? – Martin Ender Nov 02 '12 at 12:42
  • @m.buettner You're correct, but I'd prefer to be prepared for reuse in larger regular expressions, in case you'd need to extend your expression. I don't think it's a bad habit when writing complicated expressions. – Ariaan Nov 02 '12 at 12:53