2

In Notepad++ I tried to write a regex to match <tr> tags. At first, I thought that angle brackets had to be escaped, so I tried \<tr\>. However, this matched not just the opening tags, as I would have expected, but all of the tr tags (both <tr> and </tr>). Why is this?

nhahtdh
1252748
  • These are slashes, not backslashes... Also, which language? Some languages treat the first character as a separator in regexes, and as such, the "pattern" part is "``" which would be the part to replace the matched pattern... – ppeterka Sep 06 '13 at 13:39
  • @ppeterka66 it was a typo, thanks. I'm not sure what regex processing Notepad++ uses. I'm looking into it. – 1252748 Sep 06 '13 at 13:40

2 Answers

7

\< and \> mean "word boundary" in certain regex implementations, including Notepad++. From the Notepad++ documentation:

\< This matches the start of a word using Scintilla's definitions of words.

\> This matches the end of a word using Scintilla's definition of words.

A word boundary is a zero-width match between a non-word character and a word character. Scintilla's definition of "word character" is:

A word is defined to be a character string beginning and/or ending with the characters A-Z a-z 0-9 and _. Scintilla extends this definition by user setting. The word must also be preceded and/or followed by any character outside those mentioned.

Thus, your regex \<tr\> actually matches the word boundary between < (or /) and t, followed by tr, followed by the word boundary between r and >.
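The behaviour described above can be reproduced outside Notepad++ with a rough sketch. Python's `re` module does not support `\<` and `\>`, so the lookaround expressions below are stand-ins chosen for this illustration, not Scintilla's actual implementation. The names `WORD_START` and `WORD_END` are hypothetical, and `re.ASCII` is used so that `\w` approximates the A-Z a-z 0-9 _ definition quoted above.

```python
import re

# Assumed stand-ins for Notepad++'s \< and \> (Python's re has no such tokens).
WORD_START = r"(?<!\w)(?=\w)"  # zero-width: previous char is not a word char, next one is
WORD_END = r"(?<=\w)(?!\w)"    # zero-width: previous char is a word char, next one is not

# Equivalent in spirit to \<tr\>: the word "tr" with a boundary on each side.
pattern = re.compile(WORD_START + "tr" + WORD_END, re.ASCII)

html = "<tr><td>cell</td></tr>"
spans = [m.span() for m in pattern.finditer(html)]
print(spans)  # [(1, 3), (19, 21)]
```

Both matches cover only the letters `tr`: one inside `<tr>` and one inside `</tr>`. The angle brackets and the slash are never part of the match, which is why the closing tag is hit as well.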

Pang
nneonneo
3
  • Escaping uses backslashes, e.g. \{
  • You don't need to (and shouldn't) escape < or >
  • If you escape them, they mean word boundaries, so not only <tr> and </tr> but also ,tr, will be matched
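To make the contrast in these points concrete, here is a small sketch comparing the unescaped pattern with a word-boundary pattern. It assumes Python's `\b` as an approximation of `\<` and `\>` (Python does not support those two tokens), and the sample string is made up for the illustration.

```python
import re

# Hypothetical sample text containing an opening tag, a closing tag, ",tr," and "attr".
text = "<tr>a,tr,b</tr> attr"

# Unescaped angle brackets are literal characters: only the opening tag matches.
literal = re.findall(r"<tr>", text)

# \b is used here in place of \< and \>: "tr" matches wherever it stands alone as a word.
boundary = re.findall(r"\btr\b", text)

print(literal)        # ['<tr>']
print(len(boundary))  # 3
```

The three boundary matches are the `tr` in `<tr>`, in `,tr,`, and in `</tr>`. The `tr` at the end of `attr` is skipped because a word character precedes it.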
Kent