4

What does a \number mean in a regex in Java?

Let's say I have something like \1 or \2. What does this mean and how is it used? An example would be really helpful.

Thanks

Mohammad Najar
  • `\number` means that you were too lazy to do a [simple google search](http://stackoverflow.com/questions/8624345/whats-the-meaning-of-a-number-after-a-backslash-in-a-regular-expression) for "regex backslash number". Hope this helps. – tenub May 12 '14 at 18:55
  • 1
    I did .. didn't find a quick and useful solution .. – Mohammad Najar May 12 '14 at 18:56
  • @tenub If I type in the title exactly as written, the search results page doesn't immediately appear to have a result that addresses the question. I'd have to guess which link is most likely to be useful. – ajb May 12 '14 at 19:09
  • Short answer: `\number` is the character whose octal ASCII code is `number`. It can be written either with ??? notation or more simply (like 041 and 41). Anyway, I don't think it's very useful in general, since most people don't use it like that; the more commonly used characters have shortcuts such as "\n" (LF) and "\t" (HT). Here is more about that use: http://docs.oracle.com/javase/tutorial/java/data/characters.html The best way to understand it in a standard string is that \ is the escape character: Java looks for some special meaning behind it. – KonradOliwer May 12 '14 at 19:12
  • @KonradOliwer you would be right for something like `Pattern.compile("((xyz)*)\1")`, but we're all guessing the code looks more like `Pattern.compile("((xyz)*)\\1")` or something to that effect. – ajb May 12 '14 at 19:14
  • Well, I added it just because when I first looked at the subject, I saw that we were talking about Strings in general, not regexes in particular. I'm just adding it to give the asker a better view of the whole problem. But since it looks like Mohammad found what he needs, it probably won't be useful anymore. – KonradOliwer May 12 '14 at 19:17

2 Answers

5

Backreferences match the same text as previously matched by a capturing group. Suppose you want to match a pair of opening and closing HTML tags, and the text in between. By putting the tag name into a capturing group, we can reuse it through a backreference for the closing tag. Here's how:

<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>

This regex contains only one pair of parentheses, which capture the string matched by

[A-Z][A-Z0-9]*

The backreference \1 (backslash one) references the first capturing group. \1 matches the exact same text that was matched by the first capturing group. The / before it is a literal character. It is simply the forward slash in the closing HTML tag that we are trying to match.
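For Java specifically, here is a minimal sketch of the same pattern run through `java.util.regex`. The class and method names (`BackrefDemo`, `firstElement`) are made up for illustration; the one thing to watch is that each regex backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is only for illustration.
class BackrefDemo {
    // Same regex as above. In Java source, \b and \1 are written \\b and \\1.
    static final Pattern TAG =
            Pattern.compile("<([A-Z][A-Z0-9]*)\\b[^>]*>.*?</\\1>");

    // Returns the first opening/closing tag pair found in the input, or null.
    static String firstElement(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Closing tag repeats the captured name "B", so \1 matches.
        System.out.println(firstElement("x <B>bold</B> y"));
        // Closing tag is "I", which differs from the captured "B": no match.
        System.out.println(firstElement("<B>bold</I>"));
    }
}
```

Note that `\1` repeats the text the group actually matched, not the group's sub-pattern, which is why the mismatched `<B>...</I>` input is rejected.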

For more details and examples check: http://www.regular-expressions.info/backref.html

M. A. Kishawy
  • 1
    Good answer, but I don't like encouraging readers to think you can use regexes to parse H#^@ ... parse HT*#% ... pa#*$ ... Oh Lord, I can't even say it ... please don't link to that question ... – ajb May 12 '14 at 18:53
  • 1
    Let's say we have `<(GROUP_1)>(GROUP_2)\2>`. Would this mean `\2` places the same regex defined in `GROUP_2` in `` ? – Mohammad Najar May 12 '14 at 18:54
0

\ usually introduces a special construct in a pattern, such as the backreference \1 or the word boundary \b. It also serves as the escape character.
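Because \ is an escape character both in Java string literals and in regexes, the two layers are easy to mix up. A small sketch of the difference, assuming a made-up helper `isDoubled` that checks for a word repeated twice:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are only for illustration.
class EscapeDemo {
    // The regex (\w+) \1 written as a Java string: every backslash is doubled.
    static boolean isDoubled(String s) {
        return Pattern.matches("(\\w+) \\1", s);
    }

    public static void main(String[] args) {
        System.out.println(isDoubled("hello hello")); // same word twice
        System.out.println(isDoubled("hello world")); // different words

        // With a single backslash, "\1" is an octal escape resolved by the
        // Java compiler: the string holds one character with code 1,
        // so the regex engine never sees a backreference.
        System.out.println("\1".length());
        System.out.println((int) "\1".charAt(0));

        // "\\1" is two characters, a backslash and '1', which the regex
        // engine then reads as a backreference to group 1.
        System.out.println("\\1".length());
    }
}
```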

Josef E.