Suppose I have the Java string shown below. It contains two <c></c> pairs: one holds only digits, the other holds letters and digits. How can I tell whether a string contains a <c></c> pair with only digits in Java? I tried this, but it didn't work:

    String keyPattern = "^<id>[0-9]</id>$";
    boolean hasKey = str.matches(keyPattern);

<start><a></a><b></b><c>addf123</c><d><d><c>1234</c><foo></foo><bar></bar></start>
codereviewanskquestions
    In a general sense, you should really be parsing the data properly with an XML parser. Parsing using regex is not the right thing to do. – rolfl May 07 '13 at 22:27
    Please do not use regex to parse XML. See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Jim Garrison May 07 '13 at 22:34

5 Answers

You were very close; you need to add a quantifier, in your case a '+'.

String keyPattern = "^<id>[0-9]+</id>$";

Link to the JavaDoc for Pattern

Edit: However, with the ^ and $ anchors you will only match a string whose entire content is something like "<id>123</id>", with no other text around it.

So, drop the anchors and allow any text on either side (matches(...) always tests the entire string anyway):

String keyPattern = ".*<id>[0-9]+</id>.*";

This matches any string that contains a number-only <id>...</id> tag anywhere.

I suspect that you want to get all the number-only ids, in which case the matches(...) method is not the one you want to use, but that's a different issue.
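
For that different issue, a minimal sketch using Matcher.find() might look like the following. It assumes the <c> tag from the question's sample string rather than <id>, and the class and method names (NumericTagFinder, containsNumericTag, numericValues) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class NumericTagFinder {
    // A <c> element whose content is digits only; group 1 captures the digits.
    private static final Pattern NUMERIC_C = Pattern.compile("<c>([0-9]+)</c>");

    // find() searches anywhere in the input, so no leading or trailing ".*" is needed.
    static boolean containsNumericTag(String input) {
        return NUMERIC_C.matcher(input).find();
    }

    // Collects the content of every digits-only <c> element, in order of appearance.
    static List<String> numericValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = NUMERIC_C.matcher(input);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String str = "<start><a></a><b></b><c>addf123</c><d><d><c>1234</c><foo></foo><bar></bar></start>";
        System.out.println(containsNumericTag(str)); // true
        System.out.println(numericValues(str));      // [1234]
    }
}
```

Unlike matches(...), find() does not require the pattern to cover the whole string, which is why the pattern here has no surrounding ".*".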

rolfl

If you want to match a string that can contain a number of any size you need to add a + after the [0-9] so it looks like this:

[0-9]+

As written, it matches only a single-digit number.

cmbaxter

You have a couple of errors.

  1. Your string contains no "<id>" or "</id>" tags; it has "<c>" and "</c>".
  2. The special character "^" means the match must begin at the start of the string, and "$" means it must run to the end of the string. So here you need to match everything up to "<c>digits</c>", then the tag itself, then the rest: a leading ".*" consumes everything before the tag, and a trailing ".*" consumes everything after it.
  3. "[0-9]" on its own searches for exactly one digit; you need one or more, which is what the "+" symbol does.

A revised pattern that works:

String keyPattern = ".*<c>[0-9]+</c>.*";
boolean hasKey = str.matches(keyPattern);
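
To see the three fixes side by side, here is a small sketch that runs the original attempt and the revised pattern against the sample string from the question. The class and method names (MatchesDemo, originalAttempt, revised) are hypothetical and only serve the illustration.

```java
// Hypothetical demo class comparing the two patterns.
class MatchesDemo {
    static final String SAMPLE =
        "<start><a></a><b></b><c>addf123</c><d><d><c>1234</c><foo></foo><bar></bar></start>";

    // The pattern from the question: wrong tag name, single digit, anchored to the whole string.
    static boolean originalAttempt(String s) {
        return s.matches("^<id>[0-9]</id>$");
    }

    // The revised pattern: correct tag, one or more digits, any text before and after.
    static boolean revised(String s) {
        return s.matches(".*<c>[0-9]+</c>.*");
    }

    public static void main(String[] args) {
        System.out.println(originalAttempt(SAMPLE)); // false
        System.out.println(revised(SAMPLE));         // true
    }
}
```

One caveat: by default "." does not match line terminators, so for multi-line input the ".*" parts would need the DOTALL flag, for example by prefixing the pattern with "(?s)".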
greedybuddha

Don't use regular expressions to parse XML. Use a real XML API, such as SAX or DOM--or, if applicable, some higher-level API which is less tedious to use. For example, if you're using XML to serialize objects, you should look at JAXB.

Trying to do it with regular expressions is just asking for trouble. See related questions and their answers:

Why is it such a bad idea to parse XML with regex?

RegEx match open tags except XHTML self-contained tags
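
As a rough sketch of the DOM route, the following uses the JDK's built-in parser to check whether any <c> element holds only digits. It assumes the input is well-formed XML; the sample string in the question is not (its two <d> tags are never closed), so the example uses a corrected variant. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XmlNumericTagCheck {
    // Returns true if any element named tagName has text content made up of digits only.
    static boolean hasNumericOnlyElement(String xml, String tagName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            for (int i = 0; i < nodes.getLength(); i++) {
                String text = nodes.item(i).getTextContent().trim();
                // The regex is applied to plain element text only, not to markup.
                if (text.matches("[0-9]+")) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Malformed input (or a parser setup problem) ends up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed well-formed variant of the question's string (<d><d> replaced by <d></d>).
        String xml = "<start><a></a><b></b><c>addf123</c><d></d><c>1234</c><foo></foo><bar></bar></start>";
        System.out.println(hasNumericOnlyElement(xml, "c")); // true
    }
}
```

The parser takes care of nesting, attributes, and whitespace, so the regular expression only has to describe the value being tested.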

rob

[0-9] will match a single digit

[0-9]? will match 0 or 1 digits

[0-9]* will match zero or more digits

[0-9]+ will match one or more digits

You can also replace [0-9] with \d for a digit (written as "\\d" inside a Java string literal)
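
The quantifiers listed above can be tried out with a short sketch like this one. The class and method names (QuantifierDemo, fullMatch) are hypothetical; Pattern.matches(...) tests the whole input against the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the four quantifier forms.
class QuantifierDemo {
    // True only if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("[0-9]", "55"));    // false: exactly one digit required
        System.out.println(fullMatch("[0-9]?", ""));     // true: zero or one digit
        System.out.println(fullMatch("[0-9]*", ""));     // true: zero or more digits
        System.out.println(fullMatch("[0-9]+", ""));     // false: at least one digit required
        System.out.println(fullMatch("\\d+", "12345"));  // true: \d written as "\\d" in Java source
    }
}
```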

Craig