
I am currently developing a regex to catch a number in an XML test.

<items><item id='123' something_cool='61461651'></item></items>

I used `.*item id='\d+'.*` and it works! However, is there a way to do it without the `.*` at the beginning and the end of the regex?

Thomas Betous
  • Regex is not the best tool for parsing XML/HTML. Why not just use a proper parser? (More info: [Can you provide some examples of why it is hard to parse XML and HTML with a regex?](https://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg)) – Pshemo Jun 10 '16 at 14:41
  • But to answer your question: the `.*` are mandatory if you use the `matches` method and your regex describes only a fragment, since the purpose of `matches` is to check whether the entire string can be matched by the regex. If you want only the part of the text that matches the regex, you need to use `Matcher#find`. – Pshemo Jun 10 '16 at 14:42
  • I don't want to parse my whole message for only one field. Here we have a simple XML, but my real XML is much more complex, and so is the regex. – Thomas Betous Jun 10 '16 at 14:43
  • The question is: are you really sure that your XML is "specific" enough that any regex you use will work for all potential inputs? – GhostCat Jun 10 '16 at 14:45
  • @Pshemo Awesome! That is exactly what I needed! – Thomas Betous Jun 10 '16 at 14:46
  • To be honest there are cases where parsing HTML with regex makes sense, but it may be hard to do it right. More info: http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string/4234491#4234491 – Pshemo Jun 10 '16 at 14:59
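
To illustrate the difference between `matches` and `Matcher#find` described in the comments, here is a minimal Java sketch. The class name `ItemIdExtractor`, the method `findItemId`, and the capture group in the pattern are assumptions made for this example, not something taken from the question. It only shows that a fragment pattern fails with `matches` but succeeds with `find`; it is not a general way to read XML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ItemIdExtractor {

    // Hypothetical fragment pattern: no leading or trailing ".*".
    // The capture group (\d+) is an assumption so the number can be read back.
    private static final Pattern ITEM_ID = Pattern.compile("item id='(\\d+)'");

    // Returns the first id found anywhere in the input, or null if there is none.
    public static String findItemId(String xml) {
        Matcher m = ITEM_ID.matcher(xml);
        // find() scans for the next subsequence that matches the pattern,
        // so the pattern does not have to describe the whole string.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String xml = "<items><item id='123' something_cool='61461651'></item></items>";

        // matches() requires the entire input to match, so the fragment fails.
        System.out.println(ITEM_ID.matcher(xml).matches()); // false

        // find() locates the fragment inside the larger string.
        System.out.println(findItemId(xml)); // 123
    }
}
```

With `find`, the `.*` padding is no longer needed, and `group(1)` gives the number directly. Whether a regex is robust enough for the real, more complex XML remains the open question raised in the comments.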
