I'm using the Java class Pattern to match the substrings of a text that start with a specific string, say abc, continue with any text (containing any character), and stop at the beginning of another specified string, say def. How would you write this?

nhahtdh
Luca Carlon
  • Post the real string you have to match. Malvolio's answer is correct for the case you said. –  Dec 24 '10 at 17:13

2 Answers

Unless your problem is more complicated than you've explained: "abc.*def"
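
For reference, here is a minimal sketch of how that pattern could be used with java.util.regex. The class name GreedyMatch and the helper firstMatch are made up for illustration; they do not come from the question. Note that the dot does not match line terminators by default, which the second call shows.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class GreedyMatch {
    // Returns the first substring matching abc.*def, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = Pattern.compile("abc.*def").matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("xx abc 123 def yy")); // prints "abc 123 def"
        System.out.println(firstMatch("abc\ndef"));          // prints "null": '.' skips '\n' by default
    }
}
```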

Michael Lorton
  • I already tried this, but it seems it is not doing what I need. I suppose the dot is not matching some characters, such as the space. I tried abc[.\s]*def but it is not working either, and I can't understand why. An example of what I need it to do is to select everything inside content in an HTML page. Thanks for the quick answer! – Luca Carlon Dec 24 '10 at 16:29
  • There's nothing wrong with the regexp. Why don't you post the simplest version possible of the code so we can help you spot the error. My last two questions on StackOverflow, I canceled before I ever posted, because in writing simple cases to demonstrate the problem, I answered my own question. – Michael Lorton Dec 24 '10 at 16:32
  • Code? I'm trying the regexp with the search capability of eclipse for the moment, that I suppose should use the same interpreter. I tried this as well abc[\w|\W]*def but it is not stopping at the first def. I thought I should use the ^, but I really can't understand how to use it with a string. – Luca Carlon Dec 24 '10 at 16:37
  • Could you have a greedy/lazy problem? If your string is "abcxxdefxxdef", the regexp 'abc.*def' will (with most libraries) match the *whole* string, not just the first eight characters, as you might expect. A lazy match 'abc.*?def' would stop at the first 'def'. – Michael Lorton Dec 24 '10 at 16:48
  • It is not finding anything this way either... It seems to me it should match, but it does not... – Luca Carlon Dec 24 '10 at 16:54
  • It's not finding anything or it's not stopping? Be specific and detailed. Try it with a simple case, like abc.*?def before going on to .*? or whatever. Don't forget the editor might be treating the other punctuation, in the regexp or the text, specially. – Michael Lorton Dec 24 '10 at 17:03
  • With the test I'm doing (the Eclipse find capability which, until now, appears to work the same way as my code), it answers that the string has not been found. I tried a simpler example as you suggested, and indeed it works. Any idea why it doesn't in the other case? Thanks for your help! – Luca Carlon Dec 24 '10 at 17:12
  • Step by step, use more complicated cases, working your way towards the case you actually need. With small enough changes, when it stops working, it will be obvious why. – Michael Lorton Dec 24 '10 at 17:15
  • Followed your instructions and... it seems the newlines are causing problems. I tried abc[\n|.]*?def but it is not working either... – Luca Carlon Dec 24 '10 at 17:17
  • Yeah, that's a common problem. See http://www.regular-expressions.info/dot.html for some suggestions – Michael Lorton Dec 24 '10 at 18:26
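
To tie the comments above together, here is a hedged sketch of the two issues discussed: greedy versus lazy matching, and the dot not matching newlines. It assumes the pattern is compiled in Java code, where Pattern.DOTALL (or the inline flag (?s)) makes the dot match line terminators; whether the Eclipse find dialog behaves the same way is not confirmed here. The class name LazyDotallMatch and the helper firstMatch are invented for this example. It also shows why abc[.\s]*def failed: inside a character class the dot is a literal dot, not "any character".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class LazyDotallMatch {
    // Returns the first substring matching the pattern, or null if there is none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        Pattern greedy = Pattern.compile("abc.*def");
        Pattern lazy = Pattern.compile("abc.*?def");
        // DOTALL lets '.' match line terminators as well.
        Pattern lazyDotall = Pattern.compile("abc.*?def", Pattern.DOTALL);
        // Inside [...] the dot is literal, so this only allows dots and whitespace.
        Pattern charClass = Pattern.compile("abc[.\\s]*def");

        System.out.println(firstMatch(greedy, "abcxxdefxxdef")); // prints "abcxxdefxxdef"
        System.out.println(firstMatch(lazy, "abcxxdefxxdef"));   // prints "abcxxdef"
        System.out.println(firstMatch(lazy, "abc\nxx def"));     // prints "null"
        // With DOTALL the match spans the newline: "abc", line break, "xx def".
        System.out.println(firstMatch(lazyDotall, "abc\nxx def") != null); // prints "true"
        System.out.println(firstMatch(charClass, "abc xx def")); // prints "null"
    }
}
```

Writing the pattern as "(?s)abc.*?def" should be equivalent to passing Pattern.DOTALL, which can help when only the pattern string is under your control.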
As a side note to your comment, using a regex to process HTML/XML is generally a bad idea. Classic explanation here.

Community
jtahlborn
  • There's an old saying, "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." It's not entirely true, though it's worth thinking about. Regexps are good for searching, a bit shaky for parsing. The OP has a problem in the context of an editor and probably the hammer that is regexp is the only tool he has, so like it or not, every problem will demonstrate nail-like properties. – Michael Lorton Dec 24 '10 at 16:52
  • I didn't even think that HTML isn't a regular language... Anyway, I already have an entire "framework" which was created by others to work this way, and so, if possible, it would be better for me to keep things like they are. Thanks for pointing this out though! – Luca Carlon Dec 24 '10 at 17:05