0

I'd like to isolate a substring of some source code in Ruby, but I can't do better than this (http://rubular.com/r/ALngW9TOwy). I'd like my match to stop at the end of the first

<p>[...]\n</p>

I tried a few things, but I have to admit that I'm bad at regex. I know there are several ways to do it, such as using Regexp or a plain regex literal, but I'm lost. If somebody can help, that would be great! Thanks a lot!

EDIT: Thanks to Mchl, I have the solution. I described what I needed in a comment, but it's better placed here. This is what I use:

match(/<p>(.*?)<\/p>/m)[1].strip
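
A minimal sketch of why the `/m` flag matters in the line above, assuming a made-up input string (`source` is hypothetical, not the asker's real data). In Ruby, `.` does not match a newline unless `/m` is given, which is presumably why irb rejected the version without it:

```ruby
# Hypothetical input: both paragraph bodies contain line breaks.
source = "<p>\n  some text\n</p>\n<p>\n  other\n</p>"

# Without /m, "." does not match "\n" in Ruby, so no paragraph matches.
without_m = source.match(/<p>(.*?)<\/p>/)

# With /m, "." also matches "\n"; strip trims the surrounding whitespace.
with_m = source.match(/<p>(.*?)<\/p>/m)[1].strip

p without_m  # => nil
p with_m     # => "some text"
```

Note that `match` returns `nil` when nothing matches, so calling `[1]` directly raises `NoMethodError` on input without a `<p>` element.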
Simon
  • Related question: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Andrew Grimm Jul 27 '11 at 23:27

2 Answers

1

Not sure if I understood correctly, but it seems to me you need to make the * non-greedy:

<p>(.*?)<\/p>
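
To illustrate the difference, here is a small sketch comparing the greedy and non-greedy forms. The sample string `html` is an assumption made up for the example, and `/m` is added so that `.` can cross line breaks:

```ruby
# Hypothetical sample input: two paragraphs, the first spanning a line break.
html = "<p>first\nparagraph</p>\n<p>second</p>"

# Greedy: .* takes as much as possible, so the match runs to the LAST </p>.
greedy = html.match(/<p>(.*)<\/p>/m)[1]

# Non-greedy: .*? takes as little as possible, so it stops at the FIRST </p>.
lazy = html.match(/<p>(.*?)<\/p>/m)[1]

p greedy  # => "first\nparagraph</p>\n<p>second"
p lazy    # => "first\nparagraph"
```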

Mchl
  • Yay, that works! But could you explain it to me? I don't really understand why… It works on the site at least; irb doesn't seem to like it – Simon Jul 27 '11 at 22:01
  • I could, but I'd rather refer you to this excellent site: http://www.regular-expressions.info/repeat.html - see the paragraph called 'Watch Out for The Greediness!' – Mchl Jul 27 '11 at 22:05
  • Interesting, I didn't know that you can make .* non-greedy like that – petho Jul 27 '11 at 22:05
  • Thank you, I will read this carefully ;) And I got irb to accept it with "match(/<p>(.*?)<\/p>/m)[1].strip" – Simon Jul 27 '11 at 22:09
0

You should try a negative lookahead:

<p>((.(?!<\/p>))*.)<\/p>
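
A rough sketch of how this pattern behaves next to the non-greedy form, assuming a made-up `html` string and with `/m` added (the pattern above has no flag, so on its own it would not cross line breaks). One difference worth knowing: because of the trailing `.`, the lookahead version needs at least one character between the tags.

```ruby
# Hypothetical sample input, not taken from the question.
html = "<p>first\nparagraph</p>\n<p>second</p>"

lookahead = /<p>((.(?!<\/p>))*.)<\/p>/m  # the pattern above, plus /m
lazy      = /<p>(.*?)<\/p>/m             # non-greedy form, for comparison

# Both stop at the first closing tag.
p html.match(lookahead)[1]  # => "first\nparagraph"
p html.match(lazy)[1]       # => "first\nparagraph"

# An empty paragraph matches only the non-greedy form.
p "<p></p>".match(lookahead)  # => nil
p "<p></p>".match(lazy)[1]    # => ""
```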
petho