
How do I match the end of a regular expression by a word? For example:

<h1><a href=...></a>CONTENT</h1>

Given that <h1> is the start tag, how do I return <a href=...></a>CONTENT?

The expression /< h1>(([<\/h1>\b])*/ does not seem to work.

animuson
user1124535
  • http://stackoverflow.com/a/1732454/383402 – Borealid Feb 04 '12 at 00:25
  • @Borealid: Without even clicking it, I knew which post that was :) – mpen Feb 04 '12 at 00:30
  • @Borealid: Give it a rest. There's no nested tags here, nor did the question ask for parsing. Also it's kind of rude to spam it onto every newbie question. http://meta.stackexchange.com/a/73168/148103 – mario Feb 04 '12 at 00:33
  • @mario The two web sites mentioned in that response are actually useful, thanks! I'll use them instead. As for applicability: in my experience, one tag becomes two, and then someone puts in an HTML entity, and then you end up with a comment, and before you know it, you're parsing HTML. – Borealid Feb 04 '12 at 00:40
  • @Borealid: Right, but that's true of everything. You start out with a simple `for` loop, and then requirements change, and new requirements are added, and before you know it you have a 2000-line function. Because everyone knows that you can never, *ever* change your design later. Refactoring is simply out of the question! Right? – ruakh Feb 04 '12 at 03:37

2 Answers

/<h1>([\s\S]*)<\/h1>/

I think this should help you out.
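
As a quick check, here is a minimal Python sketch of the pattern above applied to the markup from the question. The `href="#"` value is a placeholder assumption, since the question elides it, and Python's `re` module stands in for whatever regex engine you are actually using.

```python
import re

# Sample markup from the question; the href value is a made-up placeholder.
html = '<h1><a href="#"></a>CONTENT</h1>'

# [\s\S] matches any character, newlines included.
# Capture group 1 holds everything between <h1> and </h1>.
pattern = re.compile(r"<h1>([\s\S]*)</h1>")

match = pattern.search(html)
inner = match.group(1) if match else None
print(inner)  # <a href="#"></a>CONTENT
```

Note that `*` is greedy here, so if the input held several `<h1>` elements the capture would run from the first `<h1>` to the last `</h1>`.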

FeRtoll

What makes most regular expressions difficult is that their quantifiers are greedy by default. Making them ungreedy with the U modifier (a PCRE feature, available for example in PHP's preg functions) makes a pattern like this trivial to write. The following should work.

/<h1>(.*)<\/h1>/U
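
To illustrate the greedy versus ungreedy difference, here is a rough Python sketch. Python's `re` module has no U modifier, so the lazy quantifier `.*?` is used as the closest equivalent; the two-heading input string is made up for the demonstration.

```python
import re

# Hypothetical input with two headings, to expose the difference.
html = "<h1>first</h1><p>x</p><h1>second</h1>"

# Greedy: .* runs to the LAST </h1>, swallowing everything in between.
greedy = re.findall(r"<h1>(.*)</h1>", html)

# Lazy: .*? stops at the FIRST </h1> after each <h1>.
# re.DOTALL lets "." also match newlines inside a heading.
lazy = re.findall(r"<h1>(.*?)</h1>", html, flags=re.DOTALL)

print(greedy)  # ['first</h1><p>x</p><h1>second']
print(lazy)    # ['first', 'second']
```

In PCRE the same effect comes from either the U modifier or writing `.*?` explicitly; adding the s modifier there plays the role of `re.DOTALL`.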
kba
  • @M42 Yes it does, because I made it _ungreedy_. Please read my answer and the link about the U modifier. Or even better, test it yourself. – kba Feb 04 '12 at 12:54