Regex matching the end of a string with a word

Question

How do I make a regular expression stop matching at a given word, such as a closing tag? For example:

<h1><a href=...></a>CONTENT</h1>

Given that <h1> is the start tag, how do I return <a href=...></a>CONTENT?

The expression /< h1>(([<\/h1>\b])*/ does not seem to work.

@Borealid: Without even clicking it, I knew which post that was :) — mpen, Feb 04 '12 at 00:30
@Borealid: Give it a rest. There's no nested tags here, nor did the question ask for parsing. Also it's kind of rude to spam it onto every newbie question. http://meta.stackexchange.com/a/73168/148103 — mario, Feb 04 '12 at 00:33
@mario The two web sites mentioned in that response are actually useful, thanks! I'll use them instead. As for applicability: in my experience, one tag becomes two, and then someone puts in an HTML entity, and then you end up with a comment, and before you know it, you're parsing HTML. — Borealid, Feb 04 '12 at 00:40
@Borealid: Right, but that's true of everything. You start out with a simple `for` loop, and then requirements change, and new requirements are added, and before you know it you have a 2000-line function. Because everyone knows that you can never, *ever* change your design later. Refactoring is simply out of the question! Right? — ruakh, Feb 04 '12 at 03:37

score 3 · Answer 1 · answered Feb 04 '12 at 00:33


/<h1>([\s\S]*)<\/h1>/

I think this should help you out.
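As a quick check, here is a minimal Python sketch of that pattern applied to the markup from the question. The `href="#"` value and the variable names are placeholders of mine, not from the original post, and Python's `re` module stands in for whatever engine the asker is using.

```python
import re

# Sample input from the question; the href value "#" is a hypothetical placeholder.
html = '<h1><a href="#"></a>CONTENT</h1>'

# [\s\S] matches any character, newlines included, so the group may span lines.
pattern = re.compile(r"<h1>([\s\S]*)</h1>")

match = pattern.search(html)
inner = match.group(1) if match else None
print(inner)  # <a href="#"></a>CONTENT
```

Note that `[\s\S]*` is greedy, so with several `<h1>` elements in the input the group would run to the last `</h1>`.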


FeRtoll


Well then this should do it: /<h1>([\x20-\x7E]*)<\/h1>/g – FeRtoll Feb 04 '12 at 15:11
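A rough Python sketch of what the printable-ASCII class does, assuming the pattern is meant to begin with `<h1>` like the answer above. The sample text is made up for illustration, and `findall` stands in for the `g` flag.

```python
import re

# Hypothetical sample: three headings on separate lines, the last with a non-ASCII "é".
text = "<h1>first</h1>\n<h1>second</h1>\n<h1>caf\u00e9</h1>"

# Assumption: the pattern starts with <h1>, as in the answer above.
# [\x20-\x7E] covers printable ASCII only, so it excludes newlines and "é".
pattern = re.compile(r"<h1>([\x20-\x7E]*)</h1>")

# findall returns the captured group of every non-overlapping match.
headings = pattern.findall(text)
print(headings)  # ['first', 'second']
```

Because the class excludes newlines, a match cannot run from one line into the next, but any heading containing a character outside printable ASCII is skipped.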

score 0 · Accepted Answer · answered Feb 04 '12 at 01:02


What makes most regular expressions difficult is that their quantifiers are greedy by default. Making them ungreedy with the U modifier (a PCRE modifier, as used in PHP) makes a pattern like this trivial to write. The following should work.

/<h1>(.*)<\/h1>/U
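To illustrate the difference, here is a small sketch in Python. Python's `re` module has no U flag, so the lazy quantifier `*?` is used as the equivalent. The input string is invented for the example.

```python
import re

# Hypothetical input with two headings on one line.
two_headings = "<h1>first</h1><p>x</p><h1>second</h1>"

# Greedy: .* grabs as much as possible, then backtracks to the LAST </h1>.
greedy = re.search(r"<h1>(.*)</h1>", two_headings).group(1)

# Lazy: .*? stops at the FIRST </h1>, which is what the U modifier gives in PCRE.
lazy = re.search(r"<h1>(.*?)</h1>", two_headings).group(1)

print(greedy)  # first</h1><p>x</p><h1>second
print(lazy)    # first
```

With a single `<h1>` element, as in the question, both versions capture the same text. The difference only shows once a second `</h1>` appears later in the input.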


kba


@M42 Yes it does, because I made it _ungreedy_. Please read my answer and the link about the U modifier. Or even better, test it yourself. – kba Feb 04 '12 at 12:54
