Answered:
This regex works:
<item>(?:(?!</item>).|\n)*?(?:(?=201[0-3]</pubDate>))(?:(?!</item>).|\n)*?</item>
while this one, identical except that both * quantifiers are greedy rather than lazy (*?), overflows the stack in ST2:
<item>(?:(?!</item>).|\n)*(?:(?=201[0-3]</pubDate>))(?:(?!</item>).|\n)*</item>
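As a rough check that the two patterns differ only in greediness, here is a minimal sketch using Python's re module. That is an assumption for illustration: it is not the engine ST2 uses, so it says nothing about the stack overflow itself. The SAMPLE string and the variable names are made up. Both patterns should select the same items; what changes is the order in which the engine tries things.

```python
import re

# Lazy and greedy variants from the post; only the quantifiers differ.
LAZY = re.compile(
    r"<item>(?:(?!</item>).|\n)*?(?:(?=201[0-3]</pubDate>))(?:(?!</item>).|\n)*?</item>"
)
GREEDY = re.compile(
    r"<item>(?:(?!</item>).|\n)*(?:(?=201[0-3]</pubDate>))(?:(?!</item>).|\n)*</item>"
)

# Hypothetical two-item sample modeled on the example XML in the question.
SAMPLE = (
    "<item>\n<title>This will match</title>\n"
    "<pubDate>Sat Mar 10 10:22:00 -0800 2012</pubDate>\n</item>\n"
    "<item>\n<title>This won't</title>\n"
    "<pubDate>Tue Jun 30 05:01:32 -0700 2009</pubDate>\n</item>\n"
)

# With no capturing groups, findall returns the full text of each match.
lazy_hits = LAZY.findall(SAMPLE)
greedy_hits = GREEDY.findall(SAMPLE)
print(lazy_hits == greedy_hits, len(lazy_hits))  # prints: True 1
```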
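This also avoids the crash, without lookaheads. Note that .*? can run past a </item>, so a match may begin at an earlier, non-matching <item>: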
(?s)<item>.*?201[0-3]</pubDate>.*?</item>
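One thing worth checking with the lookahead-free pattern is what happens when a non-matching item comes before a matching one. The sketch below uses Python's re module (an assumption, not ST2's engine) and a made-up SAMPLE in which the 2009 item comes first. The pattern with (?!</item>) stays inside one item, while the (?s) pattern starts at the first <item> and swallows both.

```python
import re

# Pattern that refuses to cross </item> (lookahead simplified, same meaning).
TEMPERED = re.compile(
    r"<item>(?:(?!</item>).|\n)*?(?=201[0-3]</pubDate>)(?:(?!</item>).|\n)*?</item>"
)
# Lookahead-free pattern from the post; (?s) lets . match newlines.
DOTALL_LAZY = re.compile(r"(?s)<item>.*?201[0-3]</pubDate>.*?</item>")

# Hypothetical sample: the non-matching (2009) item comes first.
SAMPLE = (
    "<item>\n<title>old</title>\n"
    "<pubDate>Tue Jun 30 05:01:32 -0700 2009</pubDate>\n</item>\n"
    "<item>\n<title>new</title>\n"
    "<pubDate>Sat Mar 10 10:22:00 -0800 2012</pubDate>\n</item>\n"
)

tempered_hits = TEMPERED.findall(SAMPLE)
dotall_hits = DOTALL_LAZY.findall(SAMPLE)

# The tempered pattern returns only the 2012 item.
print("old" in tempered_hits[0])  # prints: False
# The (?s) pattern returns one match that spans both items.
print("old" in dotall_hits[0])    # prints: True
```

If the matching items in the real file are never preceded by non-matching ones, the two patterns agree; otherwise the lookahead version is the safer of the two.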
Original question:
I have an XML file in Sublime Text 2 (example below). I want to find all the <item> elements that contain a <pubDate> element from the years 2010 through 2013.
The greedy regex above matches correctly, but when I run Find All (the file is about 1 MB with about 120 matches), ST2 runs out of stack space.
What horrible inefficiencies lurk above?
Example XML:
<?xml version="1.0" encoding="utf-8"?>
<rss>
<channel>
<item>
<title>This will match</title>
<link>http://gcanyon.posterous.com/</link>
<pubDate>Sat Mar 10 10:22:00 -0800 2012</pubDate>
<dc:creator><![CDATA[Geoff Canyon]]></dc:creator>
</item>
<item>
<title>This won't</title>
<link>http://gcanyon.posterous.com/</link>
<pubDate>Tue Jun 30 05:01:32 -0700 2009</pubDate>
<dc:creator><![CDATA[Geoff Canyon]]></dc:creator>
</item>
</channel>
</rss>
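For a file shaped like the example above, one way to sidestep deep regex backtracking altogether is to find the item boundaries with plain string search and run only a tiny regex on each block. This is a minimal sketch outside ST2, in Python; the function name items_in_years and the SAMPLE text are hypothetical, and it assumes <item> tags carry no attributes and are never nested.

```python
import re

# Small pattern applied to one item at a time, so it never scans far.
YEAR = re.compile(r"201[0-3]</pubDate>")

def items_in_years(text):
    """Yield each <item>...</item> block whose pubDate year is 2010-2013."""
    pos = 0
    while True:
        start = text.find("<item>", pos)
        if start == -1:
            return
        end = text.find("</item>", start)
        if end == -1:
            return  # unterminated item: stop rather than guess
        end += len("</item>")
        block = text[start:end]
        if YEAR.search(block):
            yield block
        pos = end

# Hypothetical sample modeled on the example XML.
SAMPLE = (
    "<channel>\n"
    "<item>\n<title>This will match</title>\n"
    "<pubDate>Sat Mar 10 10:22:00 -0800 2012</pubDate>\n</item>\n"
    "<item>\n<title>This won't</title>\n"
    "<pubDate>Tue Jun 30 05:01:32 -0700 2009</pubDate>\n</item>\n"
    "</channel>\n"
)

found = list(items_in_years(SAMPLE))
print(len(found))  # prints: 1
```

Each regex search is confined to a single item, so the work per match stays small regardless of file size.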